More on Redundant Materialized Views

I wrote before on the subject of why you might find it beneficial to create materialized views that are apparantly completely redundant, and here are some more notes (and a syntatical curiosity, I think) on the subject. What I didn’t demonstrate in my previous writing was the way that the optimizer can use a summary table to quickly andefficiently execute queries with a “having” clause, nor how the DBMS_Advanced_Rewrite package can be even more efficient.

I present below a series of scripts to demonstrate the following:

  1. Use of a materialized view to allow a GROUP BY clause to be eliminated and a HAVING clause to be rewritten to a WHERE clause.
  2. Consequent use an index on a materialized view to optimize the original HAVING clause.
  3. Use of a HAVING clause in the materialized view to reduce the size of the result set for groups of particular interest.
  4. Use of partitioning in the materialized view to further optimize the original HAVING clause.
  5. Use of the DBMS_Advanced_Rewrite package to remove the need for materializing a result set, and allow the original GROUP BY/HAVING query to be rewritten against the original data set.

We start the demonstration by creating a sales table that has a primary key declared on the columns of (product_cd, location_cd, date_of_sale).

SQL*Plus: Release 10.2.0.2.0 - Production on Mon Aug 14 14:50:27 2006

Copyright (c) 1982, 2005, Oracle.  All Rights Reserved.

Enter password:

Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - Production
With the Partitioning, OLAP and Data Mining options

SQL> drop table sales
  2  /

Table dropped.

SQL>
SQL> create table sales
  2  (
  3  product_cd   not null,
  4  location_cd  not null,
  5  date_of_sale not null,
  6  sales_amt    not null
  7  )
  8  pctfree 0 compress nologging
  9  as
 10  select     mod(rownum,101),
 11             mod(rownum,5),
 12             date '2005-01-01' + mod(rownum-1,701),
 13             dbms_random.value(1,1000)
 14  from       dual
 15  connect by level <= 101*5*701
 16  /

Table created.

SQL>
SQL> alter table sales
  2  add constraint pk_sales
  3  primary key(product_cd,location_cd,date_of_sale)
  4  /

Table altered.

SQL>
SQL> exec dbms_stats.gather_table_stats(user,'sales')

PL/SQL procedure successfully completed.

We execute a query to find very high sales by product, location and day.

SQL> select   product_cd,
  2           location_cd,
  3           date_of_sale,
  4           sum(sales_amt) sum_sales_amt
  5  from     sales
  6  group by product_cd,
  7           location_cd,
  8           date_of_sale
  9  having   sum(sales_amt) > 999.9
 10  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2867331465

-------------------------------------------------------------------------------------
| Id  | Operation           | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |       | 88005 |  3007K|       |  2085   (4)| 00:00:26 |
|*  1 |  FILTER             |       |       |       |       |            |          |
|   2 |   HASH GROUP BY     |       | 88005 |  3007K|    20M|  2085   (4)| 00:00:26 |
|   3 |    TABLE ACCESS FULL| SALES |   351K|    11M|       |   363   (4)| 00:00:05 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(SUM("SALES_AMT")>999.9)

(Note: I edited the plan to remove dumb “more than 80 column” formatting issues)

So the execution plan requires a full table access, a hash group by (this being 10gR2, otherwise you’d expect a SORT (GROUP BY) there), and finally a filter. A total cost of 2,086 (your result may vary).

Let us assume that this is a very commonly executed query. How can we optimize it?

We create a materialized view that contains identical data to the sales table, but of course has the additional metadata associated with the MV’s defining select statement. Implicit to this metadata is the information that the data set is unique on the key columns of product_cd, location_cd and date_of_sale — information that the optimizer had access to through the definition of the primary key on the sales table but which it did not act on.

SQL> drop materialized view sales_mv
  2  /
drop materialized view sales_mv
*
ERROR at line 1:
ORA-12003: materialized view "DAVE"."SALES_MV" does not exist

SQL>
SQL> create materialized view sales_mv
  2  build immediate
  3  enable query rewrite
  4  as
  5  select   product_cd,
  6           location_cd,
  7           date_of_sale,
  8           sum(sales_amt) s_sales_amt
  9  from     sales
 10  group by product_cd,
 11           location_cd,
 12           date_of_sale
 13  /

Materialized view created.

SQL> create index sales_mv_sales_amt
  2  on sales_mv (s_sales_amt)
  3  /

Index created.

SQL>
SQL> begin
  2     dbms_stats.gather_table_stats(
  3        user,
  4        'sales_mv',
  5        method_opt => 'for all indexed columns size 254');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select   product_cd,
  2           location_cd,
  3           date_of_sale,
  4           sum(sales_amt) sum_sales_amt
  5  from     sales
  6  group by product_cd,
  7           location_cd,
  8           date_of_sale
  9  having   sum(sales_amt) > 999.9
 10  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2514656059

-------------------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |                    |    32 |  1664 |    36   (0)| 00:00:01 |
|   1 |  MAT_VIEW REWRITE ACCESS BY INDEX ROWID| SALES_MV           |    32 |  1664 |    36   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                     | SALES_MV_SALES_AMT |    32 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("SALES_MV"."S_SALES_AMT">999.9)

As you can see the query can now rewrite against the materialized view, and in addition the HAVING clause has been rewritten to a WHERE clause (see the predicate information line). Consequently the optimizer used an index range scan to satisfy the SUM(SALES_AMT) > 999.9 clause. The cost estimate of the rewritten query is 36 — only 1.7% of the 2,085 of the unrewritten query.

Let us suppose that business users are only interested in very high sales. We create a materialized view that has its own HAVING clause, to filter the result set to those key combinations with a sum of sales greater than or equal to 999 (note: not 999.9, as in the test query — we’re leaving some wiggle room here).

SQL> drop materialized view sales_mv
  2  /

Materialized view dropped.

SQL> drop materialized view sales_mv_hi_sales
  2  /
drop materialized view sales_mv_hi_sales
*
ERROR at line 1:
ORA-12003: materialized view "DAVE"."SALES_MV_HI_SALES" does not exist

SQL>
SQL> create materialized view sales_mv_hi_sales
  2  build immediate
  3  enable query rewrite
  4  as
  5  select   product_cd,
  6           location_cd,
  7           date_of_sale,
  8           sum(sales_amt) s_sales_amt
  9  from     sales
 10  group by product_cd,
 11           location_cd,
 12           date_of_sale
 13  having   sum(sales_amt) >= 999
 14  /

Materialized view created.

SQL>
SQL> begin
  2     dbms_stats.gather_table_stats(
  3        user,
  4        'sales_mv_hi_sales',
  5        method_opt => 'for all indexed columns size 254');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select   product_cd,
  2           location_cd,
  3           date_of_sale,
  4           sum(sales_amt) sum_sales_amt
  5  from     sales
  6  group by product_cd,
  7           location_cd,
  8           date_of_sale
  9  having   sum(sales_amt) > 999.9
 10  /

Execution Plan
----------------------------------------------------------
Plan hash value: 98437802

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name              | Rows  | Bytes | Cost(%CPU)| Time      |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                   |    17 |   595 |     3   (0)| 00:00:01 |
|*  1 |  MAT_VIEW REWRITE ACCESS FULL| SALES_MV_HI_SALES |    17 |   595 |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("SALES_MV_HI_SALES"."S_SALES_AMT">999.9)

We have confined the materialized view to so few rows that indexing is not required to bring the cost estimate down to 3.

Next we assume that they may be interested in a few different ranges. We create a partitioned materialized view that allows pruning based on the sum of sales.

SQL> drop  materialized view sales_mv_part;

Materialized view dropped.

SQL> drop materialized view sales_mv_hi_sales
  2  /

Materialized view dropped.

SQL>
SQL> create materialized view sales_mv_part
  2           (
  3           product_cd,
  4           location_cd,
  5           date_of_sale,
  6           s_sales_amt
  7           )
  8  partition by range (s_sales_amt)
  9   (partition very_low_sales  values less than (1),
 10    partition low_sales       values less than (250),
 11    partition medium_sales    values less than (750),
 12    partition high_sales      values less than (999),
 13    partition very_high_sales values less than (maxvalue))
 14  build immediate
 15  enable query rewrite
 16  as
 17  select   product_cd,
 18           location_cd,
 19           date_of_sale,
 20           sum(sales_amt) s_sales_amt
 21  from     sales
 22  group by product_cd,
 23           location_cd,
 24           date_of_sale
 25  having   sum(sales_amt) >= 999
 26  /

Materialized view created.

SQL> begin
  2     dbms_stats.gather_table_stats(
  3        user,
  4        'sales_mv_part',
  5        method_opt => 'for all indexed columns size 254');
  6  end;
  7  /

PL/SQL procedure successfully completed.

SQL> select   product_cd,
  2           location_cd,
  3           date_of_sale,
  4           sum(sales_amt) sum_sales_amt
  5  from     sales
  6  group by product_cd,
  7           location_cd,
  8           date_of_sale
  9  having   sum(sales_amt) > 999.9
 10  /

Execution Plan
----------------------------------------------------------
Plan hash value: 2776446017

---------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |    17 |   595 |     3(0)   | 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE       |               |    17 |   595 |     3(0)   | 00:00:01 |     5 |     5 |
|*  2 |   MAT_VIEW REWRITE ACCESS FULL| SALES_MV_PART |    17 |   595 |     3(0)   | 00:00:01 |     5 |     5 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter("SALES_MV_PART"."S_SALES_AMT">999.9)

Thus we demonstrate that the optimizer can also use partition pruning to optimize the rewritten query.

Finally, we turn our attention to the advantages that we can gain from the DBMS_Advanced_Rewrite package (10g+).

SQL> drop materialized view sales_mv_part
  2  /

Materialized view dropped.

SQL> begin
  2     begin
  3        sys.DBMS_Advanced_Rewrite.Drop_Rewrite_Equivalence (
  4           Name                 => 'SALES_PK_OPTIMIZE');
  5     exception
  6        when others then null;
  7     end;
  8     sys.DBMS_Advanced_Rewrite.Declare_Rewrite_Equivalence (
  9        Name                 => 'SALES_PK_OPTIMIZE',
 10        Source_Stmt          => 'select   product_cd,
 11                                          location_cd,
 12                                          date_of_sale,
 13                                          sum(sales_amt) s_sales_amt
 14                                 from     sales
 15                                 group by product_cd,
 16                                          location_cd,
 17                                          date_of_sale',
 18        Destination_Stmt     => 'select   product_cd,
 19                                          location_cd,
 20                                          date_of_sale,
 21                                          sales_amt
 22                                 from     sales',
 23        Validate             =>  TRUE,
 24        Rewrite_Mode         => 'GENERAL');
 25  end;
 26  /

PL/SQL procedure successfully completed.

SQL> select   product_cd,
  2           location_cd,
  3           date_of_sale,
  4           sum(sales_amt) sum_sales_amt
  5  from     sales
  6  group by product_cd,
  7           location_cd,
  8           date_of_sale
  9  having   sum(sales_amt) > 999.9
 10  /

Execution Plan
----------------------------------------------------------
Plan hash value: 781590677

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |    12 |   420 |   366   (5)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| SALES |    12 |   420 |   366   (5)| 00:00:05 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("SALES_AMT">999.9)

Thus we demonstrate that with this new package we do not need to materialized a new copy of the data and rewrite against it. Instead we give the optimizer the information that it can avoid the group by entirely, and if we add the appropriate indexes and partitions to the original table then we gain similar performance benefits as demonstrated for the materialized views — an exercise for the reader :D

Last, that syntactical curiosity I promised …

SQL> create materialized view sales_mv
  2  build immediate
  3  enable query rewrite
  4  as
  5  select   product_cd,
  6           location_cd,
  7           date_of_sale,
  8           sum(sales_amt)
  9  from     sales
 10  group by product_cd,
 11           location_cd,
 12           date_of_sale
 13  /

Materialized view created.

SQL> desc sales_mv
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------

PRODUCT_CD                                NOT NULL NUMBER
 LOCATION_CD                               NOT NULL NUMBER
 DATE_OF_SALE                              NOT NULL DATE
 SUM(SALES_AMT)                                     NUMBER

Fun, huh? I wasn’t expecting that.

Advertisements

3 thoughts on “More on Redundant Materialized Views

  1. Yes, aliases are A Good Thing. It seems to me that such code used to raise an error, and I only found this by chance. I am an alias kind-of guy myself.

  2. Pingback: The Oracle Sponge » Blog Archive » Upgrade Analysis and Virtualisation.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s