Optimizing Materialized Views Part V: HOLAP Query Performance

HOLAP Performance With Different Aggregation Levels

I’m going to construct a slightly different test data set for this experiment, involving the aggregation of 10,000,000 rows by three different columns, each having around 1,000 distinct values.

drop table master;

create table master
 (col1    not null,
  col2    not null,
  col3    not null,
  tx_qty  not null)
pctfree 0
as
select
 trunc(dbms_random.value(0,1000)),  -- generator expressions are illustrative:
 trunc(dbms_random.value(0,1000)),  -- three columns of ~1,000 distinct
 trunc(dbms_random.value(0,1000)),  -- values each
 1
from
 dual
connect by
 1=1 and
  level <= 10000000;

The performance test queries will take the form of a query such as the following:

clear screen
alter session set workarea_size_policy = manual;
alter session set sort_area_size = 1000000000;
alter session set sort_area_retained_size = 1000000;
set timing on autotrace on feedback off

-- loop: the following executed four times
alter system flush buffer_cache;
select sum(s_tx_qty) from
 (select
   sum(tx_qty) s_tx_qty,
   {grouped column list}
  from master group by grouping sets
   ({grouping sets}));
-- end of loop

Of the four executions of the query, the first invokes a hard parse and its autotrace statistics are discarded. The autotrace results of the following three executions are averaged.

We will try aggregations at a number of different levels. Some of these will involve hierarchies and some will not, but the aim is to expose as full a range of optimizations as we can. For the sake of my time, my sanity, and blog length I’m going to run these to get simple wall clock times and autotrace statistics for now. The host machine and database will be as quiescent as possible.

We’ll start with simple, conventional aggregations to get benchmarks for familiar operations.

[Table: elapsed times and autotrace statistics for Benchmarks 1–4]

In the following table we present the results of the various grouping set tests, with the “Y” indicating that the grouping set was included as part of that test.

[Table: grouping set(s) used in each of Tests 1–8, with “Y” indicating inclusion]

Test 1 is a straight aggregation to find the sum of tx_qty, but expressed as a grouping set of () rather than the usual “Select Sum(my_col) From my_table” syntax. The autotrace statistics are also very conventional and show physical reads very close to consistent gets, with no db block gets or recursive SQL.

This test is logically the same as Benchmark 1, but required an elapsed time 50% longer. The execution plans are in fact different, with the SORT AGGREGATE of the conventional method being replaced with a SORT GROUP BY NOSORT in the grouping sets method.

Test 2 is similarly conventional, finding the sum of tx_qty grouped by col1. Again this is expressed as a grouping set (col1) rather than the regular syntax and the autotrace statistics are very similar to Test 1 with the addition of an in-memory sort. The performance is very similar to that of Benchmark 2, to which it is logically equivalent, and both queries are using a SORT GROUP BY operation.

Test 3 gives our first hierarchical result set, combining the results of Test 1 and Test 2. In fact the result set for Test 3 is logically equivalent to (Benchmark 1 UNION ALL Benchmark 2), and the benefits of the HOLAP cube begin to show themselves. The average elapsed time of 17.84 seconds is only 62% of the elapsed times for Tests 1 and 2 combined. The execution plan shows a SORT (GROUP BY ROLLUP) operation, which is evidently an optimization specifically for hierarchical cube queries. It appears either that the result for the (COL1) grouping set is being used as the basis for the () grouping set calculation, or that the results for both sets are being computed in the same operation. As we will shortly see, the latter appears to be the more probable scenario.
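As a rough illustration of why the combined cube beats two separate queries, consider a Python sketch of the single-pass strategy (this is only an interpretation of what SORT GROUP BY ROLLUP appears to do, not Oracle's actual implementation):

```python
from collections import defaultdict

def rollup_sums(rows):
    """One pass over (col1, tx_qty) rows producing both grouping sets:
    the per-col1 sums and the grand total, as ROLLUP(col1) would."""
    by_col1 = defaultdict(int)
    grand_total = 0
    for col1, qty in rows:       # single scan of the detail rows
        by_col1[col1] += qty     # the (col1) grouping set
        grand_total += qty       # the () grouping set, same pass
    return dict(by_col1), grand_total

groups, total = rollup_sums([(1, 10), (1, 5), (2, 7)])
# groups -> {1: 15, 2: 7}, total -> 22
```

One scan of the detail serves both grouping sets, so the incremental cost of the () set over the (col1) set is close to zero.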

Test 4 adds into Test 3 a grouping set of (Col2), giving two sets at the same level with no more detailed level included. This result ought to be logically equivalent to (Test 2 UNION ALL Test 3) — with the appropriate columns being used — or (Benchmark 1 UNION ALL Benchmark 2 UNION ALL Benchmark 3). The result is surprising because this time the elapsed time is well in excess of those comparisons, and gives the first hint at a possible contra-indication for HOLAP cube MVs. The elapsed time of 63.57 seconds is 86% higher than the combined elapsed times for Tests 2 and 3 of 34.18 seconds.

The explain plan shows the creation of two temporary tables, although when the statement is actually executed the temporary table names are of course different.

------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                        | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                             |     1 |       |   170  (85)| 00:00:02 |
|   1 |  SORT AGGREGATE              |                             |     1 |       |            |          |
|   2 |   VIEW                       |                             |   707K|       |   170  (85)| 00:00:02 |
|   3 |    TEMP TABLE TRANSFORMATION |                             |       |       |            |          |
|   4 |     LOAD AS SELECT           |                             |       |       |            |          |
|   5 |      TABLE ACCESS FULL       | MASTER                      |    10M|   114M|   162  (88)| 00:00:02 |
|   6 |     LOAD AS SELECT           |                             |       |       |            |          |
|   7 |      SORT GROUP BY           |                             |     1 |    26 |     3  (34)| 00:00:01 |
|   8 |       TABLE ACCESS FULL      | SYS_TEMP_0FD9D6665_1214756C |     1 |    26 |     2   (0)| 00:00:01 |
|   9 |     LOAD AS SELECT           |                             |       |       |            |          |
|  10 |      SORT GROUP BY ROLLUP    |                             |     1 |    26 |     3  (34)| 00:00:01 |
|  11 |       TABLE ACCESS FULL      | SYS_TEMP_0FD9D6665_1214756C |     1 |    26 |     2   (0)| 00:00:01 |
|  12 |     VIEW                     |                             |     1 |    39 |     2   (0)| 00:00:01 |
|  13 |      TABLE ACCESS FULL       | SYS_TEMP_0FD9D6666_1214756C |     1 |    39 |     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------
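To make the apparent strategy concrete, here is a rough Python sketch of the shape of that plan (an interpretation only — the first temporary table still holds all 10,000,000 detail rows, as the STAT line row counts below confirm):

```python
from collections import defaultdict

def grouping_sets_via_temp(rows):
    """Sketch of the TEMP TABLE TRANSFORMATION seen in the plan:
    materialize the needed columns once, then compute the (col2) set
    with one scan and the (col1) / () sets with a second, shared scan."""
    # LOAD AS SELECT: temp1 <- col1, col2, tx_qty from MASTER (still 10M rows)
    temp1 = [(col1, col2, qty) for col1, col2, qty in rows]

    # SORT GROUP BY over temp1: the (col2) grouping set
    by_col2 = defaultdict(int)
    for _, col2, qty in temp1:
        by_col2[col2] += qty

    # SORT GROUP BY ROLLUP over temp1: the (col1) and () sets in one pass
    by_col1 = defaultdict(int)
    grand = 0
    for col1, _, qty in temp1:
        by_col1[col1] += qty
        grand += qty
    return dict(by_col1), dict(by_col2), grand
```

If this reading is right, the extra write and two re-reads of a 10,000,000-row temporary table would go a long way towards explaining the 86% overhead against Tests 2 and 3 combined.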

Here are the STAT lines from a 10046 trace file:

STAT #6 id=1 cnt=1999 pid=0 pos=1 obj=0 op='TEMP TABLE TRANSFORMATION  (cr=71632 pr=66687 pw=23191 time=58587487 us)'
STAT #6 id=10 cnt=1999 pid=1 pos=1 obj=0 op='VIEW  (cr=144 pr=6 pw=0 time=27785 us)'
STAT #6 id=11 cnt=1999 pid=10 pos=1 obj=-40016278 op='TABLE ACCESS FULL SYS_TEMP_4254951018 (cr=144 pr=6 pw=0 time=19771 us)'
STAT #6 id=12 cnt=1 pid=1 pos=2 obj=0 op='LOAD AS SELECT  (cr=25074 pr=20328 pw=23185 time=19896368 us)'
STAT #6 id=13 cnt=10000000 pid=2 pos=1 obj=260950 op='TABLE ACCESS FULL OBJ#(260950) (cr=25074 pr=20328 pw=0 time=30000104 us)'
STAT #6 id=14 cnt=1 pid=1 pos=3 obj=0 op='LOAD AS SELECT  (cr=23208 pr=23185 pw=3 time=18510637 us)'
STAT #6 id=15 cnt=999 pid=4 pos=1 obj=0 op='SORT GROUP BY (cr=23208 pr=23185 pw=0 time=18508219 us)'
STAT #6 id=16 cnt=10000000 pid=5 pos=1 obj=-40016279 op='TABLE ACCESS FULL SYS_TEMP_4254951017 (cr=23208 pr=23185 pw=0 time=30022328 us)'
STAT #6 id=17 cnt=1 pid=1 pos=4 obj=0 op='LOAD AS SELECT  (cr=23206 pr=23168 pw=3 time=20142290 us)'
STAT #6 id=18 cnt=1000 pid=7 pos=1 obj=0 op='SORT GROUP BY ROLLUP (cr=23206 pr=23168 pw=0 time=20140880 us)'
STAT #6 id=19 cnt=10000000 pid=8 pos=1 obj=-40016279 op='TABLE ACCESS FULL SYS_TEMP_4254951017 (cr=23206 pr=23168 pw=0 time=30018947 us)'

Cleaned up a little, this reads more easily as:

TEMP TABLE TRANSFORMATION (rows=1,999)
  VIEW (rows=1,999)
    TABLE ACCESS FULL SYS_TEMP_4254951018 (rows=1,999)
  LOAD AS SELECT
    TABLE ACCESS FULL MASTER (rows=10,000,000)
  LOAD AS SELECT
    SORT GROUP BY (rows=999)
      TABLE ACCESS FULL SYS_TEMP_4254951017 (rows=10,000,000)
  LOAD AS SELECT
    SORT GROUP BY ROLLUP (rows=1,000)
      TABLE ACCESS FULL SYS_TEMP_4254951017 (rows=10,000,000)

The temporary table definitions are as follows. The first — the one scanned by the SORT GROUP BY and SORT GROUP BY ROLLUP operations — has three columns:

 ("C0" NUMBER,
  "C1" NUMBER,
  "A0" NUMBER )

and the second has an additional column "D0", presumably to distinguish which grouping set each row belongs to:

 ("C0" NUMBER,
  "C1" NUMBER,
  "D0" NUMBER,
  "A0" NUMBER )

You’ll note that the global temporary tables’ OBJNO clauses correlate with part of the temporary table names referenced in the STAT lines – a bit unintuitive, that, if you ask me.

So how do we interpret this? Fortunately, a 10046 trace file contains some interesting clues.

Although the trace shows the creation of the temporary tables, it does not reveal any other recursive SQL against them. However, it does of course show wait events, and each ‘db file scattered read’ carries p1 and p2 parameters giving the file number and block number, respectively, of the object being read. Now the only objects involved here ought to be our MASTER table and our two temporary tables, and the file and block numbers for the MASTER table are available from DBA_EXTENTS.
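As a sketch of that approach (the WAIT line format below is the 9i/10g style with p1/p2 parameters; later versions name them file#/block#, so the pattern allows both), the file and block numbers can be pulled out of the trace with a few lines of Python and then matched against the extent ranges from DBA_EXTENTS:

```python
import re

# Match 'db file scattered read' WAIT lines from a 10046 trace,
# capturing elapsed time, file number (p1) and block number (p2).
WAIT_RE = re.compile(
    r"WAIT #\d+: nam='db file scattered read' ela= *(\d+) "
    r"(?:file#|p1)=(\d+) (?:block#|p2)=(\d+)")

def scattered_reads(trace_lines):
    """Yield (file#, block#) for each multiblock read wait in the trace."""
    for line in trace_lines:
        m = WAIT_RE.search(line)
        if m:
            yield int(m.group(2)), int(m.group(3))

sample = ["WAIT #6: nam='db file scattered read' ela= 123 p1=7 p2=12345 p3=8"]
# list(scattered_reads(sample)) -> [(7, 12345)]
```

Reads whose (file#, block#) fall inside MASTER's extents can then be attributed to the table scan, and the remainder to the temporary tables.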

So I’m going to break off there for this entry, and present the analysis of the execution plan, and more, next time.

