A partitioning-related performance problem came my way last week that had me scratching my head, until a 10053 trace illuminated the path to a solution.
The problem was one of poor performance for a query that joined two range partitioned tables (having different partitioning keys — in fact one is multicolumn range/list composite partitioned, the other single column range partitioned) via two dimension tables. The immediate sign of a problem was the extensive use of nested loop joins, and that led to the observation that the cardinality of the result set from one of the partitioned tables was low. Initially I thought that this was due to the use of global statistics, as the query required the use of dynamic partition pruning (albeit to a single partition). When the predicates were manually rewritten against the fact table itself the estimated cardinality reduced even more — in fact it was now 10,000 times lower than the real cardinality, being 388 against a real value of nearly four million.
Applying dynamic sampling hints at level 5 for the fact table corrected the estimation very well, but as the stored partition statistics were accurate there was more investigation to be done.
A 10053 trace was executed, and the following entry noted.
SINGLE TABLE ACCESS PATH
Column: FISCAL_MON Col#: 3 Part#: 84 Table: FCT_FSA_1 Alias: FSA
NO STATISTICS (using defaults)
NDV: 121206 NULLS: 0 DENS: 8.2504e-06
NO HISTOGRAM: #BKT: 0 #VAL: 0
As is often the case, the penny dropped immediately — no statistics were stored at the partition level for the partition key column, and this was confirmed by querying USER_PART_COL_STATISTICS. The reason for that was our extensive use of the DBMS_STATS method option clause “FOR ALL INDEXED COLUMNS SIZE 254”. Because the column has a single value per partition we are not indexing it, therefore the column was not subject to statistical analysis, therefore the optimizer was making an incorrect assumption of the cardinality.
In fact this column had been indexed in the past but a rash of performance problems involving inappropriate use of the index had caused us to drop it. From that time onwards the partitions contained no statistics on that partition key column.
I put together a script to identify all partition key columns that are missing statistics at the partition level — I believe I’ll also have to do the same for subpartitions, although in that case I’ll be looking for subpartition and partition key columns that have no statistics stored at the subpartition level.
A similar query allows the identification of those columns that require statistics to be gathered for particular partitions. I could either reanalyze the partitions or use DBMS_Stats to set the statistics manually, which ought to be pretty straightforward as the appropriate values can be derived directly from the partition names. However life is too short for the learning curve on that second option at the moment, so re-analyze it will be.
And so on to how we fix our processes to avoid this in future.
- Idea 1: Start specifically naming the columns for which we want to gather statistics
- Idea 2: Use the REPEAT clause to collect histograms only on the columns that already have them
I’m not sure that the REPEAT is what we want though. I think I’m more comfortable with just listing the columns and the appropriate size, which I’ll probably specify either as 1 (no histogram) or 254 (histogram). This number would obviously change at the partition and the global levels for columns such as FISCAL_MONTH for which there is a single value at the partition level but multiple values at the global level.
- Idea 3: I wonder whether there is value in creating a new framework for specifying statistics gathering parameters, involving a series of metadata tables that can be used to specify global, partition and subpartition-level options for building DBMS_Stats.Gather_Table_Stats calls dynamically? A table of TABLE_NAME, COLUMN_NAME, GRANULARITY, SIZE_CLAUSE, something like that. Sounds like a reasonably large undertaking that’s not going to happen this afternoon, though.
- Idea 4: Dynamic statistics gathering produced such a good estimate that I wonder whether there is value in just dropping all statistics from the table, global and local, and handing the job over to query-time estimation? It worked well for the query in question, but how would more complex queries fare? How would it impact query rewrite?