Oracle Siebel Analytics Data Warehouse: Part II

Following on from this, it occured to me this morning that accurate systems statistics may be an important performance factor for us, with all of these indexing options that the designers so thoughtfully provided. *ahem*

This is because there is much evidence that the SANs we connect to are tuned for OLTP-stlye high i/o rate performance, rather than high bandwidth BI performance. My usual instinct to promote full table scans in preference to index-based access methods when accessing a high proportion of a table have led me astray a couple of times recently and it is specifically the balance between index-based single block reads and the multiblock reads associated with table/partition scans (hypothetically supposing that we have partitions) that I want the optimizer to get right.

This very regular reason for statistics aside, we also have another interesting issue — a Very Important System that has a Very Particular Problem. Although a full-sized performance environment (PRF) is available (and used) before every release, the performance characteristics of the storage are exactly the opposite on the performance system as they are on the production (PRD) system. Specifically, PRF is fast to read and slow to write, and PRD is slow to read and fast to write (or is it the other way round? I forget). The mount points are on different SANs for security reasons (things are generally buttoned-down very tightly here, as you’d probably expect of a company with numerous external partners and a very public website) so that probably accounts for the effect in some way, but the implications are rather interesting — if we want to improve PRD performance, we have to tune the statements in PRF that are already fast.

So creating indexes in PRF is fast, because the read is large and the write is small. In production … well, let us just say it’s a bit of a time consumer. Creating a large summary table with a high volume of data performs about the same in both systems but the critical path for performance is different.

System statistics sadly do not distinguish between read and write performance in 10g (afaik), but wouldn’t it be interesting if they did? It’s a little thought experiment for later … “What database design and optimizer decisions might be affected by different balances between read and write performance?”

Hmmm …

Oh, I upped the optimizer_dynamic_sampling from 2 to 4 … or at least I raised a Remedy ticket to have a DBA do it. Another symptom of a buttoned-down environment :D

More on Partition Key Statistics, and an Oddly Harmful Constraint

As a followup to my previous post on the importance of gathering partition-level statistics on partition key columns it occured to me that if a table is list partitioned with a single value per partition then Oracle theoretically does not need statistics on that column to infer that only a single value is present. Continue reading

Partition Pruning and Bind Variables

Prompted by a question at the Dizwell Forum, here is a script to demonstrate that using bind variables in partition key predicates causes Oracle to use global (table) statistics instead of partition (or subpartition) statistics.


Result on

This is similar behaviour to Oracle’s use of global statistics in other circumstances, such as when joining on a partition key to a smaller table on which a predicate is placed. Anyway, see the forum post for other comments :D