Data Type of Partition Key Column Cannot Be Inferred Through Foreign Key Reference

I remember coming across this issue a few years ago and as I just hit it again I thought I’d “pass the note round the class”.

When you create a table that has foreign keys then a nicely robust way of specifying the data type for the foreign key columns is by allowing Oracle to infer it. Try the following script … Continue reading

Advertisements

Partition Pruning and Bind Variables

Prompted by a question at the Dizwell Forum, here is a script to demonstrate that using bind variables in partition key predicates causes Oracle to use global (table) statistics instead of partition (or subpartition) statistics.

Script

Result on 9.2.0.4

This is similar behaviour to Oracle’s use of global statistics in other circumstances, such as when joining on a partition key to a smaller table on which a predicate is placed. Anyway, see the forum post for other comments :D

Histograms For Join Predicates (or “Hooray for technical content!”)

Someone sent me a scenario the other day similar to this:

A small dimension table, for example of US State names, has a synthetic primary key (say STATE#) and a unique key on the real-world value (STATE_NAME), and a much larger fact-type table has an indexed foreign key to this lookup table. The distribution of values in the fact table is very skewed. Is it possible for the optimizer to take into account the skew of STATE# in the fact table when it is joined to the dimension table and a predicate is placed on STATE_NAME?

Well, a tricky issue. Continue reading

The Three Pillars of Oracle Data Warehousing

Introduction

This is a basic topic for Oracle data warehousing beginners, based on some ideas that I’m hoping will stop buzzing round in my head if I commit them to virtual paper.

There are three Oracle features that provide a foundation for successful data warehousing:

  • Partitioning
  • Parallelism
  • Psummary Tables (the “p” is silent)

Here are the benefits that they bring to the system. Continue reading

My Dishwasher is List Partitioned

My dishwasher has two levels in it. The bottom level is for plates and Large Things, the top level is for cups and Small Things, and there is a removable cutlery partition attached to the door.

If I had a spare cutlery partition then I could keep one of them in use by the dishwashing system while the other one is treated as a regular cutlery holder, and it could be filled with dirty cutlery as the items become available. When it was full I could perform a cutlery partition exchange between the one with clean items and the one now full of dirty items.

If I had spare levels then they could be treated in a similar manner. I believe that commercial dishwashers have exactly that configuration, thus they operate with lower downtime because of this exchange mechanism, although the overall configuration requires more space.

Within the cutlery partition there are six subpartitions. I like to fill each one with a single type of cutlery — one for knives, two for spoons (they don't stack as well), a couple for forks, and one for other items. Although it is more work to separate the items into these subpartitions it has the advantage of physically clustering all the spoons together and I can access all items of one type without having to scan the complete cutlery partition.

For the upper and lower levels similar principles apply, although they are not really subpartitioned in the same way. Instead the large plates are clustered in a single contiguous range — the small plates, the glasses and the mugs each have their own place. Again it is more work to insert the items like this, but the advantage of faster retrieval is similar because I don't have to scan the complete level to pick similar items out from dissimilar ones.

That is all.

Choosing Partitioning Keys: For ETL or Reporting?

This is one of the recursive thought processes that has been spun off from considering ways of optimizing the fast refresh of multiple materialized views: I must now be at a recursive depth well into double figures because I cannot recall how I got here at all. Anyway …

I was just browsing the 10gR2 documentation, and noticed the following comments here:

“The partitioning scheme of the data warehouse is often crucial in determining the efficiency of refresh operations in the data warehouse load process. In fact, the load process is often the primary consideration in choosing the partitioning scheme of data warehouse tables and indexes. Continue reading