It seems that with every passing month I spend more and more of my time thinking about the cost-based optimizer, which makes the continual changes to it with different Oracle versions a rather bitter-sweet experience. I get the pleasure of learning new “stuff” and the downside of realising how much of what I know is going to be redundant.
My client is soon upgrading from 9iR2 to 10gR2, and that is keeping me hopping with plenty of analysis and planning. Today I had the pleasure of running some more tests to produce some formal analysis of the upgradebenefits. I based the work heavily on two Oracle white papers that each describe optimizations with the move from 9iR2 to 10gR1 and from 10gR1 to 10gR2. So the work was really a consolidation of the two, minus some features that I thought the client wouldn’t benefit from.
The major performance benefits that I expect from the upgrade are, in order of advantage:
- Hash-based GROUP BY: This is a major benefit because GROUP BY is ubiquitous within the data warehousing environment. Wwe use it in generating summary data and in virtually every end-user query (except where the presence of a summary allows it to be optimised out of the execution plan).
- In-memory sort optimisation: I measured some very impressive results in sorting large data sets. This is mostly an ETL benefit as the reports in the clients environment generally sort only small results sets, but where the ETL process does sort it is for the purpose of promoting physical row ordering of fact tables. The amount of data to be sorted is generally very large and the sort key is often many columns wide.
- Pruning of disjunct-OR predicates: 10gR2 has some optimisations that improve the efficiency of pruning based on non-contiguous predicates placed on dimension tables. Many users will benefit from this for queries that require data between January 2006 and MArch 2006 OR between January 2005 and March 2005 — ie. year-to-year comparisons.
Aside from these transparent optimizer issues we are planning on eliminating our use of materialized view in favour of hand-coded summary table maintenance and use of DBMS_ADVANCED_REWRITE. In fact we already use hand-coded maintenance of the summaries, which yielded reductions in the update times from several hours down to several minutes, but not having the materialized view placed on them will make the management of them easier to comprehend for those .. um … at the back of the class.
Lastly there are some significant changes to initialisation parameters that affect partition pruning, so I’ll be having a close look at those as well.
So on to virtualisation.
Because of this increased focus on optimisation I find myself needing more and more specific versions of Oracle to test on, and because the maintenance of many versions on my regular system causes havoc with other applications such as Business Objects and Informatica I’ve gone virtual with the use of VMWare — a path that saner minds have been advocating for quite a while. You know who you are ;)
I’ve standardised myself on the use of CentOS 4 as an operating system, as it has not let me down before, but I’ve also got some OpenSuse 10.1 and Solaris 10 VMs running in case I need a diversion. The plan is to maintain Oracle versions as follows:
- The same as the client’s production system (say, 10.2.0.2).
- The “vanilla” version of the client release (10.2.0.1).
- The “bleeding-edge” version of the client release (10.2.0.3).
- Other significant releases, such as vanilla and bleeding-edge 10.1 and 9.2.
That ought to smooth the bumps on the road of analysis, anyway, and thanks to some help from HJR with networking I can access them all pretty seamlessly from the host machine (XP Pro with 1.5G of memory and 1Tb spread over five disks, by the way).
One handy trick I’ve found is that VMWare will mount the ISO files that most distributions provide, avoiding the need for burning an actual disc and allowing multiple VMs to have access to different ISO’s at the same time. If you have a capable burner — and my own choice is the clunky but reliable Cheetah — then you can also transmogrify the downloaded files from Oracle for installs and patchsets into ISO format and mount them as well, so you don’t have to burn them or ftp them or whatever. Handy stuff.