Upgrade Analysis and Virtualisation.

It seems that with every passing month I spend more and more of my time thinking about the cost-based optimizer, which makes the continual changes to it with different Oracle versions a rather bitter-sweet experience. I get the pleasure of learning new “stuff” and the downside of realising how much of what I know is going to be redundant.

My client is soon upgrading from 9iR2 to 10gR2, and that is keeping me hopping with plenty of analysis and planning. Today I had the pleasure of running some more tests to produce some formal analysis of the upgradebenefits. I based the work heavily on two Oracle white papers that each describe optimizations with the move from 9iR2 to 10gR1 and from 10gR1 to 10gR2. So the work was really a consolidation of the two, minus some features that I thought the client wouldn’t benefit from.

The major performance benefits that I expect from the upgrade are, in order of advantage:

  1. Hash-based GROUP BY: This is a major benefit because GROUP BY is ubiquitous within the data warehousing environment. Wwe use it in generating summary data and in virtually every end-user query (except where the presence of a summary allows it to be optimised out of the execution plan).
  2. In-memory sort optimisation: I measured some very impressive results in sorting large data sets. This is mostly an ETL benefit as the reports in the clients environment generally sort only small results sets, but where the ETL process does sort it is for the purpose of promoting physical row ordering of fact tables. The amount of data to be sorted is generally very large and the sort key is often many columns wide.
  3. Pruning of disjunct-OR predicates: 10gR2 has some optimisations that improve the efficiency of pruning based on non-contiguous predicates placed on dimension tables. Many users will benefit from this for queries that require data between January 2006 and MArch 2006 OR between January 2005 and March 2005 — ie. year-to-year comparisons.

Aside from these transparent optimizer issues we are planning on eliminating our use of materialized view in favour of hand-coded summary table maintenance and use of DBMS_ADVANCED_REWRITE. In fact we already use hand-coded maintenance of the summaries, which yielded reductions in the update times from several hours down to several minutes, but not having the materialized view placed on them will make the management of them easier to comprehend for those .. um … at the back of the class.

Lastly there are some significant changes to initialisation parameters that affect partition pruning, so I’ll be having a close look at those as well.

So on to virtualisation.

Because of this increased focus on optimisation I find myself needing more and more specific versions of Oracle to test on, and because the maintenance of many versions on my regular system causes havoc with other applications such as Business Objects and Informatica I’ve gone virtual with the use of VMWare — a path that saner minds have been advocating for quite a while. You know who you are ;)

I’ve standardised myself on the use of CentOS 4 as an operating system, as it has not let me down before, but I’ve also got some OpenSuse 10.1 and Solaris 10 VMs running in case I need a diversion. The plan is to maintain Oracle versions as follows:

  1. The same as the client’s production system (say, 10.2.0.2).
  2. The “vanilla” version of the client release (10.2.0.1).
  3. The “bleeding-edge” version of the client release (10.2.0.3).
  4. Other significant releases, such as vanilla and bleeding-edge 10.1 and 9.2.

That ought to smooth the bumps on the road of analysis, anyway, and thanks to some help from HJR with networking I can access them all pretty seamlessly from the host machine (XP Pro with 1.5G of memory and 1Tb spread over five disks, by the way).

One handy trick I’ve found is that VMWare will mount the ISO files that most distributions provide, avoiding the need for burning an actual disc and allowing multiple VMs to have access to different ISO’s at the same time. If you have a capable burner — and my own choice is the clunky but reliable Cheetah — then you can also transmogrify the downloaded files from Oracle for installs and patchsets into ISO format and mount them as well, so you don’t have to burn them or ftp them or whatever. Handy stuff.

Advertisements

6 thoughts on “Upgrade Analysis and Virtualisation.

  1. Keep a close eye on “hash aggregation”, there are a couple of bugs around including one with “wrong results” and another with “excessive” CPU.

    I think you can find examples of both on Metalink, and I’ve seen an example of the excessive CPU on a customer site where a subquery could not unnest, and ran as a filter subquery with hash aggregation. Something went horribly wrong and the CPU usage climbed dramatically.

  2. Ah, all part of the excitement of moving to a new major release I’d say.

    I couldn’t reproduce the HASH GROUP BY wrong results bug, 4604970, on 10.2.0.2 and note 342443.1 indicates that it was fixed as part of 10.2.0.2 Patch 1 — happily we won’t have to wait until 11.1 (11w?) for this one.

    I couldn’t find the CPU-related bug, but my test results did show high CPU usage on hash group by (and also on in-memory sort optimization) compared to 9.2’s equivalents — they didn’t raise a red flag with me because although the usage was very high the total CPU usage of the operations was still lower than 9i and the duration was very much reduced. I’ll do some more investigation on those, I think. (There was no subquery involved in those tests, so maybe I’ll look at that example also).

    Thanks for the feedback, Jonathan.

  3. Just do a search on Metalink using the option:
    “and the exact phrase” with _gby_hash_aggregation_enabled,
    and select the Bug database as the source.

    Can you delete the first copy of my comment – I managed to post it by accident, and I can’t see an option to edit or delete it.

  4. Done on the delete.

    As a side note to this, isn’t Metalink infuriating sometimes? Querying bugs for _gby_hash_aggregation_enabled also yields instances where setting _gby_hash_aggregation_enabled=false was not a workaround. And it would be nice to be able to punch in an exact version number and find out which of these bugs we have to worry about without having to scan all the patch sets.

    Oh well, back to work.

  5. And it would be nice to be able to punch in an exact version number and find out which of these bugs we have to worry about without having to scan all the patch sets.

    Except… ever seen a bug which was introduced by a fix, so another fix breaks something else, and you can’t follow the trail because you hit duplicate bugs with invisible base bugs… and not every platform has the backport yet…

  6. Someone, somewhere must surely have a list of what versions of the software are prone to particular bugs though, and at what patch level they are fixed.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s