Buffer Busy Waits

Here’s a curiosity. Or a banal observation, depending on your perspective.

There are, broadly speaking, two different causes of buffer busy waits. They are nicely described in the documentation here.

  1. Waiting for a block to be read into the buffer cache by a different session.
  2. Waiting for a block in memory to become available in some way (because it is being modified by another session, for example).

The second of these is the one you are likely to observe in an OLTP system; the first is more likely in a reporting system with multiple non-parallel full table/partition scans.

The curiosity is that the cures for these two causes are exactly opposite to each other. To reduce the number of buffer busy waits you either have to increase the number of rows per block (to reduce the number of blocks needed to fulfill a full table scan) or reduce the number of rows per block (to reduce the chance of multiple sessions requiring rows that are in the same block).
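
To make that concrete, here is a minimal sketch of the sort of checks and changes involved, assuming a conventional heap table with reasonably fresh optimizer statistics (the owner and table names are invented for illustration):

    -- Rough rows-per-block estimate from the optimizer statistics
    select num_rows / nullif(blocks, 0) as approx_rows_per_block
    from   dba_tables
    where  owner = 'REPORTING'
    and    table_name = 'FACT_SALES';

    -- The OLTP-style cure: reduce rows per block by leaving more free space in
    -- each block. PCTFREE only affects newly formatted blocks, so the segment
    -- has to be rebuilt (and its indexes rebuilt afterwards) for existing data.
    alter table reporting.fact_sales pctfree 40;
    alter table reporting.fact_sales move;

    -- The full-scan cure: increase rows per block by reclaiming the free space,
    -- again followed by a rebuild.
    alter table reporting.fact_sales pctfree 5;
    alter table reporting.fact_sales move;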

So what to make of a consultant who sees a high number of buffer busy waits on a read-only reporting system, and advises that you increase pctfree or partition the table (sic) to reduce the number of rows per block?

Incidentally, I have a headache. Possibly it is caused by spending so much of the day frowning.


10 thoughts on “Buffer Busy Waits”

  1. I know some of your questions are rhetorical – but I’m trying to learn here, so I’d like to quickly verify my answers with you. I understand why reducing the number of rows per block is not going to help. I think increasing pctfree would achieve the goal of reducing the number of rows per block (you just don’t really want to). But I don’t see how partitioning the table would reduce the rows per block at all, unless it were partitioned into many small pieces. Is that right, or am I missing something?

  2. I didn’t spend too much time thinking about it, to be honest, but my gut reaction was the same: “how would that help?”

    I wonder, though, whether it is a matter of spreading DML activity across multiple data and index segments, and thus reducing contention for header blocks. Not really my speciality, and not relevant to the root cause of the symptom in this case, so if anyone else has an explanation for why this might reduce BB waits in some circumstances then I’d be very happy to hear it.

    Actually, I suppose it might also help with reducing FTS-related BB waits if partition pruning can reduce the overall i/o requirement for an FTS, but the effectiveness of a remediation like that really rests on reducing i/o requirement rather than reducing BB waits, in the same way that increasing rows per block does. BB waits caused by full table scans are really what I’d think of as a secondary symptom of a problem — the primary one being a high scattered (or direct path) read volume.

  3. Did anyone happen to look at v$segstat? waitstat? Is the system so busy that processes can pin the buffer, go to sleep and make everyone else wait? I think you may be oversimplifying the two causes; there may be side effects of other problems. So what other problems are there?
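
    Something along these lines would at least show whether the waits are concentrated on data blocks, segment headers or undo, and in which segments (a rough sketch against the standard views; filter by owner or object as appropriate):

        select class, count, time
        from   v$waitstat
        order  by time desc;

        select owner, object_name, object_type, value
        from   v$segment_statistics
        where  statistic_name = 'buffer busy waits'
        order  by value desc;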

  4. Never worry, David:

    the consultant will be out of there long before any responsibility for those recommendations can be attributed.

    Then it will be the resident techo’s problem to explain to damagement how he made the “expert solution” fail.

    And the show goes on…

  5. Joel, the system isn’t busy in any sense other than i/o. CPU usage is really very low, around the 10% mark even when the i/o is pegged, and the memory is entirely adequate. The performance problems we have are really associated with very low host-to-SAN bandwidth, which is also being addressed. Until that’s sorted out I think there’s little to be gained elsewhere.

    Noons, we had a similar situation just the other week. A consultant recommended increasing max parallel servers from 16 to 120, which seemed to promote some interesting changes in execution plan. In particular we started to get a lot of parallel activity on indexes instead of parallel FTS, and that not only slowed us down but demonstrated in the worst way possible that at some point in the SAN the affected reporting system crosses paths with Siebel. Well, at least we proved that suspicion, and it was easy to revert the change.

    The consultant is coming back for a more in-depth look on a two week engagement — the previous analysis was based on a two day skim of a handful of systems. Should be an interesting time.
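
    The change itself, incidentally, amounts to no more than something like this (assuming parallel_max_servers is the parameter in question and an spfile is in use):

        -- the consultant's recommendation ...
        alter system set parallel_max_servers = 120 scope = both;

        -- ... and the easy reversion
        alter system set parallel_max_servers = 16 scope = both;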

  6. Hmm, you might well see a paradoxical effect from reducing the rows per block, by increasing the chance that each request is for a block that isn’t busy. Either that or the opposite. You might be seeing a lot of bs waits about now :-)

    But there is also the possibility you have fts from one or more processes catching up to the block of the fts of a leading process, in which case you would see them all being slower having to read more blocks off the slow device. Jonathan Lewis noted that in a thread in cdos helping a newbie not too long ago, Oracle Performance — Possible Disk Bottleneck Options (search for concurrent tablescans).

  7. “But there is also the possibility you have fts from one or more processes catching up to the block of the fts of a leading process, in which case you would see them all being slower having to read more blocks off the slow device. ”

    Yes, exactly the case I think. This is exacerbated by having scheduled reports kick off at the same time and requesting the same FTS, which tends to promote BB waits.

    However, I’m inclined to think that this really isn’t much of a problem. That’s because, to take a theoretical case, although you might have three processes all scanning the table at the same time and thus have one waiting predominantly on scattered reads and the other two on BB waits (to oversimplify), the alternative is actually to incur three times the number of physical read requests. So, on a system with nearly choked i/o, two processes are essentially getting a free ride on the back of another that is requesting the physical reads on behalf of them all. If the two BB wait-heavy processes were issuing their own read requests then that would represent a net increase in the physical read workload, and slower system throughput. Paradoxically, the presence of FTS-related BB waits on a system with choked i/o is actually better than one of the alternatives that would eliminate them.

    Well, maybe there’s a complexity that I’m not thinking of there, but the paradox of Buffer Busy Waits as a sign of efficiency appeals to me, even if it is in the extremely narrow context of a dysfunctional system.

  8. Mmmm, of course some actual measurements would decide whether the waits are better, but I can’t help thinking of 25 years ago when I had this same issue on a datatrieve/RMS system (you could watch the progress simply by watching the reports grow), and running the reports serially was way faster. What would worry me in the modern case is latches being held by sleeping processes. How bad that is is likely version dependent; I know it used to be bad.

  9. I meant to say also, “so what if there are three times more disk accesses, if they aren’t interfering with each other? Choking systems go downhill fast.”
