Generating A Gap-free Series Of Numbers — Not Always A Problem

Well, it’s happened again. Someone has asked how to implement a requirement to generate a gap-free series of numbers and a swarm of nay-sayers have descended on them to say (and here I paraphrase slightly) that this will kill system performance, that’s it’s rarely a valid requirement, that whoever wrote the requirement is an idiot blah blah blah.

As I point out on the thread, it is sometimes a genuine legal requirement to generate gap-free series of numbers. Invoice numbers for the 2,000,000+ organisations in the UK that are VAT (sales tax) registered have such a requirement, and the reason for this is rather obvious: that it makes it more difficult to hide the generation of revenue from tax authorities. I’ve seen comments that it is a requirement in Spain and Portugal, and I’d not be surprised if it was not a requirement in many other countries.

So, if we accept that it is a valid requirement, under what circumstances are gap-free series* of numbers a problem? Group-think would often have you believe that it always is, but in fact it is only a potential problem under very particular circumstances.

  1. The series of numbers must have no gaps.
  2. Multiple processes create the entities to which the number is associated (eg. invoices).
  3. The numbers must be generated at the time that the entity is created.

If all of these requirements must be met then you have a point of serialisation in your application, and we’ll discuss that in a moment.

First let’s talk about methods of implementing a series-of-numbers requirement if you can drop any one of those requirements.

If your series of numbers can have gaps (and you have multiple processes requiring instant generation of the number) then use an Oracle Sequence object. They are very high performance and the situations in which gaps can be expected have been very well discussed. It is not too challenging to minimise the amount of numbers skipped by making design efforts to minimise the chance of a process failure between generation of the number and commiting the transaction, if that is important.

If you do not have multiple processes creating the entities (and you need a gap-free series of numbers that must be instantly generated), as might be the case with the batch generation of invoices, then you already have a point of serialisation. That in itself may not be a problem, and may be an efficient way of performing the required operation. Generating the gap-free numbers is rather trivial in this case. You can read the current maximum value and apply an incrementing value to every entity with a number of techniques. For example if you are inserting a new batch of invoices into your invoice table from a temporary working table you might:

insert into
  invoices
    (
    invoice#,
    ...)
with curr as (
  select Coalesce(Max(invoice#)) max_invoice#
  from   invoices)
select
  curr.max_invoice#+rownum,
  ...
from
  tmp_invoice
  ...

Of course you would protect your process so that only one instance can run at a time (probably with DBMS_Lock), and protect the invoice# with a unique key contrainst, and probably check for missing values with separate code if you really, really care.

If you do not need instant generation of the numbers (but you need them gap-free and multiple processes generate the entities) then you can allow the entities to be generated and the transaction commited, and then leave generation of the number to a single batch job. An update on the entity table, or an insert into a separate table.

So if we need the trifecta of instant generation of a gap-free series of numbers by multiple processes? All we can do is to try to minimise the period of serialisation in the process, and I offer the following advice, and welcome any additional advice (or counter-advice of course).

  1. Store your current values in a dedicated table. DO NOT use a sequence.
  2. Ensure that all processes use the same code to generate new numbers by encapsulating it in a function or procedure.
  3. Serialise access to the number generator with DBMS_Lock, making sure that each series has it’s own dedicated lock.
  4. Hold the lock in the series generator until your entity creation transaction is complete by releasing the lock on commit (using release_on_lock=true in DBMS_Lock.request)
  5. Delay the generation of the number until the last possible moment.
  6. Consider the impact of an unexpected error after generating the number and before the commit is completed — will the application rollback gracefully and release the lock, or will it hold the lock on the series generator until the session disconnects later? Whatever method is used, if the transaction fails then the series number(s)  must be “returned to the pool”.
  7. Can you encapsulate the whole thing in a trigger on the entity’s table? Can you encapsulate it in a table or other API call that inserts the row and commits the insert automatically?

I’m sure that there is more to add, so feel free to comment.

Finally, when someone asks “How do I generate a gap-free sequence?” can we please not all jump on the Bad Requirement Bandwagon? By all means question the need for it, advise that sequence objects are inappropriate, point out the potential problems, but please keep an open mind — sometimes a requirement really is a requirement.

* I can’t believe I actually looked up the plural of “series”, but I did. It’s “series”.

Working Around an Internal Error Based on Source Mnemonics

My frenzied hacking through the unexplored territory of Expression Filters was derailed (to mix my metaphors) by an internal error when using the Evaluate() function in a rather complex query that uses an inline view, analytic functions etc..

The following was extracted from the alert log:

ORA-07445: exception  encountered: core dump [kkqsCkLegalEqvExpCB()+199] [SIGSEGV] [Address not mapped  to object] [0x000000000] [] []

The mnemonic beginning “kkqs” in the first parameter apparantly indicates a fault raised from the query rewrite source module. Because query rewrite is not helpful to the execution of the query I coded an alter session to set query_rewrite_enabled to false, and executed the problem query again.

Success.

Interestingly, adding a NO_REWRITE hint to the query did not prevent the error occurring. A shame, because I would have preferred that approach.

According to the website query rewrite problems might also be indicated by any of the codes kkqg, kkqs, kkqs1, kkqs2, kkqs3, kkqu, kkqv, and kkqw, so disabling query rewrite might also be a good workaround for internal errors that reference mnemonics beginning with any of those strings.

It also prompts the thought that in some cases it might be possible to get guidance on working around internal erros based on the mnemonic. For example an error in “kkfd” probably indicates a parallel query fault, and the remediation might be to disable parallel query. “kxtr” … trigger problem?

And there I was thinking that all of these kernel code things were for nerds.

Stupid Yahoo Mail

I just browsed my Yahoo spam folder and found an email purporting to be from an ex-work colleague. It was sent to a bunch of other ex-colleagues as well and contained a link to what looked like a site for selling fake ED pills.

I replied to the colleague to say, “Hey looks like you’ve been hacked! Ha ha ha”, and was told by Yahoo:

Your message was not sent

Your account has been temporarily blocked from sending messages. This block can be caused by sending messages that trigger our spam filters, or by having too many recipients in one email. We encourage you to review the contents and recipient list of your message, and try sending it at after an hour or two. Doing so will usually resolve the matter.
If you are still unable to send messages after a 24-hour period, please read our FAQ for more information and to request Customer Care assistance.


We apologise for the inconvenience.

Thanks,

The Yahoo! Mail Team

Thanks for apologising for locking my account, assholes.

“If you are still unable to send messages after a 24-hour period …”? Well, I don’t think I’ll bother with your “Customer Care assistance” as it has done me fuck-all good in the past. I think I’ll just make that switch to a less irritating email provider that I’ve been thinking about for the past ten years.

A Cause of ORA-38435 in Evaluate()

I’m part way through an implementation of some ELT code that uses expression filters to parse records and assign zero-to-many “inboxes” to each row, and as is often the case with the less well-explored parts of the system it often feels like I’m hacking through the undergrowth in unknown territory.

Under such conditions it’s not surprising to hit an error that few others seem to have hit before, and for which the majority of Google hits are for that irritating type of site that merely lists all of the Oracle errors in the documentation whether or not they have information on how to fix them.

Thus it is with “ORA-38435: missing elementary attribute value or invalid name-value pairs”.

Here’s how I found the root cause of the error in my case.

It was thrown when running a SELECT statement with the EVALUATE condition. The code looks something like this (slightly obfuscated for no real reason other than paranoia):

select
 i.inbox_name,
 D.*
from
 INBOX_EXPRESSIONS i,
 DATA_TABLE D
where
 evaluate (
   I.INBOX_EXPRESSION,
     My_Event(
       D.SERVICE,
       D.FILTER,
       ... blah blah blah ...
       D.SUMMARY,
       D.IDENTIFIER).getVarChar()
 )= 1
/

Firstly, the error was thrown in a predictable fashion after something like the 450th row of the result set had been returned. I re-executed it through SQL*Plus with an arraysize of 1, slightly modified to just return the ROWID of DATA_TABLE.

This result set showed ascending values of ROWID, which agreed with the execution plan that showed a full table scan of DATA_TABLE.

The error was thrown after the ROWID “AABonGAAJAAH1UeAAA” was returned.

I modified the query to something like:

select
 i.inbox_name,
 D.ROWID
from
 INBOX_EXPRESSIONS i,
 (SELECT * FROM DATA_TABLE WHERE ROWID = 'AABonGAAJAAH1UeAAA') D
where
 evaluate (
   I.INBOX_EXPRESSION,
     My_Event(
       D.SERVICE,
       D.FILTER,
       ... blah blah blah ...
       D.SUMMARY,
       D.IDENTIFIER).getVarChar()
 )= 1
/

This returned a correct row.

I then incremented the ROWID to ‘AABonGAAJAAH1UeAAB’ and re-executed, getting 0 rows. I kept incrementing until I reached ‘AABonGAAJAAH1UeAAF’ at which point the error was thrown.

Ah hah.

From the error stack it looked as if the error was raised by Evaluate, rather than the function My_Event.GatVarChar(), so I tested that by including the call to My_Event.GetVarchar() in a query against the columns of DATA_TABLE, and to see what it was returning for each row. Hopefully I’d spot something that looked like bad syntax on the error-raising row that might trip up the Evaluate() function.

And there it was:

SERVICE=>to_number('0'),FILTER=>'abc', ... ,SUMMARY=>'System's IPs are not responding.',...'

The inclusion of a single quote in the value of the column SUMMARY caused the GetVarchar function to construct an invalid string that was then passed to Evaluate().

The fix was fairly straightforward of course — replace every occurrence of a single single-quote with two single-quotes in the string columns.

select
 i.inbox_name,
 D.*
from
 INBOX_EXPRESSIONS i,
 DATA_TABLE D
where
 evaluate (
   I.INBOX_EXPRESSION,
     My_Event(
       D.SERVICE,
       Replace(D.FILTER,'''',''''''),
       ... blah blah blah ...
       Replace(D.SUMMARY,'''',''''''),
       Replace(D.IDENTIFIER,'''','''''')).getVarChar()
 )= 1
/

I’d not be surprised if there were other potential causes.

And by the way the handsome list of single-quotes reminds me of this:

James, while John had had “had”, had had “had had”; “had had” had had a better effect on the teacher

My_Event.GetVarchar()

Serious Space Savings

I knew this would happen before I even tried it.

I Request: “The SYSAUX tablespace is 500MB and it’s nearly full. The contents are mostly the default 30 days worth of DBMS_Scheduler logs. Can you increase it to 2GB?

A DBA Responds: “I’ve increased it to one GB for now“.

An entire gigabyte of disk space saved. Well done.

Now why didn’t I ask for 4GB?

Pushing the Technical Frontier

I have a dilemma.

In the BI system that I’m working on I have the perfect application for the use of Expression Filters. It is almost as if someone specifically devised this problem in order to tempt me into using them, and I really am very tempted indeed.

Let me explain.

In the application which feeds data to the BI system users can define “display filters” that determine whether network alerts are displayed in particular windows or not. For example they might create a filter called “Failed Core Devices” for which the definition reads: “IS_CORE=’Y’ and FAILED>=1”. Network events which meet those criteria are displayed in that window. They might also have a filter called “Failed Cisco Devices” with a definition: “DEV_TYPE LIKE ‘CISCO’ AND FAILED >= 1”. So, a failed Cisco core device will actually appear in both windows.

The requirement is to tag which windows an event was displayed in at a particular time, and because there are no data elements for each event to tell you the display window(s) the only way to work that out is to evaluate the expression for each event.

This is exactly what an Oracle Expression Filter does, and although I would have to convert the definitions from Netcool ObjectSQL to Oracle SQL (LIKE in Netcool ObjectSQL is a regular expression, so “DEV_TYPE LIKE ‘CISCO'” has to be converted to “REGEXP_LIKE(DEV_TYPE,’CISCO’)”), they are an obvious fit for the problem.

My query to determine the filter names for the events would be reduced to something like:

select
  e.event_id,
  f.filter_name
from
  events e,
  filter_definitions f
where
  Evaluate(
    f.filter_condition,
    get_filters.getVarChar(
      e.dev_type,
      e.is_core,
      e.failed)) = 1

Which would obviously be pretty neat and tidy, considering the alternatives (think of how you might implement an expression such as “(COL1 NOT IN (1,3,4,5,78,900) AND COL2 IS NOT NULL) OR NOT REGEXP_LIKE(COL3,’^WHATEVER’)” by other means as simply as you would with an expression filter). It could be implemented as part of the ELT or as a filter or join in Business Objects (or your current query tool of choice).

However, what happens if I move on to another client and this method fails because of a new filter definition that references a new data element not currently included in the Expression Filter attribute set? Or an upgrade goes wrong? Or a patch breaks it? Or we discover a bug? Can I sleep at night if I’m leaving code behind that will have most reasonably competent Oracle practitioners reaching for Google to find out what EVALUATE is, and then has them reading documents for the next couple of hours? Even if I sit the support staff down and show them all how it works, that knowledge will have evaporated in a couple of months.

Documentation? Well I can certainly document how expression filters work, and how to detect problems and perform apparantly simple tasks like modifying the attribute set (I already did something similar for DBMS_Scheduler chains), but I’m not going to think of everything and even then do I want people to be following written instructions without understanding the architecture of expression filters? It’s pretty easy to get into a mess over this object oriented stuff.

The alternative? Currently unknown. The few options identified are prone to inaccuracy or imprecision (eg. they’ll only identify a single filter instead of the full set), and they tend to require continual maintenance of the ELT or the query tool or both whenever a filter is created, deleted or modified.

So to summarise: do I implement a solution that is a technical exact-match with potentially high future maintenance problems, or a poor technical solution with definite low-level ongoing maintenance problems.

At the moment, I have no idea. Input welcome.