I find database patents to be a fascinating source of information on what is, or what might in the future be, going on behind the scenes or our favourite RDBMS. They are often as good as a text book and can really expand the boundaries of how I think about the system.
Here are five interesting examples I rounded up.
Everyone who has used analytic functions seems to come away with a sense of expanded mental boundaries — they really are the stuff of Great Leaps Forward. Of such things are patents made, and here it is.
"A method and apparatus is disclosed for allowing access to one or more previously generated rows of data. A server receives a statement that requires delivery of one or more rows of data to a client, and designates a particular row of data as a current row within a generation sequence. As the rows of data are being delivered to the client, the server accesses select previously generated rows of data that has already been delivered to the client without designating a new particular row of data as the current row. Once the previously generated row is accessed, data is retrieved from predetermined columns. The data retrieved may be used to perform various calculations that will insert values into the current row prior to delivery of the current row to the client. The statement transmitted by the client may include various parameters, including a default parameter, that specifies a default value to be used by the server. The previously generated row is then retrieved based on an offset parameter specified by the client. If the location determined is outside a predetermined range of addressable rows in the generation sequence, then the default value specified by the default parameter is returned. The present invention may also be configured to use a buffer and manage dynamic memory allocated to the buffer."
Filed July 1999, issued March 2003. What takes these people so long?
Here's an interesting one.
"A method and apparatus are provided for performing deferred incremental refresh of summary tables that are derived from two or more base tables. Incremental refresh is performed by assigning a hypothetical load sequence to the base tables of the summary table. For each base table that contains new data that affects the summary table, a join is performed between
(1) the new data in the base table,
(2) the pre-update state of base tables that follow the base table in the hypothetical load sequence, and
(3) the post-update state of base tables the precede the base table in the hypothetical load sequence.
The results of the join are then merged with the existing summary table to refresh the summary table. According to one aspect of the invention, efficiency is improved by performing the joins for the various base tables in parallel."
I haven't done much work on the refresh of materialized view based on joins of multiple master tables, but when I do I'll be bearing this description in mind. What do you think is meant by "… a hypothetical load sequence …"? Pretend the one with the lowest object_id was the first? Intriguing.
"A method and apparatus is disclosed for implementing descending indexes in a database management system. A key value to be inserted within an index is decomposed into individual data bytes, such as an integer character value, that make up the key value and dictate the location where the key value will be inserted relative to other key values. A first function is applied to the data sequences in order to generate a reversed key value having reversed sorting properties. The reversed key values are then inserted into the index. The resulting index thus contains key values that are sorted in descending order. Additionally, indexes may be created on multiple key columns, with the sub_keys being independently sorted in either ascending or descending order. The present invention also discloses transformations that may be performed by a server in order to properly interpret statements that require the creation of standard or functional indexes in descending order."
Ah, so is that how they actually work then? I've read, through Tom Kyte's work I think, of descending-order indexes being function-based, and this would explain it I guess. Is the function publically accessible I wonder? If it is, is it of use outside of this?
Here is one of the issues at the heart of optimisation — in what order should the tables be accessed?
"Multiple initial orderings of tables are used a multiple starting point in a cost-base, cut-off search for a good ordering of tables in processing a database query that specifies multiple join operations. Multiple heuristics may be used to generate the multiple initial orderings of the tables. The database query is executed with the good ordering of tables."
I'd urge a good reading of the description section of the patent as an example of quality writing on the topic.
SQL optimisation again.
"Queries having a outer query block enclosing an inner query block, such as a reference to a view or a subquery, with a grouping operator, such as GROUP or DISTINCT, are transformed by merging the inner query block into the outer query block. The FROM clause of the transformed query includes the tables referenced in the FROM clauses of the outer query block and the inner query block. A new GROUP BY clause is created for the outer query block that contains the rowids of tables referenced in the FROM list of the original query and the expressions in the GROUP BY clause of the inner query block. The outer query block can be split if the outer query block also contains a GROUP BY or HAVING clause."
Here's a phrase I must use more often … "magic set transformation" … as soon as I'm more sure of what it means. Actually I think I do, and it looks like a similar technique to that used in the star transformation. So I'm going to start bandying it about forthwith.