Why CURRENT TIMESTAMP produces poor primary keys
Comments (7) Visits (25663)
I've just finished debugging some quality assurance (QA) scenarios which failed just on Windows.
The failure ended up being in the test case implementation rather than DB2.
The kind of failure, however was interesting although I have seen the pattern repeatedly since my early days in DB2 development both internally as well as in customer code.
Therefore I figure it's a good topic to blog about.
Imagine a table recording events - perhaps for auditing, logging. Anything of the sorts:
CREATE TABLE event(stamp TIMESTAMP NOT NULL PRIMARY KEY, val INTEGER);
We now want to insert events and time stamp them as we do so.
The obvious choice is the CURRENT TIMESTAMP register
INSERT INTO event VALUES(CURRENT TIMESTAMP, 1);
This worked as expected. But what happens when we insert two rows at once?
INSERT INTO event VALUES (CURRENT TIMESTAMP, 2), (CURRENT TIMESTAMP, 3);
CURRENT TIMESTAMP is evaluated once at the beginning of an SQL Statement.
So two references will result in the same value.
This is by design so you can run predicates such as "CURRENT TIMESTAMP > c1 AND CURRENT TIMESTAMP < c2".
Fair enough - turn this into a batch of multiple SQL statements and we should be fine:
BEGIN ATOMIC ... END is still one SQL statement! This is an "inline" compound statement.
The phenomenon of a single CURRENT TIMESTAMP invocation it true for inline triggers, inline SQL functions and inline compound statements.
In order to get different values you must use a compiled composite objects such as a compiled trigger, compiled function or compiled compound.
Executing two different SQL statements will work as well, of course.
You know a SQL PL object is compiled if it uses BEGIN without the ATOMIC keyword or if the object is a stored procedure.
One more time without ATOMIC:
BEGIN INSERT INTO event VALUES(CURRENT TIMESTAMP, 1); INSERT INTO event VALUES(CURRENT TIMESTAMP, 2); END; /
Now that is interesting!
We did everything by the book and I can assure you that DB2 indeed retrieved the CURRENT TIMESTAMP from the system clock twice.
Yet, both invocations resulted in the exact same value.
This is because the system clock has a certain granularity.
On Windows two clock ticks are milliseconds apart.
So when you use CURRENT TIMESTAMP you have no guarantee you get a distinct value.
Further more, depending on the operating system the likely hood to run into this issue differs.
Note, that in older versions of DB2 CURRENT TIMESTAMP in fact remembered the last generated value and nudged the new value up by a bit to assure uniqueness.
But this was never documented behavior and eventually fell victim to performance optimizations.
How bad is the situation actually? A test will show:
DELETE FROM event;
It took 640 loops of assignments to get a net-new CURRENT TIMESTAMP.
Note that your measurements may be different since the outcome depends on timing between clock ticks, speed of the machine and OS.
To add insult to injury you may have no problems for years with sufficient elapsed time between two CURRENT TIMESTAMP invocation.
And then, because of a better plan, or faster machine you suddenly start seeing duplicate key errors.
This is what happened in those testcases I was debugging.
So what we have here is a real problem.
DB2 does have a means to get a timestamp that is good as a primary key.
We start of with the GENERATE_UNIQUE() function.
This function returns a VARCHAR(13) FOR BIT DATA which consists of a unique timestamp and some added information to enforce uniqueness across an MPP database.
You can see the BCD encoded date at the beginning of the hex string.
There is a little known TIMESTAMP() function which takes GENERATE_UNIQUE() as an input and extracts the timestamp portion from that.
Using that we can build ourselves a function which generates ever increasing unique timestamps.
Since the timestamp of generate_unique() is in UTC we need to readjust it to system time zone using CURRENT TIMEZONE(if that's what we want).
CREATE OR REPLACE FUNCTION new_timestamp() RETURNS TIMESTAMP
To make a long story short: Don't use CURRENT TIMESTAMP to generate unique values.