Iceberg time travel queries
Time travel queries for Datalake Iceberg tables are similar to Db2 temporal queries in that they allow you to specify a time period clause on a SELECT statement to query table data as it was at some specific point in time.
Iceberg time travel queries leverage a table's snapshots to determine the data that should be queried for the specified time period clause. The schema used for time travel queries is always the current table schema.
The time period clause can reference a snapshot id, a named reference to a snapshot id or a timestamp. When specifying a timestamp, the snapshot taken on or before the timestamp is used. The information needed to perform time travel queries using a table's snapshot id or timestamp can be obtained using the TABLE_SNAPSHOTS table function. The information needed to perform time travel queries using table snapshot references can be obtained using the TABLE_SNAPSHOT_REFS function.
select * from employee for system_time as of '2024-08-04 08:26:30'
ID NAME AGE
----------- -------------------- -----------
1 Sam Smith 25
2 Sarah Richards 35
3 Amy Dean 44 Since time travel queries rely on table snapshots, expiring them using the EXPIRE_SNAPSHOTS stored procedure will affect your ability to perform time travel queries. If for example, you expire one or more snapshots, you will not be able to query the data as it existed when those snapshots were taken.
Time period clauses:
FOR SYSTEM_VERSION AS OF snapshot id or named reference
-
The AS OF value can be specified as a snapshot ID or a named reference to a snapshot. The rows returned are those in the table when the snapshot was taken.
FOR SYSTEM_TIME AS OF timestamp
-
The AS OF value specifies the point in time used for the query. This value is used to select the snapshot taken on or before the timestamp value. The rows returned are those in the table when the snapshot was taken.
Usage Notes
It is possible to use the Db2 CURRENT TIMESTAMP and CURRENT DATE special registers to specify the timestamp value. These special registers return the time-of-day clock value when the SQL statement is executed on the current Db2 server which may be different from the time-of-day clock on the system used to generate a table's snapshots resulting in unexpected results. To ensure the intended data is returned for queries using a timestamp, it is recommended you use the CREATE_TS value returned by the TABLE_SNAPSHOTS table function.
Examples
SELECT * FROM EMPLOYEE FOR SYSTEM_TIME AS OF '2024-08-03-08.26.30.773000'
ID NAME AGE DEPT_ID
----------- -------------------- ----------- -----------
1 Sammy Bean 27 111
2 Sarah Connor 35 111
SELECT * FROM EMPLOYEE for SYSTEM_VERSION AS OF 1429911045991981825
ID NAME AGE DEPT_ID
----------- -------------------- ----------- -----------
1 Sammy Bean 27 111
SELECT * FROM EMPLOYEE FOR SYSTEM_VERSION AS OF 'release_one'
ID NAME AGE DEPT_ID
----------- -------------------- ----------- -----------
1 Sammy Bean 27 111
2 Sarah Connor 35 111
SELECT E.ID, E.NAME, D.DEPT_ID, D.DEPT_NAME FROM EMPLOYEE FOR SYSTEM_VERSION AS OF 257905915430810024 E, DEPARTMENT FOR SYSTEM_TIME AS OF '2023-06-02-08.00.00' D WHERE E.DEPT_ID = D.DEPT_ID ORDER BY E.ID
ID NAME DEPT_ID DEPT_NAME
----------- -------------------- ----------- --------------------
1 Sammy Bean 111 dept 111
2 Sarah Connor 111 dept 111
3 Fred Rogers 222 dept 222