Iceberg table snapshots
An Iceberg table snapshot represents the state of a table at a specific point in time. It provides access to the complete set of data files that made up the table when the snapshot was taken.
A snapshot is taken whenever rows are added to, updated in, or deleted from a table. Each snapshot has a timestamp that records when it was taken and an identifier that uniquely identifies it.
The example below illustrates how snapshots work. In this example, an Iceberg table called
EMPLOYEE is created and rows are inserted into the table. The first insert statement adds two rows
and the next two statements add one row each. After the insert statements complete, the EMPLOYEE
table will have four rows and three snapshots. The first snapshot references the initial two rows,
the second references the initial two rows and the third row, and the third snapshot references all
four rows.
CREATE DATALAKE TABLE EMPLOYEE(ID INT, NAME VARCHAR(20), AGE INT) STORED BY ICEBERG LOCATION 'db2remote://default//employee'
INSERT INTO EMPLOYEE VALUES(1, 'Sam Smith', 25), (2, 'Sarah Richards', 35)
INSERT INTO EMPLOYEE VALUES(3, 'Amy Dean', 44)
INSERT INTO EMPLOYEE VALUES(4, 'John Richards', 51)
After these statements complete, the table has the following snapshots:
SNAPSHOT_ID | CREATE_TS | Table Rows |
---|---|---|
1429911045991981825 | 2024-08-01-08.26.29.659 | 2 |
1024393568197109609 | 2024-08-03-08.26.30.773 | 3 |
257905915430810024 | 2024-08-05-08.26.31.888 | 4 |
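The cumulative behavior shown in the table can be sketched with a small in-memory model. This is only an illustration of the idea and not Db2 or Iceberg code: `ToyTable`, `Snapshot`, and `insert` are hypothetical names, the model tracks rows rather than data files, and it uses sequential identifiers where real snapshot identifiers are large integers like those in the table.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int  # toy sequential ID; real snapshot IDs are large integers
    rows: tuple       # complete set of rows visible as of this snapshot


class ToyTable:
    """Hypothetical in-memory stand-in for an Iceberg table (not a real API)."""

    def __init__(self):
        self.snapshots = []
        self._ids = count(1)

    def insert(self, *rows):
        # Each insert statement produces one new snapshot that references
        # every row from the previous snapshot plus the newly added rows.
        previous = self.snapshots[-1].rows if self.snapshots else ()
        snapshot = Snapshot(next(self._ids), previous + tuple(rows))
        self.snapshots.append(snapshot)
        return snapshot


# Mirror the three INSERT statements from the example.
employee = ToyTable()
employee.insert((1, "Sam Smith", 25), (2, "Sarah Richards", 35))
employee.insert((3, "Amy Dean", 44))
employee.insert((4, "John Richards", 51))

row_counts = [len(s.rows) for s in employee.snapshots]
print(row_counts)  # [2, 3, 4]
```

Earlier snapshots are never modified in this model, so the first snapshot still references only the initial two rows after the later inserts, which matches the description above.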