Go back in time

Get started with temporal data, a timely new DB2 feature

A new feature in IBM DB2 Version 10 gives you the ability to manage multiple versions of data and track changes over time. Learn how you can manage temporal data using simple SQL statements to maintain a history of database changes, track effective business dates, and analyze the history of changes.

This article was originally published in IBM Data magazine. This content is part of the IBM Data Management magazine.

Sasirekha Rameshkumar, Architecture Consultant, Tata Consultancy Services (TCS)

Sasirekha Rameshkumar is an architecture consultant with more than 18 years of IT experience with mainframes, IT architecture, and performance management. She has published articles in Xephon, DB2, and ISSA Journal. You can read her blog at itknowledgeexchange.techtarget.com/enterprise-IT-tech-trends and e-mail her at sasirekha.r@tcs.com.



01 August 2011

Also available in Chinese

For most people, the words “time travel” bring to mind science fiction: H. G. Wells, Star Trek, and a certain famous DeLorean automobile. But when it comes to business data, the need to travel back and forth in time is absolutely essential. How else are you going to calculate how much your parts costs have gone up, or automatically give a limited-time discount to a valued customer, or calculate the effect of a retroactive rate change?

Previously, tracking data over time in DB2 for z/OS® required lots of manual intervention, but version 10 gives you new abilities to manage multiple versions of data and track changes over time. With simple declarative SQL statements, you can maintain a history of database changes, track effective business dates, and analyze the history of changes.

System time and business time

The basic structure of the temporal features is simple but powerful. DB2 tracks two different types of temporal data: system time and business time. System time tracks when changes are made to the state of a table, while business time tracks the effective dates of business conditions, such as the terms of an insurance policy or the interest rate of a loan. A table that includes temporal data has a pair of date-time columns that indicate the period (the time interval) when the row is valid. When a new row is inserted, DB2 records the system time as the begin timestamp. When a row is deleted, the delete time is recorded in the end timestamp.

For system time tables, DB2 also helps store historical data in a history table associated with the temporal table. When a row in a temporal table is updated, DB2 updates the row and DB2 will automatically move the previous version of the row to a history table.


Creating a temporal table with system time

To create a temporal table with system time, include the PERIOD SYSTEM_TIME clause (sys_start,sys_end) in the table definition. Also include three additional columns—for example, sys_start, sys_end, and trans_startid—as TIMESTAMP(12) in the table definition (see Listing 1).

Listing 1. Creating a temporal table with system time
CREATE TABLE policy_info
(policy_id CHAR(10) NOT NULL,
coverage INT NOT NULL,
sys_start TIMESTAMP(12) NOT NULL GENERATED
ALWAYS AS ROW BEGIN,
sys_end TIMESTAMP(12) NOT NULL GENERATED
ALWAYS AS ROW END,
trans_startid TIMESTAMP(12) GENERATED ALWAYS AS 
TRANSACTION START ID,
PERIOD SYSTEM_TIME(sys_start,sys_end));

Specifying GENERATED ALWAYS tells DB2 to automatically generate the temporal values during INSERT, UPDATE, and DELETE operations. Specify the type of each of the three columns by using the following options:

  • AS ROW BEGIN—DB2 generates a timestamp value that corresponds to the start time that is associated with the most recent transaction.
  • AS ROW END—DB2 assigns the maximum value of TIMESTAMP whenever a row is inserted or whenever any column in the row is updated (that is, a new version of the row is created). This column captures the timestamp when the row is deleted from the system-period temporal table.
  • AS TRANSACTION START ID—DB2 assigns a unique timestamp value on a per-transaction basis or the null value. DB2 generates the start time column when it executes the first statement that changes the table’s data. A transaction-start-ID column is required for a system-period temporal table. If defined as NOT NULL, the value corresponds to the start time associated with the most recent transaction (the value of the AS ROW BEGIN column). If nullable, it will contain the NULL value.

Associating the history table

Although DB2 will automatically store historical versions of rows, you must create a history table to contain the rows and associate it with the system-period temporal table. The history table must have the same number and order of columns as the system-maintained temporal table. The history table must not include a period and must have the same encoding scheme and Coded Character Set Identifier (CCSID) as the temporal table (see Listing 2).

Listing 2. Creating a history table
CREATE TABLE hist_policy_info
(policy_id CHAR(10) NOT NULL,
coverage INT NOT NULL,
sys_start TIMESTAMP(12) NOT NULL,
sys_end TIMESTAMP(12) NOT NULL,
trans_startid TIMESTAMP(12));

You can enable versioning of the temporal table by using the ALTER TABLE command with the ADD VERSIONING clause and specifying the history table name in the USE HISTORY TABLE clause (see Listing 3).

Listing 3. Enabling versioning with ALTER TABLE
ALTER TABLE policy_info
ADD VERSIONING USE HISTORY TABLE
hist_policy_info;

The ADD VERSIONING clause cannot be specified with other clauses on the ALTER TABLE statement.

To remove the relationship between the system-period temporal table and history table and archive a disassociated history table, someone with appropriate authority must copy rows to the archiving table and then delete the old rows from the history table. You can use ALTER TABLE with the DROP VERSIONING clause for the same purpose, but IBM suggests sparse use of DROP VERSIONING because it can cause invalidation, which should be avoided if possible.


Data versioning of system-period temporal tables

When inserting a new row to the system-period temporal table, DB2 automatically generates the appropriate TIMESTAMP values for its SYSTEM_TIME columns and the transaction start time. In the example, the sys_start and trans_startid have the current system time and the sys_end value is set to 9999-12-30-00.00.00.000000000000 (see Listing 4).

Listing 4. Using the three supported periodic specifications
SELECT coverage FROM policy
  FOR SYSTEM_TIME AS OF TIMESTAMP('2010-12-01')
     WHERE id = 1111;
                
SELECT count(*)FROM policy
   FOR SYSTEM_TIME FROM TIMESTAMP('2011-11-30')
   TO TIMESTAMP('9999-12-31')
       WHERE id = 1111;
                
SELECT * FROM policy
    FOR SYSTEM_TIME BETWEEN TIMESTAMP('2011-11-30')
    AND TIMESTAMP('9999-12-31')
        WHERE id = 1111;

When the row is updated, DB2 automatically maintains the old version of the row in the history table and updates the system time values in both tables. DB2 sets the sys_end value of the row in the history table and the sys_start value in the temporal table to the start time that is associated with the most recent transaction. If the same row is updated again within the same commit scope, then no record is created in the history table. Therefore, DB2 keeps only the original row. If the row is deleted, then DB2 automatically removes the data from the current table and maintains the old version(s) in the history table. DB2 sets the end time of the deleted row in the history table to the start time that is associated with the most recent transaction of the DELETE statement.


Querying system-managed temporal tables

SELECT statements without any period specifications apply only to data in the system-period temporal table, providing a performance benefit for applications requiring only current data. DB2 accesses the historical data automatically and transparently only if the FROM clause of the query includes the period specifications. The period specification mentions FOR SYSTEM_TIME to indicate that the system time is to be used. Functionally, the three supported periodic specifications are the following (also see Listing 4):

  • AS OF valueQuery the data as of a certain point in time. Returns the row in which the begin value <= value and the end value > value.
  • FROM value1 TO value2Query the data from value1 to value2. Returns the rows with the begin value < value2 and the end value > value1. Returns zero rows if value1 is greater than or equal to value2.
  • BETWEEN value1 AND value2Query the data between a range of start and end times. Returns the rows with the begin value <= value2 and the end value > value1. Returns zero rows if value1 is greater than value2.

Creating and using tables with business time

Business time involves tracking when certain business conditions are, were, or will be valid. Unlike system-managed temporal tables, there is no separate history table. You maintain past, present, and future effective dates and their associated data in a single table.

To create a table with business time, use the PERIOD BUSINESS_TIME clause (bus_start, bus_end), and include start (bus_start) and end (bus_end) columns. They can be date or timestamp data types. An implicit DB2 constraint enforces that bus_start values are less than bus_end values, ensuring temporal data integrity. To enforce that specified keys are unique with respect to the specific time period, specify the primary key constraint BUSINESS_TIME WITHOUT OVERLAPS as the last item with the list of other keys. For example, specifying PRIMARY KEY(id, BUSINESS_TIME WITHOUT OVERLAPS) in the policy table would ensure that two versions of the same policy that are valid at the same time cannot exist.

While inserting a row into a table with business time, the user must explicitly specify the start and the end date columns. By default, UPDATE statements behave traditionally for tables with business time periods. The period clause with FOR PORTION OF BUSINESS_TIME FROM value1 TO value2 can be used to apply the update to the specific business time period. The UPDATE with period clause implies that all rows between the time period from value1 to value2 must be updated.

You can delete data pertaining to specific business time periods by specifying the time range in the FOR PORTION OF BUSINESS_TIME clause. If the row to be deleted has data that is not fully contained within the specified time range, DB2 automatically performs the implicit inserts, ensuring the appropriate information is preserved.

SELECT statements without any period specifications can be used normally. Temporal queries involving business time are also easy to write—and you need only include FOR BUSINESS_TIME in the FROM clause. Similar to system-managed temporal tables, three types of queries— involving AS OF,FROM... TO..., and BETWEEN... AND...—are supported. DB2 automatically rewrites the queries with appropriate WHERE clause predicates depending on the data types used in the start and end columns of the temporal tables.

For example, multiple rows pertaining to the same policy with different coverage amounts at different time periods can be inserted into the policy table.

Table 1. Multiple rows pertaining to same policy in policy table
IDCoverageBus_startBus_end
11115000002010-01-012011-01-01
11117500002011-01-012012-01-01
11119000002012-01-019999-12-31

Trying to insert a row for the same policy with start and end dates as 2011-06-01 and 2012-01-01 would fail, as the BUSINESS_TIME_WITHOUT_OVERLAPS clause ensures that temporal uniqueness is maintained (earlier, achieving this involved significant programming effort).

UPDATE with a period clause implies that all rows between time period value1 to value2 must be updated and can be used to apply the update to the specific business time period. For example, if you try to update the coverage amount to 600000 for the time period 2010-06-01 and 2011-09-01, the resulting table would be as follows:

Table 2. Results in policy table after using UPDATE with a period clause
IDCoverageBus_startBus_end
11115000002010-01-012010-06-01
11116000002010-06-012011-01-01
11116000002011-01-012011-09-01
11117500002011-09-012012-01-01
11119000002012-01-019999-12-31

Temporal queries made easy

DB2 handles these versioning and temporal queries automatically and transparently with minimal impact on existing applications. IBM’s approach is expected to become the standard among databases, and the next edition of SQL standard will include temporal functionality. Now you can handle data version management and effective tracking of business events using this technique instead of application-level triggers and stored procedures. Use temporal tables and save yourself some time!

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=742908
ArticleTitle=Go back in time
publish-date=08012011