Convert a Universal Time Coordinated timestamp in DB2 Universal Database

Going from a UTC timestamp to a timestamp in another locale

This article presents the design of a Java™ implementation for a user-defined function for IBM® DB2® Universal Database™ (UDB) that takes a UTC timestamp and a locale name as input parameters, and returns the timestamp equivalent in the new locale. This is an especially useful function in a data warehousing environment where you have data coming in from diverse locales.

John Munteanu (jmuntean@ca.ibm.com), Data Architect, IBM Canada

John Munteanu is a Data Architect with the Business Intelligence/Data Warehousing group of IBM Canada's Global Services division.



29 July 2004

Important: Read the disclaimer before reading this article.

Introduction

This article presents the design of a Java implementation for a DB2 user-defined function (UDF) that takes as input two parameters: a UTC timestamp (for example, 2004-04-04-04.00.00.000000) and a locale name (such as "America/Guayaquil") and returns the timestamp equivalent in the new locale (in this case, 2004-04-03-23.00.00.000000).

The two main challenges for performing the conversion were:

  • The ability to support all the different names a certain locale could appear under and all the different locales that could come in as an input
  • The ability to figure out the daylight savings time rule for each locale

The real life experience of a data warehousing project with data coming in from many different applications and from locales scattered over few continents proved that such a function was invaluable for the transformation side of the Extract Transform Load (ETL) process that was needed to analyze and process the data.


Setting the stage

A unique moment in the time continuum can be defined by that moment's date, time, and, for even more precision, by the number of fractions of a second. Called a timestamp, its ISO representation in DB2 looks like this:

2004-07-24-16.18.28.410002

In this case, the precision is down to one microsecond.

Recording the timestamp when an event occurred in relationship to a transaction is called “timestamping" the transaction. Timestamping a record is done many times throughout the life of a transaction for recording the time a transaction was created, was last modified, was last accessed, and so on.

The usual design for centrally storing transactions that happen in multiple geographies and multiple locales is to timestamp each record with the Universal Time Coordinated (UTC) equivalent of the local time. By also recording the place where the transaction occurred, for example by storing the customer number or the business unit number where the transaction took place, you can reconstruct the local time of that transaction.

The need to reconstruct the local time for a UTC timestamp at the transaction's place of origin was presented to us when data from many sources had to be processed, analyzed and reported on. The typical questions that were asked were:

  • I know what the UTC time was but what was the customer's local time?
  • Was it the morning or afternoon?
  • Was it within our hours of operations we had for that geography, or after-hours?
  • And so on.

The challenges

An exercise seemingly both simple and readily available proved to be neither.

The real time conversion from the local time to UTC was usually done within the application. It was only for the current time (not for any another point in time such in the past or in the future) and only for the locale of the server the application was running on (not for any other locales).

One of the challenges we had was to make sense of the locale names as they were given to us:

  • Each data source for our data warehouse presented us with a different name for the same locale, for example Eastern Daylight, Eastern Standard Time, America/New_York, EST, and so on — all with the same meaning.
  • Each source for our data was dealing with another set of locales. For example, as one data source only had North American locales, others also had offices in Europe and as a result brought along the need for an extended list of locales

Another challenge was to implement the Daylight Savings Time (DST) rules for all the locales we were dealing with — rules to which we did not have easily access.

A very easy to use API for calling the conversion function was also one of the requirements.

To retrieve for example the "America/Nassau" local time of a UTC timestamp the SELECT statement for an imaginary Acme Intl. company will have had to look as simple as:

SELECT ACME.F_CONVERT_TIMEZONE(TRANSACTION_TIMESTAMP, "America/Nassau")
FROM ACME.TRANSACTION_TABLE;

The solution

Some of the data sources were populated by Java applications, and as a result the names used by each application for their locales were the ones found in the different Java JDK versions (1.1.8, 1.4, etc). Since a Java JDK also had all the DST rules already coded for a fairly comprehensive list of locales, we chose to write the conversion function in Java and run it on DB2's Java JDK.

For ease of use and convenience the Java class was wrapped in another DB2 UDF.

The solution details

The Java class has a class variable that stores a lookup dictionary with all the possible spellings and naming of a locale, for all the locales that could come in as input.

tz_map = new Hashtable();
	â€Â¦ 
	tz_map.put("Eastern Daylight", 		"EST");
	tz_map.put("Eastern Standard Time", 	"EST");
	tz_map.put("America/New_York", 		"EST");
	â€Â¦

For example all the above keys correspond to the value “EST". That is the value that would be used internally by the class method for the timestamp conversion.

Note: The usage of the long names for time zones such as "America/New_York" is encouraged. In this particular implementation we used the short names instead since UDB DB2 version 7.2 uses JDK 1.1.8 which could only take the short names.

The population of the lookup table was done manually. A great deal of effort was put into looking up the internal Java settings for each locale, and a long name was paired up with a short name that had the same DST rule and time zone.

Over 250 locales were mapped that way. New additions could be made as needed. That provides the flexibility we need as new data sources will be added to the data warehouse along with new locales.

For the actual timestamp conversion the following class method was used:

public static java.lang.String J_CONVERT_TIMEZONE(java.lang.String 
ivc_UTCtimestamp, java.lang.String ivc_timezone)

The input timestamp string is first parsed into its components, a Java calendar is instantiated from those values, and then using a formatter, a new converted timestamp is produced. The microseconds are just passed along with no conversion, since a Java Calendar does not have that precision.

The following statement can be used to register the Java class method as a UDF function:

  public static java.lang.String J_CONVERT_TIMEZONE
(java.lang.String ivc_UTCtimestamp, 
 java.lang.String ivc_timezone) 
throws Exception
    {
        // get the short name equivalent of the input
        ivc_timezone = (String)tz_map.get(ivc_timezone);
        if (ivc_timezone == null)
            ivc_timezone = "GMT"; // default to UTC if entry not found

        // replace the . with - so that we only have one token separator instead of two
        String ivc_UTCtimestamp_new = ivc_UTCtimestamp.replace('.', '-' );

        // parse, validate and convert the TS string to integers, based on the one
        // separator
        StringTokenizer st = new StringTokenizer(ivc_UTCtimestamp_new, "-");
        int year = Integer.parseInt(st.nextToken());
        int month = Integer.parseInt(st.nextToken());
        int day = Integer.parseInt(st.nextToken());
        int hour = Integer.parseInt(st.nextToken());
        int min = Integer.parseInt(st.nextToken());
        int sec = Integer.parseInt(st.nextToken());
        String micro = st.nextToken(); // just carried over from the input

        // create with the above a calendar in UTC
        Calendar calUTC = Calendar.getInstance();
        calUTC.clear();
        calUTC.setTimeZone(TimeZone.getTimeZone("GMT"));
        calUTC.set(year, month-1, day, hour, min, sec );

        // prepare the formatter for the specified timezone
        DateFormat formatter = new 
            SimpleDateFormat("yyyy'-'MM'-'dd'-'HH.mm.ss", Locale.US);
        TimeZone tz = TimeZone.getTimeZone(ivc_timezone);
        formatter.setTimeZone(tz);

        // return the new value
        return formatter.format(calUTC.getTime()) + "." + micro;
    }

The above DB2 UDF could be called as from SQL, but for convenience another DB2 UDF was created to convert the input from a DB2 timestamp to a string, and the output from a string back to a DB2 function, so that both the input and the output are DB2-compatible timestamps. The code follows:

CREATE FUNCTION ACME.F_CONVERT_TIMEZONE (
        IPTS_TIMESTAMP  TIMESTAMP,
        IPCH_TIMEZONE   VARCHAR(30))
RETURNS TIMESTAMP

BEGIN ATOMIC

   DECLARE vvch_result VARCHAR(30);

   SET vvch_result = j_convert_timezone(char(IPTS_TIMESTAMP), rtrim(IPCH_TIMEZONE));

   RETURN CASE vvch_result
      WHEN 'null' THEN NULL
      ELSE timestamp(vvch_result)
   END;

END

As a result the function could be called as in the following simple SQL statement:

SELECT ACME.F_CONVERT_TIMEZONE(TRANSACTION_TIMESTAMP, "America/Nassau")
FROM ACME.TRANSACTION_TABLE;

Future improvements and variations

The following are changes that could easily be added to enhance this solution:

  • Add new locales (by extending the lookup dictionary).
  • Add new synonyms to a locale (by extending the lookup dictionary).
  • As higher versions of JDKs are available, you could start using the long names for internal conversions (update the Java code).
  • Create a reverse function — one that will take a timestamp for a certain locale and return the equivalent UTC timestamp.

Note: UTC is the increasingly preferred synonym for Greenwich Mean Time (GMT).


Conclusion

A function that converts a timestamp from one locale to another could prove to be a staple in the transformation side commonly found in the ETL process of a data warehouse project. This article provides the code for such a Java function that runs as a UDB DB2 UDF.


Acknowledgements

I would like to thank Paul Yip, database consultant with the IBM Toronto Lab, for the much valued help and design suggestions

From IBM Canada's Global Services, I wish to acknowledge James Liu for the detailed manual mapping of the lookup dictionary and a very thorough border testing of the function, and, last but not least, Jammie Lee, for coordinating the team effort and for the positive feedback to move it forward and bring it to fruition.

Disclaimer

This article contains sample code. IBM grants you ("Licensee") a non-exclusive, royalty free, license to use this sample code. However, the sample code is provided as-is and without any warranties, whether EXPRESS OR IMPLIED, INCLUDING ANY IMPLIED WARRANTY OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE OR NON-INFRINGEMENT. IBM AND ITS LICENSORS SHALL NOT BE LIABLE FOR ANY DAMAGES SUFFERED BY LICENSEE THAT RESULT FROM YOUR USE OF THE SOFTWARE. IN NO EVENT WILL IBM OR ITS LICENSORS BE LIABLE FOR ANY LOST REVENUE, PROFIT OR DATA, OR FOR DIRECT, INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF THE USE OF OR INABILITY TO USE SOFTWARE, EVEN IF IBM HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.


Download

DescriptionNameSize
Code sampletimezone.zip  ( HTTP | FTP )5.48 KB

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, IBM i
ArticleID=14614
ArticleTitle=Convert a Universal Time Coordinated timestamp in DB2 Universal Database
publish-date=07292004