Skip to main content

A Primer on Moving Data In and Out of DB2 Tables Defined with Generated Columns

Paul C. Zikopoulos (paulz_ibm@msn.com), Senior Specialist, Database Competitive Technology, SDI Corp.
Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technologies team. He has more than nine years of experience with DB2 and has written numerous magazine articles and books about it. Paul has co-authored the following books: DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). Currently he is writing a book on the Apache Derby/IBM Derby database. You can reach him at: paulz_ibm@msn.com.

Summary:  Generated columns, including identity columns, are a great feature in DB2 for automatically generating column values. Author Paul Zikopoulos help you get a handle on your options for moving that data around.

Date:  13 Feb 2003
Level:  Introductory
Activity:  1087 views
Comments:  

Introduction

Generated table columns are a handy feature of IBM® DB2® Universal DatabaseTM. A generated column derives its values from an expression, rather than from INSERT or UPDATE operations.


Generated columns and identity columns

Consider the following data definition language (DDL) that defines a table in a database:



 
db2 create table db2admin.actor ( 
    actor_id int generated by default as identity ,  
    actor_name varchar(20) ,  
    act_yr_of_birth smallint , 
    act_yr_of_death smallint , 
    age_at_death smallint generated always as (act_yr_of_death - act_yr_of_birth)  
   ) 
   in userspace1 
 
db2 alter table db2admin.actor 
    add primary key (actor_id) 

This DDL contains a simple example of a generated column. The AGE_AT_DEATH column is automatically generated by DB2 using the expression:

 
ACT_YR_OF_DEATH - ACT_YR_OF_BIRTH = AGE_AT_DEATH 

You can create a generated column in a table by specifying the GENERATED ALWAYS AS ... clause for a column in a CREATE TABLE or an ALTER TABLE statement.

My example also includes the definition of a special type of generated column, called an identity column. An identity column contains numeric values that can be automatically generated by DB2. Identity columns can be really useful for creating primary key values. In fact, this example includes an ALTER TABLE statement that defines the ACTOR_ID column as the primary key for the ACTOR table.

Always generated by DB2: An identity column defined as GENERATED ALWAYS can be guaranteed by DB2 to be unique because its values are always generated by DB2.

I used the word can because many users misunderstand this uniqueness guarantee. This may or may not be true, depending on how you define the attributes of the identity column. For example, assume that you've defined an identity column that cycles between a minimum value of 1 and a maximum value of 10: unless this table contains very few rows (10 or fewer), the value in this column will not be unique, by definition.

The way to guarantee the uniqueness of column values is to define a unique constraint on the column. The reason my actor_id column can be guaranteed to be unique is because I altered the table and added a primary key constraint on that column. If you are wondering why I chose to use an identity column, it's because you can generate numbers without having the concurrency issues involved with setting up a table counter.

User-supplied identities: An identity column defined as GENERATED BY DEFAULT will accept values supplied by applications. If an application doesn't supply an identity value to a table with a column of this type, DB2 generates those values for you (but it cannot guarantee the uniqueness of identity values unless a unique constraint has been defined for the table - as is the case with my example).

In my actor table, I defined the actor_id column as GENERATED BY DEFAULT, which means I can, if I choose, insert my own actor IDs. For example, the following INSERT statement adds three new rows to the ACTOR table:

 
db2 insert into db2admin.actor (actor_id, actor_name, act_yr_of_birth) 
    values (150,'Bruce Willis',1955),  
           (default,'Tom Cruise',1962), 
           (default,'Tommy Lee Jones',1946)

In the example above, an identity value (in this case 150) is specified for the first row, and the DEFAULT keyword is used to specify that DB2 should generate identity values for the remaining rows. After running this INSERT statement, you can see the DB2-generated values for Tom Cruise and Tommy Lee Jones, and the user-supplied value of 150 for Bruce Willis. You can verify this by looking at the contents of the ACTOR table as shown in Figure 1.


Figure 1. Contents of the ACTOR table
Contents of the ACTOR table

If I instead had defined the actor_id column as GENERATED ALWAYS (instead of GENERATED BY DEFAULT), the INSERT statement in this example would have had to specify the DEFAULT keyword instead of the value 150 to complete successfully; otherwise, the application would receive an SQL00798 error.

Try this for yourself. Create a new table called ACTOR2, using the same DDL I gave you for the ACTOR table, with the changes in bold below:

 
db2 create table db2admin.actor2 ( 
     actor_id int generated always as identity ,  
     actor_name varchar(20) ,  
     act_yr_of_birth smallint , 
     act_yr_of_death smallint , 
     age_at_death smallint generated always as (act_yr_of_death - act_yr_of_birth)  
    ) 
    in userspace1 
 
db2 alter table db2admin.actor2 
    add primary key (actor_id)

Now run the same INSERT statements that you used for the ACTOR table on the ACTOR2 table (don't forget the change: ...insert into db2admin.actor2...). What happened? You received an SQL00798N error.

Now run the same INSERT statement with the changes in bold below:

 
db2 insert into db2admin.actor2 (actor_id, actor_name, act_yr_of_birth) 
    values (default,'Bruce Willis',1955),  
           (default,'Tom Cruise',1962), 
           (default,'Tommy Lee Jones',1946)

Success! Why? Because the ACTOR2 table was created with the option that DB2 must generate the identity values. Now if you sample the contents of the ACTOR2 table, it looks like Figure 2.


Figure 2. Contents of ACTOR2 table
Contents of ACTOR2 table

You should now have two tables:

  • ACTOR (defined with the GENERATED BY DEFAULT option)
  • ACTOR2 (defined with the GENERATED ALWAYS option).

Importing, exporting, and loading identity columns

Import and export

Because the Integrated Exchange Format (IXF) format preserves identity column properties, you can use the DB2 EXPORT utility to export data from a table that contains an identity column. The IXF file format can be used by the DB2 IMPORT utility to recreate the table.

For example, to generate an IXF file for the ACTOR table (remember, this has an identity column that is GENERATED BY DEFAULT), enter the following command:

 
db2 export to actor.ixf of ixf messages export.msg select * from db2admin.actor

Now use the following command to import the data and create a new table called ACTOR3, in a single statement, by entering the following command:

 
db2 import from actor.ixf of ixf modified by identityignore messages import.msg 
            create into db2admin.actor3

Now sample the contents of the newly created ACTOR3 table; it should look like Figure 3.


Figure 3. Contents of ACTOR3 table
Contents of ACTOR3 table

Do the contents of the ACTOR3 table surprise you? You exported the contents of the ACTOR table into an IXF file and used it to create the ACTOR3 table. However, the contents of the ACTOR3 table are the same as the ACTOR2 table.

In the example I used, the exported data is imported into a newly created table called ACTOR3, which has the same properties as the ACTOR table. However, this particular command invocation included a modifier called identityignore.

The IMPORT utility supports two file type modifiers associated with identity columns:

  • identityignore - This modifier specifies that any identity values in the input file be ignored, and that new identity values are to be generated by DB2 for each row.
  • identitymissing - This modifier specifies that the input file does not contain any values, not even null values, for the identity column in the target table. (Note: If you try to import an IXF file that has an identity column in it, but you have told the IMPORT utility that it doesn't - by using the identitymissing modifier - your results will not be what you expected.)

Two other file type modifiers -- generatedignore and generatedmissing -- have identical roles that apply to imported generated columns.

In my example, if the identityignore modifier had not been specified, and the target table definition had included a GENERATED ALWAYS identity column, all of the input data in the IXF file would have been rejected because DB2 has been told to always generate the identity values. (If you recall, the DDL I gave you for the ACTOR2 table used a GENERATED ALWAYS column.)

An identity value for such a column is generated whenever the corresponding row in the input file is missing a value (or contains the null value) for the identity column. If a non-null value is specified for the identity column, the row is rejected. On the other hand, if the identityignore modifier is not specified and the identity column is generated by default, the IMPORT utility accepts the values that are contained in the input file; if such values are missing or are null, DB2 generates new values.

Now, I'll bet you are going to read that paragraph again and say to yourself, "huh?" Table 1 summarizes these conditions for the most commonly used IXF IMPORT options: INSERT, CREATE and REPLACE. (I will let you explore conditions using the identitymissing modifier; just remember not to include the identity column when you export your data to IXF, or there would be no reason to use this modifier.)

Generating the export files: All of the examples below are done with the IXF files generated from the DDL I gave you earlier in this article. All EXPORT and IMPORT commands are run from the CLP. If you use the Control Center to generate your export files, and use the Column tab to select all of the table's columns, the Control Center will include the METHOD N parameter for each of the columns as shown in Figure 4.


Figure 4. Generated export statement with METHOD N parameter
Generated export statement with METHOD N parameter

This is not what you want, because the METHOD N parameter excludes all of the index and identity column information from the IXF file, so these samples will fail. You must generate the IXF file without this option.

The default for EXPORT is to use ...select * from ... to select the data - so just don't use the Column tab and all will work fine. Compare the results of the Show Command button when I didn't specify the columns explicitly (Figure 5) with the results when I did (Figure 4):


Figure 5. Generated export command
Generated export command

For your convenience, if you're following along, here are the export files for the actor and actor2 tables.

Table 1. Summary of import results for identity columns

IMPORT
OPTION
with IDENTITYIGNOREwithout IDENTITYIGNORE
Tables with GENERATED BY DEFAULT identity columns
(examples use the IXF file that was generated from the ACTOR table)
CREATEThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR IXF file - notice the ACTOR_ID value for Bruce Willis? fig6 The identity values in the IXF file are used. If there are nulls in the IXF file, DB2 will generate values for these rows. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR IXF file - notice the ACTOR_ID value for Bruce Willis?
fig7
INSERTThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the identity values in the IXF file already exist in the table, processing continues as normal (even if the column is defined as a PK) since DB2 generates new identity values for these rows. The example below shows an IMPORT INSERT with the ACTOR IXF file after the IMPORT CREATE to illustrate this point - notice all rows were accepted and have DB2 generated identity values? fig 8 The values in the IXF file are used. If there is a uniqueness violation, the values in the IXF file are rejected. If there are nulls in the IXF file, DB2 will generate values for these rows.The example below shows an IMPORT INSERT with the ACTOR IXF file after IMPORT CREATE to illustrate this point - notice all the rows were rejected? This is because the ACTOR_ID column is a primary key to this table. fig 9 If you altered the table such that there wasn't a primary key defined on it, the results would look like: fig 10
REPLACEAll the non-identity values in the table are replaced with the contents of the IXF file. Identity values are generated by DB2 and resume at the next sequence point where the identity values that were in the table originally left off. The example below shows an IMPORT REPLACE with the ACTOR IXF file after running IMPORT CREATE and IMPORT INSERT to illustrate this point - notice the ACTOR_ID for the three rows? fig 11 All the values in the table are replaced with the contents of the IXF file. If there are nulls in the IXF file, DB2 will generate values for these rows.The example below shows an IMPORT REPLACE with the ACTOR IXF file after IMPORT CREATE and IMPORT INSERT to illustrate this point.
fig12
Tables with GENERATED ALWAYS identity columns
(examples use the IXF file that was generated from the ACTOR2 table)
CREATEThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR2 IXF file. fig13 The identity values in the IXF file are ignored since DB2 must generate these values. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR2 IXF. fig14
INSERTThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the identity values in the IXF file already exist in the table, processing continues as normal since DB2 generates new identity values for these rows. The example below shows an IMPORT INSERT with the ACTOR2 IXF file after the IMPORT CREATE to illustrate this point - notice all the rows were accepted and have DB2 generated identity values? fig 15 The values in the IXF file are used. If there is a uniqueness violation, the values in the IXF file are rejected. The example below shows an IMPORT INSERT with the ACTOR2 IXF file after the IMPORT CREATE to illustrate this point - notice all the rows were rejected? This is because the ACTOR_ID column is a primary key to this table.
fig 16 If you altered the table such that there wasn't a PK defined on it, the results would look like: fig 17
REPLACEAll the non-identity values in the table are replaced with the contents of the IXF file. Identity values are generated by DB2, and resume at the next sequence point where the identity values that were in the table originally left off. The example below shows an IMPORT REPLACE with the ACTOR2 IXF file after running IMPORT CREATE and IMPORT INSERT to illustrate this point - notice the ACTOR_ID for the three rows? fig 18 All the values in the table are replaced with the contents of the IXF file. The example below shows an IMPORT REPLACE with the ACTOR2 IXF file after IMPORT CREATE and IMPORT INSERT to illustrate this point. fig 19

The LOAD utility

The DB2 LOAD utility supports three file type modifiers associated with identity columns. In addition to identityignore and identitymissing, the load utility accepts the identityoverride modifier.

The identityoverride modifier specifies that identity values in the input file be used when loading data into a table that has a GENERATED ALWAYS identity column. When this modifier is specified, any rows without values (or null values) for the identity column are rejected. If the Identity column is not the primary key, or a unique index has not been defined on the Identity column, it is possible to violate the uniqueness property of GENERATED ALWAYS columns when this modifier is used.

Three other file type modifiers -- generatedignore, generatedmissing, and generatedoverride -- have identical roles that apply to loaded generated columns.


For DB2 Version 7.2 users

In DB2 Version 7.2, LOAD cannot generate non-identity generated column values, so loading a table with non-identity generated columns will leave the target table in CHECK PENDING state until you run SET INTEGRITY.

If you run SET INTEGRITY FOR <tablename> IMMEDIATE CHECKED FORCE GENERATED then SET INTEGRITY generates the column values for you. If you specify generateoverride, and then you provide your own values that you know to be valid, you can take the table out of CHECK PENDING state using the following command:

 
SET INTEGRITY FOR <tablename> GENERATED COLUMN IMMEDIATE UNCHECKED  

Entering the SET INTEGRITY command with this option causes DB2 to skip the work of verifying that the values supplied are actually correct (which can be risky, of course). To take the table out of check pending state and force verification of the user-supplied values, issue the following command: SET INTEGRITY FOR <tablename> IMMEDIATE CHECKED.



Downloads

DescriptionNameSizeDownload method
actor.ixf7 KB FTP | HTTP
Download the sample code:actor2.ixf7 KB FTP | HTTP

Information about download methods


Resources

About the author

Paul C. Zikopoulos

Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technologies team. He has more than nine years of experience with DB2 and has written numerous magazine articles and books about it. Paul has co-authored the following books: DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). Currently he is writing a book on the Apache Derby/IBM Derby database. You can reach him at: paulz_ibm@msn.com.

Comments



Trademarks  |  My developerWorks terms and conditions

Help: Update or add to My dW interests

What's this?

This little timesaver lets you update your My developerWorks profile with just one click! The general subject of this content (AIX and UNIX, Information Management, Lotus, Rational, Tivoli, WebSphere, Java, Linux, Open source, SOA and Web services, Web development, or XML) will be added to the interests section of your profile, if it's not there already. You only need to be logged in to My developerWorks.

And what's the point of adding your interests to your profile? That's how you find other users with the same interests as yours, and see what they're reading and contributing to the community. Your interests also help us recommend relevant developerWorks content to you.

View your My developerWorks profile

Return from help

Help: Remove from My dW interests

What's this?

Removing this interest does not alter your profile, but rather removes this piece of content from a list of all content for which you've indicated interest. In a future enhancement to My developerWorks, you'll be able to see a record of that content.

View your My developerWorks profile

Return from help

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=13335
ArticleTitle=A Primer on Moving Data In and Out of DB2 Tables Defined with Generated Columns
publish-date=02132003
author1-email=paulz_ibm@msn.com
author1-email-cc=

My developerWorks community

Tags

Help
Use the search field to find all types of content in My developerWorks with that tag.

Use the slider bar to see more or fewer tags.

Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere).

My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Use the search field to find all types of content in My developerWorks with that tag. Popular tags shows the top tags for this particular content zone (for example, Java technology, Linux, WebSphere). My tags shows your tags for this particular content zone (for example, Java technology, Linux, WebSphere).

Special offers