A Primer on Moving Data In and Out of DB2 Tables Defined with Generated Columns

Generated columns, including identity columns, are a great feature in DB2 for automatically generating column values. Author Paul Zikopoulos help you get a handle on your options for moving that data around.

Paul C. Zikopoulos (paulz_ibm@msn.com), Senior Specialist, Database Competitive Technology, IBM

Paul C. ZikopoulosPaul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technologies team. He has more than nine years of experience with DB2 and has written numerous magazine articles and books about it. Paul has co-authored the following books: DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). Currently he is writing a book on the Apache Derby/IBM Derby database. You can reach him at: paulz_ibm@msn.com.



13 February 2003

Also available in Japanese

Introduction

Generated table columns are a handy feature of IBM® DB2® Universal DatabaseTM. A generated column derives its values from an expression, rather than from INSERT or UPDATE operations.


Generated columns and identity columns

Consider the following data definition language (DDL) that defines a table in a database:

db2 create table db2admin.actor ( 
    actor_id int generated by default as identity ,  
    actor_name varchar(20) ,  
    act_yr_of_birth smallint , 
    act_yr_of_death smallint , 
    age_at_death smallint generated always as (act_yr_of_death - act_yr_of_birth)  
   ) 
   in userspace1 
 
db2 alter table db2admin.actor 
    add primary key (actor_id)

This DDL contains a simple example of a generated column. The AGE_AT_DEATH column is automatically generated by DB2 using the expression:

ACT_YR_OF_DEATH - ACT_YR_OF_BIRTH = AGE_AT_DEATH

You can create a generated column in a table by specifying the GENERATED ALWAYS AS ... clause for a column in a CREATE TABLE or an ALTER TABLE statement.

My example also includes the definition of a special type of generated column, called an identity column. An identity column contains numeric values that can be automatically generated by DB2. Identity columns can be really useful for creating primary key values. In fact, this example includes an ALTER TABLE statement that defines the ACTOR_ID column as the primary key for the ACTOR table.

Always generated by DB2: An identity column defined as GENERATED ALWAYS can be guaranteed by DB2 to be unique because its values are always generated by DB2.

I used the word can because many users misunderstand this uniqueness guarantee. This may or may not be true, depending on how you define the attributes of the identity column. For example, assume that you've defined an identity column that cycles between a minimum value of 1 and a maximum value of 10: unless this table contains very few rows (10 or fewer), the value in this column will not be unique, by definition.

The way to guarantee the uniqueness of column values is to define a unique constraint on the column. The reason my actor_id column can be guaranteed to be unique is because I altered the table and added a primary key constraint on that column. If you are wondering why I chose to use an identity column, it's because you can generate numbers without having the concurrency issues involved with setting up a table counter.

User-supplied identities: An identity column defined as GENERATED BY DEFAULT will accept values supplied by applications. If an application doesn't supply an identity value to a table with a column of this type, DB2 generates those values for you (but it cannot guarantee the uniqueness of identity values unless a unique constraint has been defined for the table - as is the case with my example).

In my actor table, I defined the actor_id column as GENERATED BY DEFAULT, which means I can, if I choose, insert my own actor IDs. For example, the following INSERT statement adds three new rows to the ACTOR table:

db2 insert into db2admin.actor (actor_id, actor_name, act_yr_of_birth) 
    values (150,'Bruce Willis',1955),  
           (default,'Tom Cruise',1962), 
           (default,'Tommy Lee Jones',1946)

In the example above, an identity value (in this case 150) is specified for the first row, and the DEFAULT keyword is used to specify that DB2 should generate identity values for the remaining rows. After running this INSERT statement, you can see the DB2-generated values for Tom Cruise and Tommy Lee Jones, and the user-supplied value of 150 for Bruce Willis. You can verify this by looking at the contents of the ACTOR table as shown in Figure 1.

Figure 1. Contents of the ACTOR table
Contents of the ACTOR table

If I instead had defined the actor_id column as GENERATED ALWAYS (instead of GENERATED BY DEFAULT), the INSERT statement in this example would have had to specify the DEFAULT keyword instead of the value 150 to complete successfully; otherwise, the application would receive an SQL00798 error.

Try this for yourself. Create a new table called ACTOR2, using the same DDL I gave you for the ACTOR table, with the changes in bold below:

db2 create table db2admin.actor2 ( 
     actor_id int generated always as identity ,  
     actor_name varchar(20) ,  
     act_yr_of_birth smallint , 
     act_yr_of_death smallint , 
     age_at_death smallint generated always as (act_yr_of_death - act_yr_of_birth)  
    ) 
    in userspace1 
 
db2 alter table db2admin.actor2 
    add primary key (actor_id)

Now run the same INSERT statements that you used for the ACTOR table on the ACTOR2 table (don't forget the change: ...insert into db2admin.actor2...). What happened? You received an SQL00798N error.

Now run the same INSERT statement with the changes in bold below:

db2 insert into db2admin.actor2 (actor_id, actor_name, act_yr_of_birth) 
    values (default,'Bruce Willis',1955),  
           (default,'Tom Cruise',1962), 
           (default,'Tommy Lee Jones',1946)

Success! Why? Because the ACTOR2 table was created with the option that DB2 must generate the identity values. Now if you sample the contents of the ACTOR2 table, it looks like Figure 2.

Figure 2. Contents of ACTOR2 table
Contents of ACTOR2 table

You should now have two tables:

  • ACTOR (defined with the GENERATED BY DEFAULT option)
  • ACTOR2 (defined with the GENERATED ALWAYS option).

Importing, exporting, and loading identity columns

Import and export

Because the Integrated Exchange Format (IXF) format preserves identity column properties, you can use the DB2 EXPORT utility to export data from a table that contains an identity column. The IXF file format can be used by the DB2 IMPORT utility to recreate the table.

For example, to generate an IXF file for the ACTOR table (remember, this has an identity column that is GENERATED BY DEFAULT), enter the following command:

db2 export to actor.ixf of ixf messages export.msg select * from db2admin.actor

Now use the following command to import the data and create a new table called ACTOR3, in a single statement, by entering the following command:

db2 import from actor.ixf of ixf modified by identityignore messages import.msg 
            create into db2admin.actor3

Now sample the contents of the newly created ACTOR3 table; it should look like Figure 3.

Figure 3. Contents of ACTOR3 table
Contents of ACTOR3 table

Do the contents of the ACTOR3 table surprise you? You exported the contents of the ACTOR table into an IXF file and used it to create the ACTOR3 table. However, the contents of the ACTOR3 table are the same as the ACTOR2 table.

In the example I used, the exported data is imported into a newly created table called ACTOR3, which has the same properties as the ACTOR table. However, this particular command invocation included a modifier called identityignore.

The IMPORT utility supports two file type modifiers associated with identity columns:

  • identityignore - This modifier specifies that any identity values in the input file be ignored, and that new identity values are to be generated by DB2 for each row.
  • identitymissing - This modifier specifies that the input file does not contain any values, not even null values, for the identity column in the target table. (Note: If you try to import an IXF file that has an identity column in it, but you have told the IMPORT utility that it doesn't - by using the identitymissing modifier - your results will not be what you expected.)

Two other file type modifiers -- generatedignore and generatedmissing -- have identical roles that apply to imported generated columns.

In my example, if the identityignore modifier had not been specified, and the target table definition had included a GENERATED ALWAYS identity column, all of the input data in the IXF file would have been rejected because DB2 has been told to always generate the identity values. (If you recall, the DDL I gave you for the ACTOR2 table used a GENERATED ALWAYS column.)

An identity value for such a column is generated whenever the corresponding row in the input file is missing a value (or contains the null value) for the identity column. If a non-null value is specified for the identity column, the row is rejected. On the other hand, if the identityignore modifier is not specified and the identity column is generated by default, the IMPORT utility accepts the values that are contained in the input file; if such values are missing or are null, DB2 generates new values.

Now, I'll bet you are going to read that paragraph again and say to yourself, "huh?" Table 1 summarizes these conditions for the most commonly used IXF IMPORT options: INSERT, CREATE and REPLACE. (I will let you explore conditions using the identitymissing modifier; just remember not to include the identity column when you export your data to IXF, or there would be no reason to use this modifier.)

Generating the export files: All of the examples below are done with the IXF files generated from the DDL I gave you earlier in this article. All EXPORT and IMPORT commands are run from the CLP. If you use the Control Center to generate your export files, and use the Column tab to select all of the table's columns, the Control Center will include the METHOD N parameter for each of the columns as shown in Figure 4.

Figure 4. Generated export statement with METHOD N parameter
Generated export statement with METHOD N parameter

This is not what you want, because the METHOD N parameter excludes all of the index and identity column information from the IXF file, so these samples will fail. You must generate the IXF file without this option.

The default for EXPORT is to use ...select * from ... to select the data - so just don't use the Column tab and all will work fine. Compare the results of the Show Command button when I didn't specify the columns explicitly (Figure 5) with the results when I did (Figure 4):

Figure 5. Generated export command
Generated export command

For your convenience, if you're following along, here are the export files for the actor and actor2 tables.

Table 1. Summary of import results for identity columns

IMPORT
OPTION
with IDENTITYIGNOREwithout IDENTITYIGNORE
Tables with GENERATED BY DEFAULT identity columns
(examples use the IXF file that was generated from the ACTOR table)
CREATEThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR IXF file - notice the ACTOR_ID value for Bruce Willis? fig6The identity values in the IXF file are used. If there are nulls in the IXF file, DB2 will generate values for these rows. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR IXF file - notice the ACTOR_ID value for Bruce Willis?
fig7
INSERTThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the identity values in the IXF file already exist in the table, processing continues as normal (even if the column is defined as a PK) since DB2 generates new identity values for these rows. The example below shows an IMPORT INSERT with the ACTOR IXF file after the IMPORT CREATE to illustrate this point - notice all rows were accepted and have DB2 generated identity values? fig 8The values in the IXF file are used. If there is a uniqueness violation, the values in the IXF file are rejected. If there are nulls in the IXF file, DB2 will generate values for these rows.The example below shows an IMPORT INSERT with the ACTOR IXF file after IMPORT CREATE to illustrate this point - notice all the rows were rejected? This is because the ACTOR_ID column is a primary key to this table. fig 9 If you altered the table such that there wasn't a primary key defined on it, the results would look like: fig 10
REPLACEAll the non-identity values in the table are replaced with the contents of the IXF file. Identity values are generated by DB2 and resume at the next sequence point where the identity values that were in the table originally left off. The example below shows an IMPORT REPLACE with the ACTOR IXF file after running IMPORT CREATE and IMPORT INSERT to illustrate this point - notice the ACTOR_ID for the three rows? fig 11All the values in the table are replaced with the contents of the IXF file. If there are nulls in the IXF file, DB2 will generate values for these rows.The example below shows an IMPORT REPLACE with the ACTOR IXF file after IMPORT CREATE and IMPORT INSERT to illustrate this point.
fig12
Tables with GENERATED ALWAYS identity columns
(examples use the IXF file that was generated from the ACTOR2 table)
CREATEThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR2 IXF file. fig13The identity values in the IXF file are ignored since DB2 must generate these values. If the table already exists, the entire process is stopped. The example below shows an IMPORT CREATE with the ACTOR2 IXF. fig14
INSERTThe identity values in the IXF file are ignored and new identity values are generated by DB2. If the identity values in the IXF file already exist in the table, processing continues as normal since DB2 generates new identity values for these rows. The example below shows an IMPORT INSERT with the ACTOR2 IXF file after the IMPORT CREATE to illustrate this point - notice all the rows were accepted and have DB2 generated identity values? fig 15The values in the IXF file are used. If there is a uniqueness violation, the values in the IXF file are rejected. The example below shows an IMPORT INSERT with the ACTOR2 IXF file after the IMPORT CREATE to illustrate this point - notice all the rows were rejected? This is because the ACTOR_ID column is a primary key to this table.
fig 16 If you altered the table such that there wasn't a PK defined on it, the results would look like: fig 17
REPLACEAll the non-identity values in the table are replaced with the contents of the IXF file. Identity values are generated by DB2, and resume at the next sequence point where the identity values that were in the table originally left off. The example below shows an IMPORT REPLACE with the ACTOR2 IXF file after running IMPORT CREATE and IMPORT INSERT to illustrate this point - notice the ACTOR_ID for the three rows? fig 18All the values in the table are replaced with the contents of the IXF file. The example below shows an IMPORT REPLACE with the ACTOR2 IXF file after IMPORT CREATE and IMPORT INSERT to illustrate this point. fig 19

The LOAD utility

The DB2 LOAD utility supports three file type modifiers associated with identity columns. In addition to identityignore and identitymissing, the load utility accepts the identityoverride modifier.

The identityoverride modifier specifies that identity values in the input file be used when loading data into a table that has a GENERATED ALWAYS identity column. When this modifier is specified, any rows without values (or null values) for the identity column are rejected. If the Identity column is not the primary key, or a unique index has not been defined on the Identity column, it is possible to violate the uniqueness property of GENERATED ALWAYS columns when this modifier is used.

Three other file type modifiers -- generatedignore, generatedmissing, and generatedoverride -- have identical roles that apply to loaded generated columns.


For DB2 Version 7.2 users

In DB2 Version 7.2, LOAD cannot generate non-identity generated column values, so loading a table with non-identity generated columns will leave the target table in CHECK PENDING state until you run SET INTEGRITY.

If you run SET INTEGRITY FOR <tablename> IMMEDIATE CHECKED FORCE GENERATED then SET INTEGRITY generates the column values for you. If you specify generateoverride, and then you provide your own values that you know to be valid, you can take the table out of CHECK PENDING state using the following command:

SET INTEGRITY FOR <tablename> GENERATED COLUMN IMMEDIATE UNCHECKED

Entering the SET INTEGRITY command with this option causes DB2 to skip the work of verifying that the values supplied are actually correct (which can be risky, of course). To take the table out of check pending state and force verification of the user-supplied values, issue the following command: SET INTEGRITY FOR <tablename> IMMEDIATE CHECKED.


Downloads

DescriptionNameSize
Code sampleactor.ixf  ( HTTP | FTP )7 KB
Download the sample code:actor2.ixf  ( HTTP | FTP )7 KB

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=13335
ArticleTitle=A Primer on Moving Data In and Out of DB2 Tables Defined with Generated Columns
publish-date=02132003