Get started with IBM DB2 Test Database Generator

Create realistic test data for your database application development projects

Need sample data to test the application you're developing? Learn how you can use IBM® DB2® Test Database Generator to generate a variety of types of test data, including SQL, comma separated value (CSV), and XML formats. This article shows how to install, configure, and use this tool.

Share:

Sami Salkosuo (sami.salkosuo@fi.ibm.com), Software Architect, IBM

author photoSami Salkosuo has worked in IBM Finland since 1999. His main area of interest is Java programming. He is a Sun Certified Java Programmer, IBM Certified Solution Developer for XML and Related Technologies, and IBM Certified Solution Developer for IBM WebSphere Portal. In addition to Java, he also has experience with Python, Fortran, LabVIEW, Visual Basic, LISP, Perl and PHP. You can reach Sami at sami.salkosuo@fi.ibm.com.


developerWorks Contributing author
        level

07 June 2007

Introduction

Almost all developers have written database applications where test data was needed in order to be able to check the validity of their applications. A database that is defined, but does not yet contain data, is not really that useful for making sure your application will work as designed. In some cases, you may have access to test data or an event snapshot of production data. However, many times you may have a limited set of test data, or perhaps you don't have access to data at all. Writing test data generators or scripts is time consuming and can be very much out of the scope of the task at hand.

Luckily, IBM DB2 Test Database Generator makes test data generation relatively easy and painless. IBM DB2 Test Database Generator is used to generate test data from scratch or from existing data. Test data can be generated in a variety of formats, including SQL, CSV, or XML.

DB2 Test Database Generator is an add-on tool for DB2 that helps developers and administrators quickly create test data from scratch or from existing data. It's main features are:

  • Test data generation
  • Rule-based transformations
  • XML configuration
  • Referential integrity support
  • Batch creation of test data

Generate test data

You can generate test data from scratch or from existing data. DB2 Test Database Generator can have sources and corresponding targets that specify where original data is located and where to store generated data.

Sources can be database tables in existing database or files, such as XML or CSV, where data is located. Using such data sources, administrators or developers can copy existing data or sample existing data for testing and development purposes.

Targets are destinations for the generated test data. Targets can be: SQL, XML, delimited text, fixed-width text, and DB2 (direct to a DB2 database). If you are using z/OS, DB2 Load format is also available. Each target corresponds to a source, except when you are creating test data from scratch. See the section Create test data for an example of creating test data from scratch.

Figure 1 shows the process of creating test data.

Figure 1. Creating test data
Creating test data

When you are generating test data from scratch, there are no sources, and test data is generated using generation rules.

Rule-based transformations

Rules are methods that specify how test data is created. When using the tool without a data source, rules are used to generate test data from scratch. Using rules together with a data source, you can specify which target (column) gets the data (from source column).

There are several different types of rules that can be used when creating test data:

  • Static rule: Constant value for all columns.
  • Sourcecol rule: Defines a column in the source data that is used for the target value.
  • Lookup rule: Retrieve values for the source, based on a column search on the current value.
  • Mask rule: Modifies values by replacing positions within a value with a pattern value or a static value.
  • Expression rule: Defines a formula to be used to dynamically calculate the value of the column.
  • Random rule: Generates random data of a specified type within certain bounds.
  • Pattern rule: Specifies a pattern to be used to generate a value. Patterns can be used for character generation or string selector.
  • User-defined rule: Define your own rules by extending abstract Java class com.ibm.db2.gri.server.userFunction.UserFuction.

Rules have also one of following actions: Replace, Preface, or Append. When the rule is evaluated, the result replaces, appends, or prefaces the existing value.

You can see how some of the rules are used in the section Create test data.

XML configuration

Configuration is located in the XML file that is generated by DB2 Test Database Generator tools. Because of the XML, test data creation can be automated by defined XML files using other tools as well.

Referential integrity

The referential integrity support enables you to extract a referentially intact slice of data from an existing DB2 database. This feature requires the DB2 Grouper component. DB2 Grouper is included in the DB2 Test Database Generator product, and enables the tool to manage and recognize referential integrity.

Batch creation of test data

Using batch creation and XML configuration, test data can be created automatically with the automation tools and generation can be executed without user interaction.

Installation

This section covers the installation of DB2 Test Database Generator on the Windows platform. The DB2 version in this example is is Version 9, and the SAMPLE database that comes with every DB2 is used later as an example.

Installation of DB2 is out of the scope of this article and it is assumed that DB2 version 9.1 is installed and the SAMPLE database created before you install DB2 Test Database Generator.

Setup

DB2 Test Database Generator has server and client components. The server component is located typically in the target database, and setup actually creates a few tables for the test generator. Figure 2 shows a typical configuration

Figure 2. Typical configuration
Typical configuration

For the example In this article, a DB2 Test Database Generator has been set up on one Windows machine, so client and server are located on the same machine.

Install the server

The installation process for the DB2 Test Database Generator Server has two parts, the server component and the administrator component. Both components are automatically installed by the installer software. The administrator component is installed with the installation wizard, and then the administrator component is used to install server component to the DB2 database where you want to generate test data.

Environment variable

You can also set the variable using the command line.

For Windows use:
set DB2TEMPDIR=%DB2PATH%

For Linux use:
export DB2TEMPDIR=$DB2PATH

The DBPATH variable points to the directory where DB2 is installed (for example, d:\IBM\SQLLIB).

Important! Before starting the installation process, you must add a new environment variable DB2TEMPDIR as a new system variable. The easiest way to add the variable is to add it in Windows system properties as the next figure shows.

Figure 3. Add environment variable
Add environment variable

After you have set the environment variable, start installation executing command TDBG_2.2_FP1_srv_win32.exe and follow the installation instructions in DB2 Test Database Generator User's Guide (see the Resources section).

Important! Step 8 in the User's Guide must be done, otherwise DB2 Test Database Generator Client cannot be used to connect to the database and to generate test data. Step 8 says that the GRI_LOCATIONS table must be populated (and the PUBLISH column must be set to 1) with databases that you want to connect with DB2 Test Database Generator Client. This table has been populated by the installation program, and you need to manually set the PUBLISH-column value. This can be done using the Control Center or the DB2 command line processor with the following command:

Set GRI_LOCATIONS PUBLISH value
D:\IBM\SQLLIB\BIN>db2 update systools.gri_locations set publish=1

Install the client

The DB2 Test Database Generator Client is used to connect to DB2 Test Database Generator Server in order to actually create the test data. Client can be used to connect to many databases, local and remote, where the server component has been installed. Start installation with the command TDBG_2.2_FP1_cli_win32.exe and follow the instructions.


Create test data

After installing the server and client, DB2 Test Database Generator is ready for generating test data. In the following example, test data is created from scratch using some transformation rules described earlier. The database table for the example is the STAFF table in the DB2 SAMPLE database.

Creating test data is done using the DB2 Test Database Generator Client, and has the following steps:

  1. Connect to a DB2 Test Database Generator server.
  2. Create a DB2 Test Database Generator source (not required when generating data from scratch).
  3. Create a DB2 Test Database Generator target.
  4. Generate test data.
  5. Download test data.

Connect to a DB2 Test Database Generator server

After starting the DB2 Test Database Generator Client, it opens a connect page. Use the connect page in the Client user interface to connect to the database where you have installed the DB2 Test Database Generator Server component.

Figure 4. Connect to SAMPLE database
Connect to SAMPLE database

If you have multiple databases in the Locations list box (as shown in Figure 4), remember to choose the one where you installed the DB2 Test Database Generator Server component. In this example, the DB2 Type 2 JDBC driver has been selected, but the Type 4 JDBC driver can also be used. Click Connect and the Client connects to the server.

Create a DB2 Test Database Generator target

Step 2 (create DB2 Test Database Generator source) was skipped because the data is being created from scratch.

Select the Targets page in the Client user interface, and click Add. This opens the Target Properties dialog, as shown in Figure 5.

Figure 5. Add new target
Add new target

In the Target Properties dialog, enter STAFF (table name) as the name of the target. Rows property is the number of rows to be generated, in this sample five is enough.

Since you want to use an existing table for test data, click From table in order to open the Table Selection dialog. In the Table Selection dialog, accept default (filter) values, and click Refresh. You could also add each column manually.

Figure 6. Table Selection
Table Selection

Select the STAFF table, and click OK. The Target Properties dialog is populated with the table columns and types.

Figure 7. Populated Target Properties
Populated Target Properties

Generation rules

In this sample for STAFF-table test data, two generation rules have been used: random and pattern.

Random
Used to generate random data for data types: integer, decimal, date, time, and timestamp. The rule also has a lower and upper limit for random data.

Pattern
Used to generate characters and strings. This rule has a formula as the value and it is used to generated random characters from a specified list of characters. The rule is also used as a string selector where one string in a collection is randomly chosen.

Examples of both rule types are found in Table 1. DB2 Test Database Generator User's Guide includes a description and sample of other rule types.

Next, add rules for generating the test data. At least one rule must be specified for each column. Select the first column (ID), and select the Rules tab. Under the Rules tab, select Random as the rule type, Integer as the type of value, and specify a min and max values for the ID column (for example 0 and 100). Then, click Add (see Figure 8).

Figure 8. Add a rule for a column
Add a rule for a column

Then specify rules for other columns as described in Table 1.

Table 1. Generation rules for STAFF table columns
ColumnRule typeRule properties
IDrandomType of value: Integer
Min: 0
Max: 100
Operation: Replace
NAMEpatternValue: [SAMI]*4
Operation: Replace
DEPTpatternValue: (A|B|C|D|E|F)
Operation: Replace
randomType of value: Integer
Min: 201
Max: 399
Operation: Append
JOBpatternValue: (Dev|Sales|Mgmt|R&D|Svcs)
Operation: Replace
YEARSrandomType of value: Integer
Min: 1
Max: 5
Operation: Replace
SALARYrandomType of value: Decimal
Min: 5000.00
Max: 12000.00
Operation: Replace
COMMrandomType of value: Decimal
Min: 1000.00
Max: 2000.00
Operation: Replace

Built-in generation rules are quite powerful and very often more than enough for test data purposes. If built-in generation rules are not enough, DB2 Test Database Generator also offers user definer rules, where users can use Java to develop new generation rule.

Select options for different targets (such as SQL and CSV). Generate using the Generator page. See the Output page for output files, or go to the directory where the files are (only when the client and server are on the same machine).

Generate test data

Before generating test data, save data by going to File > Save as.... Then open the Generator page, and click Start. This generates test data for the target specified in the previous section. After a short while, test data is generated and the log is displayed on the Generator page.

Figure 9. Test data generated
Test data generated

The test data format is specified on Target Options dialog located on the Targets page. The default format is CSV but data can be generated as SQL (used in this sample), fixed text, XML, or directly to DB2.

Download test data

Generated test data is located in installRoot/user/data profile name/ directory, where installRoot is the path where the DB2 Test Database Generator Server is installed, the user is the user ID used to connect to the server, and data profile name is the profile name you specified when saving the test profile.

If you have the server and client on separate machines, you need to download data to the local machine from the Output page.

Figure 10. Test data output files
Test data output files

Generated test data includes the XML configuration of the data profile (with the extension .griml), log files, and the generated test data in XML, CSV, or SQL formats (see Listings 2-4). Data definition language (DDL) of the target table is also included in the directory.

The generated DDL for the STAFF table is in Listing 1.

Listing 1. STAFF table DDL
CREATE TABLE ADMINISTRATOR.STAFF
(
    ID SMALLINT NOT NULL,
    NAME VARCHAR(9),
    DEPT SMALLINT,
    JOB CHAR(5),
    "YEARS" SMALLINT,
    SALARY DECIMAL(7, 2),
    COMM DECIMAL(7, 2)
);

The test data as SQL is shown in Listing 2.

Listing 2. Test data as SQL inserts
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) 
                          VALUES (9, 'MMSM', D249, 'Sales', 3, 5886.28, 1039.25);
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) 
                          VALUES (8, 'SIII', E280, 'Svcs', 4, 9155.32, 1416.99);
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) 
                          VALUES (41, 'IAAI', D287, 'Sales', 3, 10856.39, 1436.19);
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) 
                          VALUES (53, 'IMII', F382, 'Sales', 3, 10654.44, 1030.21);
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) 
                          VALUES (11, 'SAIM', D316, 'Dev', 3, 7060.19, 1359.52);

For the second test data generation, if you change the output to XML (in the Target options), the result is shown in Listing 3.

Listing 3. More test data in XML format
<?xml version="1.0" encoding="UTF-8" ?>
<RESULTS>
    <ROW num="1">
        <COL name="ID">66</COL>
        <COL name="NAME">SAMM</COL>
        <COL name="DEPT">F377</COL>
        <COL name="JOB">Dev</COL>
        <COL name="YEARS">3</COL>
        <COL name="SALARY">10264.73</COL>
        <COL name="COMM">1245.07</COL>
    </ROW>
    <ROW num="2">
        <COL name="ID">38</COL>
        <COL name="NAME">IAAS</COL>
        <COL name="DEPT">F284</COL>
        <COL name="JOB">Svcs</COL>
        <COL name="YEARS">4</COL>
        <COL name="SALARY">7286.96</COL>
        <COL name="COMM">1418.62</COL>
    </ROW>
    <ROW num="3">
        <COL name="ID">10</COL>
        <COL name="NAME">IAMM</COL>
        <COL name="DEPT">E317</COL>
        <COL name="JOB">Sales</COL>
        <COL name="YEARS">2</COL>
        <COL name="SALARY">8303.52</COL>
        <COL name="COMM">1956.09</COL>
    </ROW>
    <ROW num="4">
        <COL name="ID">75</COL>
        <COL name="NAME">MIIA</COL>
        <COL name="DEPT">B395</COL>
        <COL name="JOB">Sales</COL>
        <COL name="YEARS">3</COL>
        <COL name="SALARY">5062.28</COL>
        <COL name="COMM">1551.02</COL>
    </ROW>
    <ROW num="5">
        <COL name="ID">28</COL>
        <COL name="NAME">SSII</COL>
        <COL name="DEPT">D375</COL>
        <COL name="JOB">Dev</COL>
        <COL name="YEARS">2</COL>
        <COL name="SALARY">11826.06</COL>
        <COL name="COMM">1939.58</COL>
    </ROW>
</RESULTS>

For the third test data generation, the output is CSV (delimited text in Target options), as shown in Listing 4.

Listing 4. Even more test data in CSV format
22,MMMI,C351,Sales,5,8086.68,1778.82
13,SASM,D231,Mgmt,3,9425.59,1422.77
83,SISA,A392,Sales,2,5633.05,1338.64
87,AMMS,A245,R&D,3,6552.82,1573.17
14,AIIM,E211,Ramp;D,2,9977.95,1072.96

Two other possibilities for output, which are not covered here, are fixed width text and direct input to another DB2 table.


Summary

Mastering the DB2 Test Database Generator is not effortless, but it is well worth it. In some testing situations, you may only have SQL scripts to create the database but no data, and in other situations you may have a very limited set of test data. This tool enables you to generate a realistic set of data for valid testing.

IBM DB2 Test Database Generator is a tool for creating test data directly to the DB2 database, or the test data can be in CSV, XML, or SQL format. Therefore, allowing generated data to be used for testing in any database system.

Resources

Learn

Get products and technologies

Discuss

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=228643
ArticleTitle=Get started with IBM DB2 Test Database Generator
publish-date=06072007