Get started with IBM DB2 Test Database Generator
Create realistic test data for your database application development projects
Almost all developers have written database applications where test data was needed in order to be able to check the validity of their applications. A database that is defined, but does not yet contain data, is not really that useful for making sure your application will work as designed. In some cases, you may have access to test data or an event snapshot of production data. However, many times you may have a limited set of test data, or perhaps you don't have access to data at all. Writing test data generators or scripts is time consuming and can be very much out of the scope of the task at hand.
Luckily, IBM DB2 Test Database Generator makes test data generation relatively easy and painless. IBM DB2 Test Database Generator is used to generate test data from scratch or from existing data. Test data can be generated in a variety of formats, including SQL, CSV, or XML.
DB2 Test Database Generator is an add-on tool for DB2 that helps developers and administrators quickly create test data from scratch or from existing data. It's main features are:
- Test data generation
- Rule-based transformations
- XML configuration
- Referential integrity support
- Batch creation of test data
Generate test data
You can generate test data from scratch or from existing data. DB2 Test Database Generator can have sources and corresponding targets that specify where original data is located and where to store generated data.
Sources can be database tables in existing database or files, such as XML or CSV, where data is located. Using such data sources, administrators or developers can copy existing data or sample existing data for testing and development purposes.
Targets are destinations for the generated test data. Targets can be: SQL, XML, delimited text, fixed-width text, and DB2 (direct to a DB2 database). If you are using z/OS, DB2 Load format is also available. Each target corresponds to a source, except when you are creating test data from scratch. See the section Create test data for an example of creating test data from scratch.
Figure 1 shows the process of creating test data.
Figure 1. Figure 1. Creating test data
When you are generating test data from scratch, there are no sources, and test data is generated using generation rules.
Rules are methods that specify how test data is created. When using the tool without a data source, rules are used to generate test data from scratch. Using rules together with a data source, you can specify which target (column) gets the data (from source column).
There are several different types of rules that can be used when creating test data:
- Static rule: Constant value for all columns.
- Sourcecol rule: Defines a column in the source data that is used for the target value.
- Lookup rule: Retrieve values for the source, based on a column search on the current value.
- Mask rule: Modifies values by replacing positions within a value with a pattern value or a static value.
- Expression rule: Defines a formula to be used to dynamically calculate the value of the column.
- Random rule: Generates random data of a specified type within certain bounds.
- Pattern rule: Specifies a pattern to be used to generate a value. Patterns can be used for character generation or string selector.
- User-defined rule: Define your own rules by extending abstract Java class com.ibm.db2.gri.server.userFunction.UserFuction.
Rules have also one of following actions: Replace, Preface, or Append. When the rule is evaluated, the result replaces, appends, or prefaces the existing value.
You can see how some of the rules are used in the section Create test data.
Configuration is located in the XML file that is generated by DB2 Test Database Generator tools. Because of the XML, test data creation can be automated by defined XML files using other tools as well.
The referential integrity support enables you to extract a referentially intact slice of data from an existing DB2 database. This feature requires the DB2 Grouper component. DB2 Grouper is included in the DB2 Test Database Generator product, and enables the tool to manage and recognize referential integrity.
Batch creation of test data
Using batch creation and XML configuration, test data can be created automatically with the automation tools and generation can be executed without user interaction.
This section covers the installation of DB2 Test Database Generator on the Windows platform. The DB2 version in this example is is Version 9, and the SAMPLE database that comes with every DB2 is used later as an example.
Installation of DB2 is out of the scope of this article and it is assumed that DB2 version 9.1 is installed and the SAMPLE database created before you install DB2 Test Database Generator.
DB2 Test Database Generator has server and client components. The server component is located typically in the target database, and setup actually creates a few tables for the test generator. Figure 2 shows a typical configuration
Figure 2. Figure 2. Typical configuration
For the example In this article, a DB2 Test Database Generator has been set up on one Windows machine, so client and server are located on the same machine.
Install the server
The installation process for the DB2 Test Database Generator Server has two parts, the server component and the administrator component. Both components are automatically installed by the installer software. The administrator component is installed with the installation wizard, and then the administrator component is used to install server component to the DB2 database where you want to generate test data.
Important! Before starting the installation process, you must add a new environment variable DB2TEMPDIR as a new system variable. The easiest way to add the variable is to add it in Windows system properties as the next figure shows.
Figure 3. Figure 3. Add environment variable
After you have set the environment variable, start installation executing
TDBG_2.2_FP1_srv_win32.exe and follow the
installation instructions in DB2 Test Database Generator User's Guide (see
the Related topics section).
Important! Step 8 in the User's Guide must be done, otherwise DB2 Test Database Generator Client cannot be used to connect to the database and to generate test data. Step 8 says that the GRI_LOCATIONS table must be populated (and the PUBLISH column must be set to 1) with databases that you want to connect with DB2 Test Database Generator Client. This table has been populated by the installation program, and you need to manually set the PUBLISH-column value. This can be done using the Control Center or the DB2 command line processor with the following command:
D:\IBM\SQLLIB\BIN>db2 update systools.gri_locations set publish=1
Install the client
The DB2 Test Database Generator Client is used to connect to DB2 Test
Database Generator Server in order to actually create the test data.
Client can be used to connect to many databases, local and remote, where
the server component has been installed. Start installation with the
TDBG_2.2_FP1_cli_win32.exe and follow the
Create test data
After installing the server and client, DB2 Test Database Generator is ready for generating test data. In the following example, test data is created from scratch using some transformation rules described earlier. The database table for the example is the STAFF table in the DB2 SAMPLE database.
Creating test data is done using the DB2 Test Database Generator Client, and has the following steps:
- Connect to a DB2 Test Database Generator server.
- Create a DB2 Test Database Generator source (not required when generating data from scratch).
- Create a DB2 Test Database Generator target.
- Generate test data.
- Download test data.
Connect to a DB2 Test Database Generator server
After starting the DB2 Test Database Generator Client, it opens a connect page. Use the connect page in the Client user interface to connect to the database where you have installed the DB2 Test Database Generator Server component.
Figure 4. Figure 4. Connect to SAMPLE database
If you have multiple databases in the Locations list box (as shown in Figure 4), remember to choose the one where you installed the DB2 Test Database Generator Server component. In this example, the DB2 Type 2 JDBC driver has been selected, but the Type 4 JDBC driver can also be used. Click Connect and the Client connects to the server.
Create a DB2 Test Database Generator target
Step 2 (create DB2 Test Database Generator source) was skipped because the data is being created from scratch.
Select the Targets page in the Client user interface, and click Add. This opens the Target Properties dialog, as shown in Figure 5.
Figure 5. Figure 5. Add new target
In the Target Properties dialog, enter STAFF (table name) as the name of the target. Rows property is the number of rows to be generated, in this sample five is enough.
Since you want to use an existing table for test data, click From table in order to open the Table Selection dialog. In the Table Selection dialog, accept default (filter) values, and click Refresh. You could also add each column manually.
Figure 6. Figure 6. Table Selection
Select the STAFF table, and click OK. The Target Properties dialog is populated with the table columns and types.
Figure 7. Figure 7. Populated Target Properties
Next, add rules for generating the test data. At least one rule must be specified for each column. Select the first column (ID), and select the Rules tab. Under the Rules tab, select Random as the rule type, Integer as the type of value, and specify a min and max values for the ID column (for example 0 and 100). Then, click Add (see Figure 8).
Figure 8. Figure 8. Add a rule for a column
Then specify rules for other columns as described in Table 1.
Table 1. Table 1. Generation rules for STAFF table columns
|Column||Rule type||Rule properties|
|ID||random||Type of value: Integer|
|random||Type of value: Integer|
|YEARS||random||Type of value: Integer|
|SALARY||random||Type of value: Decimal|
|COMM||random||Type of value: Decimal|
Built-in generation rules are quite powerful and very often more than enough for test data purposes. If built-in generation rules are not enough, DB2 Test Database Generator also offers user definer rules, where users can use Java to develop new generation rule.
Select options for different targets (such as SQL and CSV). Generate using the Generator page. See the Output page for output files, or go to the directory where the files are (only when the client and server are on the same machine).
Generate test data
Before generating test data, save data by going to File > Save as.... Then open the Generator page, and click Start. This generates test data for the target specified in the previous section. After a short while, test data is generated and the log is displayed on the Generator page.
Figure 9. Figure 9. Test data generated
The test data format is specified on Target Options dialog located on the Targets page. The default format is CSV but data can be generated as SQL (used in this sample), fixed text, XML, or directly to DB2.
Download test data
Generated test data is located in installRoot/user/data profile name/ directory, where installRoot is the path where the DB2 Test Database Generator Server is installed, the user is the user ID used to connect to the server, and data profile name is the profile name you specified when saving the test profile.
If you have the server and client on separate machines, you need to download data to the local machine from the Output page.
Figure 10. Figure 10. Test data output files
Generated test data includes the XML configuration of the data profile (with the extension .griml), log files, and the generated test data in XML, CSV, or SQL formats (see Listings 2-4). Data definition language (DDL) of the target table is also included in the directory.
The generated DDL for the STAFF table is in Listing 1.
Listing 1. Listing 1. STAFF table DDL
CREATE TABLE ADMINISTRATOR.STAFF ( ID SMALLINT NOT NULL, NAME VARCHAR(9), DEPT SMALLINT, JOB CHAR(5), "YEARS" SMALLINT, SALARY DECIMAL(7, 2), COMM DECIMAL(7, 2) );
The test data as SQL is shown in Listing 2.
Listing 2. Listing 2. Test data as SQL inserts
INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) VALUES (9, 'MMSM', D249, 'Sales', 3, 5886.28, 1039.25); INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) VALUES (8, 'SIII', E280, 'Svcs', 4, 9155.32, 1416.99); INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) VALUES (41, 'IAAI', D287, 'Sales', 3, 10856.39, 1436.19); INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) VALUES (53, 'IMII', F382, 'Sales', 3, 10654.44, 1030.21); INSERT INTO ADMINISTRATOR.STAFF (ID, NAME, DEPT, JOB, "YEARS", SALARY, COMM) VALUES (11, 'SAIM', D316, 'Dev', 3, 7060.19, 1359.52);
For the second test data generation, if you change the output to XML (in the Target options), the result is shown in Listing 3.
Listing 3. Listing 3. More test data in XML format
<?xml version="1.0" encoding="UTF-8" ?> <RESULTS> <ROW num="1"> <COL name="ID">66</COL> <COL name="NAME">SAMM</COL> <COL name="DEPT">F377</COL> <COL name="JOB">Dev</COL> <COL name="YEARS">3</COL> <COL name="SALARY">10264.73</COL> <COL name="COMM">1245.07</COL> </ROW> <ROW num="2"> <COL name="ID">38</COL> <COL name="NAME">IAAS</COL> <COL name="DEPT">F284</COL> <COL name="JOB">Svcs</COL> <COL name="YEARS">4</COL> <COL name="SALARY">7286.96</COL> <COL name="COMM">1418.62</COL> </ROW> <ROW num="3"> <COL name="ID">10</COL> <COL name="NAME">IAMM</COL> <COL name="DEPT">E317</COL> <COL name="JOB">Sales</COL> <COL name="YEARS">2</COL> <COL name="SALARY">8303.52</COL> <COL name="COMM">1956.09</COL> </ROW> <ROW num="4"> <COL name="ID">75</COL> <COL name="NAME">MIIA</COL> <COL name="DEPT">B395</COL> <COL name="JOB">Sales</COL> <COL name="YEARS">3</COL> <COL name="SALARY">5062.28</COL> <COL name="COMM">1551.02</COL> </ROW> <ROW num="5"> <COL name="ID">28</COL> <COL name="NAME">SSII</COL> <COL name="DEPT">D375</COL> <COL name="JOB">Dev</COL> <COL name="YEARS">2</COL> <COL name="SALARY">11826.06</COL> <COL name="COMM">1939.58</COL> </ROW> </RESULTS>
For the third test data generation, the output is CSV (delimited text in Target options), as shown in Listing 4.
Listing 4. Listing 4. Even more test data in CSV format
22,MMMI,C351,Sales,5,8086.68,1778.82 13,SASM,D231,Mgmt,3,9425.59,1422.77 83,SISA,A392,Sales,2,5633.05,1338.64 87,AMMS,A245,R&D,3,6552.82,1573.17 14,AIIM,E211,Ramp;D,2,9977.95,1072.96
Two other possibilities for output, which are not covered here, are fixed width text and direct input to another DB2 table.
Mastering the DB2 Test Database Generator is not effortless, but it is well worth it. In some testing situations, you may only have SQL scripts to create the database but no data, and in other situations you may have a very limited set of test data. This tool enables you to generate a realistic set of data for valid testing.
IBM DB2 Test Database Generator is a tool for creating test data directly to the DB2 database, or the test data can be in CSV, XML, or SQL format. Therefore, allowing generated data to be used for testing in any database system.
- DB2 Product Family evaluation downloads: View downloads available for specific products.
- IBM DB2 Test Database Generator User's Guide: Find an overview of Test Database Generator and its functions, as well as tasks for installing, implementing, and using Test Database Generator.
- IBM DB2 Test Database Generator for z/OS and for Multiplatforms, Version 2.2: Read a fact sheet on the DB2 Test Database Generator.
- DB2 Product Family page: Find information on a variety of products from the DB2 family.
- Build your next development project with IBM trial software, available for download directly from developerWorks.