Using IBM Red Brick Warehouse with IBM DB2 Warehouse Manager

This document shares experiences of making IBM DB2 Warehouse Manager work with IBM Red Brick Warehouse.

Qi Jin, Software developer manager , IBM Silicon Valley Lab in San Jose, California

Qi Jin is a software developer manager at the IBM Silicon Valley Lab in San Jose, California. He has worked in the data loading and kernel areas for IBM Red Brick Warehouse for the past six years. You can reach Qi Jin at qijin@us.ibm.com.



09 January 2003

© 2003 International Business Machines Corporation. All rights reserved.

Read the disclaimer first.

Introduction

Data warehouse administrators frequently face the challenge of integrating data in a heterogeneous database environment. You may have data stored in DB2®, but want to take advantage of the capabilities of IBM Red Brick Warehouse to analyze your data. So you're faced with the problem of transferring data from DB2 to Red Brick. What's the best way to accomplish your goal?

Integrating these two database systems through IBM DB2 Warehouse Manager brings technical advantages and business benefits. In this article, I'll share my experiences of making DB2 Warehouse Manager work with Red Brick Warehouse. The configuration suggestions which follow are intended to help you to automate large data transfers between DB2 and Red Brick Warehouse installations.

In an Export, Transform and Load (ETL) process, a database can be either a source of the data flow, or a target of the data flow. In DB2 Warehouse Manager, Red Brick can be configured to export data (as the source) or to receive imported data (as the target) from other sources, such as DB2. These are the areas of concern in making these products work together. Other than configuring a specific Red Brick data source or target, Warehouse Manager processes involving Red Brick are no different than any other data source when you're using data warehouse functions such as transformation, job control and scheduling.

This article has two sections:

  • First we look at how to export data from Red Brick.
    I'll discuss the options for exporting data from a Red Brick Warehouse: using ODBC, or using the Red Brick SQL Export command.
  • The second section covers how to import data into Red Brick.
    Here I'll discuss the ways of importing data into a Red Brick Warehouse: using ODBC, or using the Red Brick loader, called Table Management Utility (TMU), through a User Defined Program (UDP).

I developed and tested the recommendations in this article on a Windows NT® 4.0 workstation with the following software installed:

  • IBM DB2 Universal DatabaseTM V7.2 (with DB2 Control Center) and DB2 Warehouse Manager
  • IBM Red Brick Warehouse 6.11

Before you start

We'll assume that you already have basic working knowledge of DB2 Warehouse Manager as well as Red Brick Warehouse. The business intelligence tutorial in DB2's "First Steps" option is a good starting point for Warehouse Manager. On the other side, walking through the AROMA database example in Red Brick's Administrator's Guide is an excellent way to learn Red Brick.

In my examples I use a Red Brick database called AROMA. AROMA is accessed using the user name system and the password manager. We will be using a table called MARKET which is defined in AROMA. The scripts for creating this database can be found in the Red Brick installation under the sample_input directory.


Exporting data from Red Brick

You may export data from a Red Brick warehouse in one of two ways.

Use ODBC

A Red Brick database can be defined as a Generic ODBC source in DB2 Warehouse Manager by following these steps:

  1. First, you need to define an ODBC data source name (DSN) to access a Red Brick database.
  2. Then, in DB2 Data Warehouse Center, right click the Warehouse Sources folder, then follow Define->Generic ODBC to finish the standard configuation steps.
  3. After that, you can use an SQL step in the Warehouse Manager to retrieve data from any Red Brick table through the ODBC interface.

Use Red Brick SQL Export

First, follow the steps described above in "Use ODBC". Let us assume you used sqlstr as the query string in the original SQL step. Change the query string to:

export to output_file ddl_file tmu_file format_type ( sqlstr );

The detailed export syntax can be found in Red Brick's SQL Reference Guide. The export statement is invoked through ODBC just like any other normal query, but it directly writes the output of the query execution into files, a directory, or a pipe, bypassing the overhead of the ODBC layer. The supported file formats include Fixed-Length, Delimited, and Binary. In addition, support for XML format was added in the Red Brick 6.2 release. All the output files from the SQL Export must be accessible from the local machine.

For example, to export every row in the MARKET table to c:\temp\market.txt, the query string of the SQL step will look like the following:

export to 'c:\temp\market.txt' ddlfile 'c:\temp\market.ddl' tmufile 'c:\temp\market.tmu' 
format external (select * from market);

When you specify external format, as we did in the command above, each column in the output will be stored in fixed length. Here is a sample row from market.txt:

00000000001 Atlanta         GA            Atlanta         South			 
....

As for the other two files, the market.ddl contains a create table statement, which defines each column in the select list column of the query as a table column.The market.tmu file contains a TMU script for loading the market.txt file into the table defined by market.ddl.

The information supplied in these two files can be used to define the schema information for the exported files. Users can then define the exported files as Flat File Warehouse Sources and use them in later ETL steps.

Which export method should you use?

Of the two export methods, the SQL Export method is the preferred one because of its high performance. Note that the files that result from SQL Export must be accessible from the local machine. If users need to move data between machines, they can use an additional FTP step to ship the result files to a remote machine.

The ODBC method is slower than SQL Export. On the other hand, it can move data across machines without requiring additional steps.


Importing data into Red Brick

Like exporting data, you have a choice between two methods for importing data.

Use ODBC

A Red Brick database can be defined as an ODBC data target. There is no native Red Brick data target in DB2 V7.2. Luckily, there's a workaround. Users can use the existing target type "DB2 UDB for VSE". But there are a few problems to be aware of if you choose to use this workaround.

Although the VSE target uses ODBC, it embeds some DB2-only logic. There are two main problems.

  • First, the VSE target tries to find a table called IBMSNAP_REGISTER when retrieving the table list. To get around this problem, you will need to create the table. Log in as a DBA and issue the following SQL to the Red Brick database:
    grant connect, resource to asn with asn; 
     
    connect as asn asn; 
     
    create table ibmsnap_register 
    ( 
    source_owner 				char(18)not null, 
    source_table				char(18)not null, 
    source_view_qual			smallint not null, 
    global_record				char(1) not null, 
    source)_structure			smallint not null, 
    source_condensed			char(1) not null, 
    source_complete				char(1) not null, 
    cd_owner 				char(18), 
    cd_table				char(18), 
    phys_change_owner			char(18), 
    phys_change_table			char(18), 
    cd_old_synchpoint			char(10), 
    cd_new_synchpoint 			char(10), 
    disable_refresh				smallint not null, 
    ccd_owner				char(18), 
    ccd_table				char(18), 
    ccd_old_synchpoint			char(10), 
    synchpoint				char(10), 
    synchtime				timestamp, 
    ccd_condensed				char(1), 
    ccd_complete				char(1), 
    arch_level				char(4) not null, 
    description				char(254), 
    before_img_prefix			varchar(4), 
    conflict_level				char(1), 
    partition_keys_chg			char(1) 
    );

    After defining the IBMSNAP_REGISTER table, you can add the Red Brick database as a Warehouse ODBC Target by following this click sequence: Warehouse Target->Define->ODBC target->DB2->DB2 VSE in Data Warehouse Center.

  • The second problem is that the VSE target generates a CREATE TABLE statement using a table name preceded with a user name qualifier. Unfortunately, Red Brick does not currently support that syntax.

    Because of this syntax difference, you cannot use the "Automatically Generate Target Table" option. To avoid the problem, for SQL steps using the Red Brick target, bring up its properties dialog box, and under the Target Table tab, uncheck "Data Warehouse Center created table". You then must use the Red Brick RISQL program to create the target table separately.

    Note that this problem is fixed in the Red Brick Warehouse 6.20.xC2 release where user name qualifiers are supported.

After resolving these two issues, the Red Brick target can be defined successfully and many functions can be performed.

Use a TMU UDP program

Users can configure a UDP program in Warehouse Manager to use the Red Brick TMU, the high-speed loader for Red Brick Warehouse. To set up UDP parameters, a user specifies a TMU executable path, a TMU control file, a target database, user name and password. After that, the TMU UDP will be available on the tool panel of the Process Model window. It can be selected to create a step in an ETL process. When the step is executed, the TMU will be invoked to load data into the Red Brick database. Please note that the target database does not need to be defined as a Warehouse Target. Note also that the input files here must be accessible from the local machine.

Let's look at an example of defining a TMU UDP. For our example the TMU executable is c:\redbrick\rbw\bin\rb_tmu.exe. The exact location of the TMU executable depends on the Red Brick installation directory. Instead of system and manager, you may have a different user name and password. We want to use this TMU to load c:\temp\market.txt into the target table MARKET in the Red Brick AROMA database.

Here is the SQL defining the MARKET table:

create table market( 
  mktkey integer not null, 
  hq_city char(20) not null, 
  hq_state char(20) not null, 
  district char(20) not null, 
  region char(20) not null);

The input file c:\temp\market.txt is a text file delimited by '*':

04*Houston*TX*New Orleans*South 
05*New York*NY*New York*North

The following TMU control file c:\temp\market.tmu instructs TMU to load market.txt into the MARKET table:

  load data 
	inputfile 'c:\temp\market.txt' 
	append 
	format separated by '*' 
	into table market ( 
	mktkey		integer external, 
	hq_city		char, 
	hqstate		char, 
	district	char, 
	region		char);

Follow these steps to define a TMU UDP:

  1. In the left panel of the Data Warehouse Center window, open the User-Defined Programs and Transformers folder in the Administration folder.
  2. Right-click on it to define a group named "Red Brick TMU".
  3. Right-click on the "Red Brick TMU group" to bring up the "Define Program" window. Name it "Market TMU".
  4. In the Program tab, click Insert, and then edit the parameter entry as follows:
    Name: TMU parameter 
    Value: -d AROMA c:\temp\market.tmu system manager > c:\temp\log 2>&1
  5. In the Agent sites tab, add "Default DWC AgentSite" to the selected agent sites.
  6. Click OK and do a refresh to finish the definition process.

After that, the icon for Market TMU UDP appears in the left panel of the Process Model window. Usually, the Market TMU UDP step immediately follows the previous step, which creates market.txt. The UDP can then be scheduled to run at a specified time or on demand. The TMU's output is in c:\temp\log.

When there is a warning in the load process, for example, a column truncation, TMU typically prints it and proceeds with the load. After the TMU finishes, it exits with return code 1. Because the Warehouse Manager treats any return code of 1 as an error, it will pop up an error message box even when the TMU finishes the load successfully with expected warning messages. Users can choose to ignore this particular error message. The log output in c:\temp\log can be used to check a load's status.

As you can see from the process above, other Red Brick programs can also be used in a very similar way through a UDP program. For example, to get a parallel load, one can use rb_ptmu.exe. In the Red Brick 6.2 release, a client TMU rb_ctmu.exe is added to load data from a local machine to a remote server directly. Using rb_ctmu.exe in the UDP, we can get remote loading.

Which import method should you use?

Of the two import methods I discussed, I would recommend the TMU UDP method for handling large volumes of data. This method does require that the input files and the TMU file be accessible from the local machine. If accessibility is a problem, you can use the client TMU in the Red Brick 6.2 release or use an additional FTP step to ship the files to the local machine.

The ODBC method can move data across machines. However, because it is much slower, the ODBC method is only suitable for handling small volumes of data.


Conclusion

In this article we discussed the alternatives for importing data into and exporting data from Red Brick Warehouse using DB2 Warehouse Manager. Because of their high performance, I recommend the SQL Export command and the TMU UDP program for exporting and importing large amounts of data. Either method will give you one more tool for managing your data warehouse.


Disclaimer

The configuration recommendations contained in this article were derived under specific operating and environmental conditions. While the information has been reviewed for accuracy under the given conditions, the results obtained in your operating environments may vary significantly. Accordingly, IBM does not provide any representations, assurances, guarantees, or warranties regarding results or performance.


Acknowledgements

I am grateful to Rick Cole, Bob Rumsby, Loganathan Venkatachalam, Albert Yao, and Cheung-Yuk Wu for their insightful comments.

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=14403
ArticleTitle=Using IBM Red Brick Warehouse with IBM DB2 Warehouse Manager
publish-date=01092003