Using the Java JoinRowSet implementation with DB2 UDB

Joining while disconnected

The efforts of JSR114 bring you the offering of serializable disconnected objects in Java™ 1.5. With these objects, you can connect to a database and grab data locally. You can then release your database connection and manipulate the data offline. At a later time, you can reconnect to your database and synchronize changes made to the disconnected objects. In this article, I will show you how you can perform offline JOIN operations on your IBM® DB2® Universal Database™ data using the JoinRowSet interface. Sample code included.

Kulvir Singh Bhogal (kbhogal@us.ibm.com), Software Services for WebSphere, Fort Worth, TX, IBM

Kulvir Bhogal photoKulvir Singh Bhogal works as an IBM consultant, devising and implementing Java-centric solutions at customer sites across the nation.



07 July 2005

Introduction

The SQL JOIN statement has long been in the toolkit of database programmers, allowing you to combine the data contained in two or more relational database tables based upon a common attribute. In the past, the act of joining had to occur on the database server. In a previous developerWorks article I introduced you to the CachedRowSet Java interface. The CachedRowSet interface, new to Java 1.5, brings us the ability to have serializable disconnected objects which house your database data. CachedRowSet objects allow for offline manipulation of database data and can be resynchronized with the database where the objects were grabbed. You can learn more about the CachedRowSet interface in my article at http://www.ibm.com/developerworks/db2/library/techarticle/dm-0406bhogal/.

In this article I shed light on a subinterface of the CachedRowSet interface called the JoinRowSet interface. As you will see, the JoinRowSet Java interface allows you to perform JOIN-like operations on your database data while disconnected from your database. The JOIN operation occurs on the client, not the database server. Based upon your findings in this article, you might want to consider using JoinRowSet objects in your application architecture. Doing so can offload computational stress from your IBM DB2 Universal Database Server.


Some words about lineage

As stated earlier, the JoinRowSet interface is a subinterface of the CachedRowSet interface. Accordingly, we can piggy-back off of the offerings of the CachedRowSet interface which I shed light on in my CachedRowSet article. The CachedRowSet interface derives from thejavax.sql.RowSet interface which in turn is a subinterface of the java.sql.ResultSet interface.

Since Java does not allow you to instantiate an interface, we need an implementation of the JoinRowSet interface to perform our study. In Java 1.5, Sun ships the com.sun.rowset.JoinRowSetImpl class (hereafter referred to as the JoinRowSetImpl class. We will be focusing our study on the JoinRowSetImpl class. There are other implementations of the JoinRowSet interface offered by other vendors, but I focus this study on the reference implementation provided by Sun. If you are using a pre-1.5 version of Java, and would still like to tap into the offerings of the JoinRowSet as well as other RowSet objects, you can download the reference implementations separately at http://java.sun.com/products/jdbc/download.html.


Getting ready for our study

To perform our study of the JoinRowSet interface, you need a database and some tables. We?ll pretend that we are dealing with a database that holds employee information such as an employee?s name, salary information, as well as an ID of the employee?s department information. The department information of our pretend company will be separated into a different database table and will house information including the department IDs.

Table 1. The EMPLOYEE table
EMPLOYEENOLASTNAMEFIRSTNAMEDEPTIDSALARY
111SmithTerryLAB36,343.00
222RuffinKenLAB49,425.00
333BrownHillaryHR57,626.00
444NguyenJenACC65.312.00
555WallaceChristineLEG19,453.00
Table 1. The DEPARTMENT table
DEPTIDDEPTDESCRIPTIONDEPTLOCATION
LABLab OperationsChicago
ACCAccountingToronto
LEGLegalMiami
TRATravelBoston

A key thing to notice here is the existence of a DEPTID column in both tables.

Before we move on, let?s go ahead and create a database and a couple of tables in DB2 to house the data shown above.

Using the DB2 command line processor, create a database a database named JOINDB. I am going to assume that you have a DB2 administrator defined user named db2admin having a password of db2admin:

db2 => connect to JOINDB user db2admin using db2admin

Let?s move on and create our EMPLOYEE table with the command:

db2 => create table EMPLOYEE (EMPLOYEENUM int primary key not null, 
  LASTNAME varchar(30) not null, 
  FIRSTNAME varchar(30) not null, DEPTID varchar(5) not null, 
  SALARY decimal (8,2) NOT NULL)

To add values to this table, use the syntax:

db2 => insert into EMPLOYEE values(111,'Smith','Terry','LAB',36343.00)

Go ahead and populate the EMPLOYEE table with the rest of the rows shown in Table 1 using insert syntax like that above.

Next, let?s create our DEPARTMENT table with the command:

db2 => create table DEPARTMENT (DEPTID varchar(5) primary key not null, 
  DEPTDESCRIPTION varchar(45) not null, DEPTLOCATION varchar(45) not null)

And follow up with populating the DEPARTMENT table with the data delineated in Table 2 with an INSERT command such as:

db2 => insert into DEPARTMENT values('LAB','Lab Operations','Chicago')

Reviewing what a JOIN does

Before we delve deeper into our JoinRowSet study, let?s issue a simple JOIN command to review what it can do for us. Later, we?ll see how the JoinRowSet function performs the same data-combining operation that our database server side JOIN operation did.

Go ahead and issue the command:

db2 => select * from EMPLOYEE inner join DEPARTMENT on EMPLOYEE.DEPTID=DEPARTMENT.DEPTID

You should get back the following:

Table 3. Results of the server-side JOIN query
EMPLOYEENOLASTNAMEFIRSTNAMEDEPTIDSALARYDEPTIDDEPTDESCRIPTIONDEPTLOCATION
111SmithTerryLAB36,343.00LABLab OperationsChicago
222RuffinKenLAB49,425.00LABLab OperationsChicago
444NguyenJenACC65.312.00ACCAccountingToronto
555WallaceChristineLEG19,453.00LEGLegalMiami

Notice that the INNER JOIN statement was issued using the on EMPLOYEE.DEPTID=DEPARTMENT.DEPTID clause. This effectively combined the data contained in our two tables (EMPLOYEE and DEPARTMENT) based upon a common attribute (the DEPTID value). Use of the INNER JOIN statement excluded the EMPLOYEE row with EMPLOYEENO=333 since there was no DEPTID in the DEPARTMENT table with a DEPTID of HR. Similarly, we do not see a row containing the DEPTID named TRA since no employee in the EMPLOYEE table had a DEPTID of TRA.

Using JoinRowSet lingo, our JOIN operation was performed using a match column of DEPTID. To be a match column, the column of interest needs to be residing in the tables which are being joined.


Using the JoinRowSet

Now that we have reviewed what our JOIN operation does on the server side, let?s see how the JoinRowSet makes a JOIN possible on the client side while being disconnected from the database server. This analysis might best be exemplified by some Java code samples. All of the code segments are part of the Java class associated with this article: JoinRowSetExample.java.

JoinRowSet objects can be populated with any javax.sql.RowSet object (given that the RowSet object can be part of an SQL JOIN operation). The CachedRowSet object interface, which is a subinterface of the RowSet interface, falls under the eligible RowSet objects allowed. If you are not already familiar with the usage of the CachedRowSet, I suggest that you familiarize yourself with the topic by reading my CachedRowSetarticle on developerWorks. In the code below, I create two CachedRowSet objects using the CachedRowSetImpl (implementation) class, one for the EMPLOYEE table as well as one for the DEPARTMENT table:

Class.forName("com.ibm.db2.jcc.DB2Driver");
CachedRowSet employees = new CachedRowSetImpl();
employees.setUsername("db2admin"); 
employees.setPassword("db2admin"); 
employees.setUrl("jdbc:db2://localhost:50000/joindb"); 
employees.setCommand("SELECT * from EMPLOYEE");
employees.execute();

CachedRowSet departments = new CachedRowSetImpl();
departments.setUsername("db2admin"); 
departments.setPassword("db2admin"); 
departments.setUrl("jdbc:db2://localhost:50000/joindb"); 
departments.setCommand("SELECT * from DEPARTMENT");
departments.execute();

In the code above, I populate the CachedRowSet objects by specifying connection information and an SQL query. As an alternative, I could have populated the CachedRowSet object using an existing ResultSet object. Note that to run the sample application I provide, you will need to include the DB2 Universal JDBC driver (db2jcc.jar) in your runtime classpath. You will also need to include the db2jcc_license_cu.jar file. For more information about setting up your environment, refer to my article: Hooking Up with DB2 Universal Database Version 8 using Java.

Now that we have our two CachedRowSet objects which embody the table data of both the EMPLOYEE and DEPARTMENT tables, we are ready to create our JoinRowSet object. But before we do, this, we will demonstrate the ability of being able to disconnect from the database to perform an offline join operation. The sample program is designed to pause and let you literally perform the dramatic operation of stopping DB2 to truly grasp the benefit of the JoinRowSet offering. When the sample program prompts you to stop DB2, you can use the following commands in a new DB2 command line processor window:

db2 force applications all

with a subsequent

db2stop

to force a stop of DB2. Next in our code we perform our offline JOIN:

JoinRowSet join = new JoinRowSetImpl();
join.addRowSet(employees,"DEPTID");
join.addRowSet(departments,"DEPTID");

First I create a JoinRowSet object named join by calling the constructor of the implementation class, JoinRowSetImpl:.

Next, I use the addRowSet method of the JoinRowSet class by calling the addRowSet method, which takes the name of the RowSet object I want to add as the first argument and the name of my match column as the second argument. As you can see, I have specified a match column of DEPTID for both the employees and departments objects.

Alternatively, we can do the same thing we did in the code above by using the setMatchColumn method of the Joinable interface. Since the CachedRowSet implementation class CachedRowSetImpl implements the Joinable interace, we have the ability to call the Joinable method setMatchColumn:

JoinRowSet join = new JoinRowSetImpl();
((Joinable)employees).setMatchColumn("DEPTID");
((Joinable)departments).setMatchColumn("DEPTID");
join.addRowSet(employees);            
join.addRowSet(departments);

After performing our offline JOIN, we can see the data in our JoinRowSet object using getter methods like we do with ResultSet objects. We can get a column?s value either by specifying the column index as an integer or the column name as a String object. I show both approaches in the code:

// display records in JoinRowSet object
while (join.next())
{
 	System.out.println(join.getInt(1) + " - " + 
            join.getString("LASTNAME") + " - " + 
            join.getString(3) + " - " +
            join.getString("DEPTID") + " - " +
            join.getFloat(5) + " - " +
            join.getString("DEPTDESCRIPTION") + " - " +
            join.getString(6));}

The output of our disconnected JoinRowSet object correctly parallels the data we obtained when we performed our server side database JOIN query (compare to Table 3):

555 - Wallace - Christine - LEG - 19453.0 - Legal - Legal
444 - Nguyen - Jen - ACC - 65312.0 - Accounting - Accounting
222 - Ruffin - Ken - LAB - 49425.0 - Lab Operations - Lab Operations
111 - Smith - Terry - LAB - 36343.0 - Lab Operations - Lab Operations

Modifying your disconnected JoinRowSet object

Since the JoinRowSet interface is a subinterface of the CachedRowSet interface, we inherit the abilities to modify (insert, delete, and update) records represented by a JoinRowSet object offline and resynchronize with our database just as we would if we were using CachedRowSet object. You can learn more about how to perform data inserts, deletes, and updates in my CachedRowSetarticle. The key benefit that we gain from the JoinRowSet interface?s relationship with the RowSet interface is the ability to perform our data manipulation offline and then resynchronize our changes with our database.


Crossing and joining database borders

An interesting capability that the JoinRowSet seems to allow is the ability to perform offline JOIN operations across databases (even of different vendors). I didn?t see this documented anywhere so I would use the technique with caution, but I was able to perform a JOIN operation between a MySQL database table and a DB2 database table. This was made possible due to a common Match column that resided in database tables of both MySQL and DB2. Both columns need to have the same column definition (for example varchar(45) if the DEPTID column resided in a MySQL database table named DEPARTMENT as opposed to a DB2 housing database.


JoinRowSetImpl limitations and pitfalls

The JoinRowSetImpl class had a bug in it at the time of this article?s writing that rendered the ability to specify multiple columns for the MatchColumn as useless. I would think that future releases of the class will fix the bug. If it were working, you would be able to specify one or more columns as the matching columns for an offline join operation.

Also at the time of this article?s writing, the JoinRowSetImpl class only implemented the ability to perform an inner JOIN like that showcased in this article. Those familiar with the power of JOIN operations are probably asking "What about LEFT JOIN, RIGHT JOIN, and so on?" Hopefully, future incarnations of the JoinRowSetImpl will cover these JOIN types also, but as it stands, we are only provided the ability to INNER JOIN.


Conclusion

As you have seen in this article, the JoinRowSet interface allows for Java applications to perform JOIN operations on disconnected objects, not requiring an ongoing database connection. Using the JoinRowSet in your applications properly can offload the stress placed upon a database server that a JOIN operation would normally impose. The stress of the JOIN operation is polarized towards client applications.


Download

DescriptionNameSize
JoinRowSet code sampleJoinRowSetExample.java  ( HTTP | FTP )10 KB

Resources

Comments

developerWorks: Sign in

Required fields are indicated with an asterisk (*).


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Information in your profile (your name, country/region, and company name) is displayed to the public and will accompany any content you post, unless you opt to hide your company name. You may update your IBM account at any time.

All information submitted is secure.

Choose your display name



The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

Required fields are indicated with an asterisk (*).

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


All information submitted is secure.

Dig deeper into Information management on developerWorks


static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management, Java technology, IBM i
ArticleID=88289
ArticleTitle=Using the Java JoinRowSet implementation with DB2 UDB
publish-date=07072005