Skip to main content

By clicking Submit, you agree to the developerWorks terms of use.

The first time you sign into developerWorks, a profile is created for you. Select information in your profile (name, country/region, and company) is displayed to the public and will accompany any content you post. You may update your IBM account at any time.

All information submitted is secure.

  • Close [x]

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerworks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

By clicking Submit, you agree to the developerWorks terms of use.

All information submitted is secure.

  • Close [x]

DB2 connection routing using Linux load balancing

Monty Wright (montywri@us.ibm.com), Certified IT Specialist, IBM Advanced Technical Support
Author photo
Monty Wright is a certified IT specialist and has been supporting the DB2 brand for 7 years. He has published several articles on DB2 distributed topics and authored the DB2 performance tuning section of the IBM Redbook Tuning IBM eServer xSeries Servers for Performance. As part of the IBM Advanced Technical Support team, he has participated in numerous customer engagements and benchmarks. His specialty areas include performance tuning, database partitioning, and high availability.

Summary:  Spread DB2 client connections across multiple servers with Balance, a powerful open source tool for the Linux platform. This article tells you what DB2 UDB connection routing is and explains how you can use it to balance your client requests on the Linux platform.

Date:  28 Oct 2004
Level:  Introductory

Activity:  7609 views
Comments:  

Introduction

For any database, the first step to accessing data is making a connection from the client to the server. You may need to establish connections from clients to your IBM® DB2® Universal Database™ server across a wide area network or from the Web. If you have many connections coming in, with the potential for overloading your network or server at peak times, you may be looking at implementing a load balancing solution to spread requests across multiple servers. The flexibility of routing these connections by using facilities outside of DB2 has many potential applications. This article discusses one method of routing TCP/IP DB2 connections using an open source TCP proxy utility called Balance.


Understand the background

To understand how the routing utility will affect DB2 connection attempts, you will need to have an understanding of how DB2 TCP/IP connections are established. Let's examine the process from two sides, the server side and the client side.

Server side

Server side TCP/IP communication is set up on an instance-by-instance basis. In DB2 terminology, an instance is a hosting environment for one or more databases. Instance communication setup affects all databases in that instance. Essentially a DB2 instance listens to a TCP port for connection attempts and to communicate with clients. The specific TCP port that will be used is determined by a single instance configuration parameter, SVCENAME. If more than one instance exists on a single server, each must utilize a different TCP ports.

The three potential settings for the SVCENAME parameter are: it could be blank, it could be set to a port name, or it could be set for a port number. Setting the parameter to blank effectively disables TCP/IP communication for the instance. If set to a port name, then a corresponding entry must exist in the services file (operating system-specific port listing file). The services file can be found according to the following table.

Table 1. Location of services file

Operating System Location
Microsoft Windows®C:\windows\system32\drivers\etc\services
or
C:\winnt\system32\drivers\etc\services
UNIX® and Linux/etc/services

Simply specifying an available port number for this parameter is also possible. To enable the instance to communicate via the configured TCP/IP port, the DB2 registry variable DB2COMM must include the 'tcpip' option.

If you look in the services file on your platform, you may find several DB2 entries. These entries may have been created by several means: initial DB2 install and setup, db2icrt command, db2isetup utility, or by an administrator. In general you will not need to modify the existing entries. The DB2 TCP/IP naming convention for client/server communication ports is:

db2c_xxxx #####/tcp

where 'xxxx' is the name of the instance and '#####' is the port number. However, no particular form is required, and the entry for port name simply needs to be identical to the SVCENAME parameter.

The other entries that may have the form beginning "DB2_xxxx" or "DB2_xxxx_#" are for other DB2 purposes and are not related to client/server communication. Again, if you simply choose to provide an available port number in the SVCENAME parameter, then it will not be necessary to have a named communication port in this file.

Changes to these configuration parameters will require a restart of the instance to take affect.

The DB2 administration service communicates via port 523. The Control Center and Configuration Assistant, DB2 GUI administration tools, work with this service to administer DB2 instances and databases.

The netstat command can be used to determine if DB2 is listening for connections. From a shell environment or a command window, issue the command netstat -a to list the TCP ports on which the system is listening. You can filter the results by the SVCENAME parameter to determine if the instance is listening. For example:

SVCENAME = db2c_db2inst1
Use the command:
=> netstat -a |grep db2c_db2inst1


Figure 1. Filter results of netstat with grep
Filter results of netstat with grep

or, SVCENAME = 50100
Use the command:
=> netstat -a |findstr 50100


Figure 2. Filter results of netstat with findstr
findstr

On Windows platforms, findstr perform the equivalent to the grep function on UNIX to search for a substring.

For more information on the setup of DB2 server communications, refer to DB2 online documentation at http://publib.boulder.ibm.com/infocenter/db2help/index.jsp. The topic "Configuring TCP/IP communication for an instance" can be found by searching for "Configuring TCP/IP".

Client side

DB2 client side setup involves configuring the client to communicate with the correct port on the server. To accomplish this task, you place an entry in a special configuration file known as the node directory. The node directory entry must contain the key information hostname (IP address is also acceptable), and communication port (this may be in the form of a port name or number). There is one other entry we need to make before we are ready to connect. The database directory is another special configuration file on the client; it lists what databases are known to the client and what entry in the node directory provides the necessary information to establish a connection. We will work with both of these directories in some later examples. These configuration files are not directly manipulated by the user but are updated by means of the DB2 GUI tools or command line utilities.


Figure 3. Example of database directory entry
database directory

Figure 4. Example of node directory entry
node directory
If a port name is used, then the port entry in the services file must be present on the client as well as the server.

The database directory shows us that database TEST uses node entry NODE_A. NODE_A in the node directory shows us the hosting server is gatesrv, and is listening on port 50100.


The Balance utility

The Balance utility is a GNU General Public License utility available as a free download from http://balance.sourceforge.net. In essence it is a simple but very powerful port forwarding and load balancing tool with failover mechanisms for the Linux platform.

The utility bases its load balancing around the concept of channels. A channel consists of a host address and a port. Channels can be grouped together to create a channel group. You configure Balance to forward connection attempts on a specified port to a channel group. If more than one channel exists in a group, round robin (default) or hash load balancing is performed. If a channel of the channel group is not available, then the next channel in the group is attempted. The forwarding host post is not required to be the same port being controlled by Balance. There are other options, such as setting timeout values for connections, setting maximum number of connections, and setting affinities between incoming IP address connection attempts and destination channels.

Example Setup

To illustrate the capabilities of Balance and DB2, I will use an example scenario, illustrated in Figure 5. While the example will not be exhaustive it should allow you the basic feel of the combined products' capabilities.


Figure 5. Example Balance configuration
Balance configuration

In the example I have two clients, clientA and clientB, that are connecting to databases hosted by three different servers, db2serv1, 192.168.1.101, and db2serv3. The gatesrv server is providing port forwarding for port 50100 to the three servers. The gatesrv server requires no additional software to perform this task other than the Balance utility. It is also not necessary to run the application on a dedicated server but I've used a dedicated server here for clarity.

The clients are connecting to a database named test. How to communicate to the sample database is directed by the node directory entries, NODE_A on client A and NODE_B on clientB. I chose differing node entry names for illustrative purposes only; they could be the same name and likely should be for client setup simplicity. The node entries point to gatesrv server, rather than to a traditional database server. The clients are configured to connect to port 50100 on gatesrv. The server gatesrv has three defined channels, one for each server, creating a channel group. It will direct connection attempts to the three servers.

The servers differ in several ways, including operating systems, instance names, host names, and TCP listening ports; again, this is for illustrative purposes. The only necessary difference is hostname or IP address designation, naturally to differentiate one server from another. The example also illustrates that each server is hosting separate databases. This would not be the case for a DB2 Connect gateway implementation, which I'll discuss later. Connection attempts from the clients can either be round robin routed or hash routed. Round robin would distribute connection attempts evenly among the channels, while hashing would use the client's IP address to determine the destination channel. Hashing allows client connect attempts to always be routed to the same server.

Hosting separate databases is an important issue. If the connecting applications are performing insert, update, or delete activity, then all the databases will not have the same content. If they are read-only databases then this would not be an issue. If you require the databases to be synchronized then you must plan for this process. There are many methods to accomplish synchronization such as DB2 replication. You must be aware of or account for the asynchronous nature of replication if this is your chosen method.

The steps to create the above example are very simple and are provided here. All DB2 commands should be performed under the instance ID, or another member of the SYSADM group, and as the root user for the Balance utility. We will be using the DB2 Command Line; however this could also be accomplished by using the GUIs Client Configuration Assistant (for client configuration) and Control Center (for server configuration).

On the clients:


clientA
   => db2 CATALOG TCPIP NODE node_a REMOTE gatesrv SERVICE 50100
   => db2 CATALOG DATABASE test AS test AT NODE node_a
			


clientB
   => db2 CATALOG TCPIP NODE node_b REMOTE gatesrv SERVICE 50100
   => db2 CATALOG DATABASE test AS test AT NODE node_b
			

To check the entries use the two following commands

				
   => db2 LIST NODE DIRECTORY
   => db2 LIST DATABASE DIRECTORY
			

On the servers:


db2serv1
   => db2 UPDATE DBM CFG USING SVCENAME 50100
192.168.1.101
   => db2 UPDATE DBM CFG USING SVCENAME 50100
			


db2serv3
   => db2 UPDATE DBM CFG USING SVCENAME 60000
			

(Don't forget to restart DB2 after changing this parameter)

On the gateway:


gatesrv
   => balance 50100  db2serv1 192.168.1.101 db2serv3:60000
			

In reality without the variations that I made to demonstrate the configuration options this could be simplified to the same two commands for all the clients, the same command for all the servers and the single command for Balance.


Various implementation scenarios

DB2 Connect™ gateway load balancing

One natural fit for the Balance application and DB2 is to load balance DB2 Connect gateways. DB2 Connect is unique in that it makes DB2 host (zSeries® and iSeries™) data available to distributed systems. While the DB2 Connect gateway does not host the data, it does perform translation work on behalf of the connecting clients. If you wish to load balance this work or provide an alternate path for connecting applications, then the Balance utility may suit your needs. The illustration below shows how this might look.


Figure 6. DB2 Connect in a gateway load balancing configuration
DB2 Connect gateway in load balancing configuration

In the event one DB2 Connect gateway was unavailable, Balance would successfully route connections through the alternate. To implement true high availability, the Balance application would need to be made highly available. This would not be a difficult task but is not covered here.

Client routing during maintenance activities

Routing connection attempts away from a server is another possible benefit of using Balance to manage your DB2 connections. Since the Balance program has the option to be configured dynamically, removing or adding servers from Balance’s routing table will not affect connections to other servers. This might be useful for performing system maintenance tasks.


Figure 7. Client routing during maintenance
Client routing during maintenance

The dotted lines suggests that I can remove ServerB after adding ServerA, or I can simply remove ServerB and add ServerA to quickly reroute connection attempts. If I remove routing to a server, all connections to that server are lost.

Database hosting

Balance is primarily used as a tool to load balance http requests. As clients make requests for Web pages, a farm of servers can serve up the necessary content. The same principle could be applied to database hosting. This may be especially effective when the database is or is largely read-only. One example may be a travel Web site where there is a very large "look to book" ratio. That is, people may perform a great many searches for accommodations but make reservations very infrequently. One issue to be aware of is that you must keep databases synchronized when bookings are made, or create a separate process for the reservation system.


Figure 8. Balance Web requests
Balance Web requests

Alternatives to Balance

DB2 Version 8.2 introduced a new feature, client reroute, enhancing DB2's native client/server communication abilities. To implement this feature you will need DB2 Version 8.2 clients and servers. The essential function of DB2 V8.2's client reroute capability is to move a database connection from one sever to another. This is performed transparently to the user or application.

More information on this release of DB2 can be obtained at http://www.ibm.com/software/data/db2/udb/v82/.

Another interesting feature of DB2 is the ability to "hop" from one database server to another. The database server is usually the final destination of client connection attempts. On occasion the database administrator may wish the clients to use a different database server. Changing the clients' configuration may be either impossible or impractical. I have demonstrated how this might be handled using Balance. DB2 can accomplish a similar function by the database server pointing to an alternative hosting database server. Accomplishing this "hop" simply requires an entry in the DB2 database directory for the alternative host in place of local entry.

Figure 9 shows the database directory for a database, Sample, hosted on the local machine noted by the "indirect" entry type.


Figure 9. Database directory entry for a local database
local database directory entry

If you wanted Sample to be hosted on another server, you need the entry type to be "remote". You will need to create a TCP/IP node entry for the remote server in the node directory, uncatalog this database directory entry, and add remote database directory entry, as shown in Figures 10, 11, and 12.


Figure 10. Catalog the TCP/IP node
Catalog TCP/IP node

Figure 11. Uncatalog the local database
Uncatalog the local database

Figure 12. Catalog the database at the remote node
Catalog the database at the remote node

Summary

The Balance utility provides additional flexibility when developing DB2 solutions. While the use of this tool does not fit all data management scenarios, it does provide a powerful means of distributing data workload. When you evaluate the tool, don't forget to test for data synchronization and Balance availability if a high availability solution is required. You will find this to be one more open source tool that can be used to add power and flexibility to your database solution.


About the author

Author photo

Monty Wright is a certified IT specialist and has been supporting the DB2 brand for 7 years. He has published several articles on DB2 distributed topics and authored the DB2 performance tuning section of the IBM Redbook Tuning IBM eServer xSeries Servers for Performance. As part of the IBM Advanced Technical Support team, he has participated in numerous customer engagements and benchmarks. His specialty areas include performance tuning, database partitioning, and high availability.

Report abuse help

Report abuse

Thank you. This entry has been flagged for moderator attention.


Report abuse help

Report abuse

Report abuse submission failed. Please try again later.


developerWorks: Sign in


Need an IBM ID?
Forgot your IBM ID?


Forgot your password?
Change your password

By clicking Submit, you agree to the developerWorks terms of use.

 


The first time you sign into developerWorks, a profile is created for you. Select information in your profile (name, country/region, and company) is displayed to the public and will accompany any content you post. You may update your IBM account at any time.

Choose your display name

The first time you sign in to developerWorks, a profile is created for you, so you need to choose a display name. Your display name accompanies the content you post on developerWorks.

Please choose a display name between 3-31 characters. Your display name must be unique in the developerWorks community and should not be your email address for privacy reasons.

(Must be between 3 – 31 characters.)

By clicking Submit, you agree to the developerWorks terms of use.

 


Rate this article

Comments

static.content.url=http://www.ibm.com/developerworks/js/artrating/
SITE_ID=1
Zone=Information Management
ArticleID=24190
ArticleTitle=DB2 connection routing using Linux load balancing
publish-date=10282004
author1-email=montywri@us.ibm.com
author1-email-cc=