IBM DB2® Universal Database™ (DB2 UDB), Version 8.2, Fixpack 10 introduces server-assisted client redirect (SACR), a feature that significantly improves the scalability of applications that use UDB with data-partitioning and WebSphere Application Server. With DB2’s traditional data-partitioning function, clients always contact a central coordinator, and the coordinator forwards requests to the DB2 node containing the needed data. But with server-assisted client redirect, the first node contacted redirects the client to the appropriate node and the client directly contacts that node for all remaining requests within the transaction. Most applications will benefit from the reduced communication overhead, and some will realize very significant performance improvements. This article first describes SACR and how it can be used to build applications that are highly scalable across the number of DB2 servers and WebSphere Application Server clients. It then explains which applications will benefit most from SACR, how to partition data across nodes in order to use SACR, and how to enable SACR for an application using the WebSphere administration console.
Overview of server-assisted client redirect
DB2’s data partitioning feature allows large tables to be distributed across multiple nodes or partitions. When a table is created, the user specifies a partitioning key, that is, a column or set of columns that will be used to partition the rows in the table. If a good partitioning key is chosen, the data will be evenly distributed across the nodes, and for most, if not all, queries, the partitioning key will uniquely identify the node containing the data.
A central coordinator node handles requests from all clients. For each query, it tries to determine which node contains the appropriate data. If it can identify the correct node, it forwards the request to that node, waits for a response from the node, and returns the data to the client. Subsequent requests are treated similarly: the clients always contact the central coordinator node, and the coordinator node retrieves the data from the correct node. If the coordinator node is unable to identify the correct node, it broadcasts the query to all nodes, and then coalesces the responses it receives from them.
Figure 1 demonstrates the case in which a transaction contains two queries, each of which accesses data on different nodes. Eight remote calls are needed to complete the transaction.
Figure 1. Without SACR enabled, and data distributed randomly across nodes
As Figure 2 displays, the number of remote calls is identical if the data is on the same node.
Figure 2. Without SACR enabled, and data all on one node
In many cases, this second scenario is more realistic, meaning that requests are not randomly distributed and that the node that contains the data for one query is more likely to contain the data for the next query from the same client. It is this type of transaction that can take advantage of server-assisted client redirect to reduce communication costs.
When SACR is enabled, DB2 chooses a coordinator for the transaction based on the data location of the first statement. If the first node contacted by the client happens to contain the data needed for the first statement in the transaction, that node will serve as the coordinator node for the remainder of the transaction. But if it does not, it redirects the client to the node containing the data. The client then resends its request and all remaining requests in the transaction to this new node, and it returns the data directly to the client. Redirection takes place a maximum of once per transaction. Ideally all remaining queries in the transaction involve data on the current node, but if data on other nodes is needed, the current node will retrieve the data from the other nodes. When the client begins its next transaction, it contacts the same node, possibly to be redirected once again.
Figure 3 demonstrates that even in the worst-case scenario -- the remaining queries in the transaction are randomly distributed across the nodes -- the number of calls is identical to the case in which SACR is not used, and response time is no worse.
Figure 3. SACR enabled, and data distributed randomly across nodes
Figure 4 shows a more likely scenario -- that the second query in the transaction needs to access data on the same node as the first query. It is here that the benefits of SACR become clear. Allowing the client to communicate directly with the node containing the needed data reduces the number of calls from eight to six in our simple example. This reduction in communications overhead will decrease response time and increase throughput.
Figure 4. With SACR enabled, and data all on one node
The more affinity the requests within a transaction have to a single node, the more an application will benefit from SACR. Figure 5 demonstrates that for transactions containing ten statements, all of which are 100% re-directable to a single node, response time improves 300% with SACR.
Figure 5. Transaction response times when statements are 100% redirectable to a single node
But applications with a greater mix of random and non-random queries can also see performance improvements from SACR. For example, Figure 6 shows the response time improvement when 50% of the statements are redirectable to a single node. If there are two queries per transaction, the improvement is 34%; for five or ten queries per transaction, the improvement is 10%.
Figure 6. Transaction response times when statements are 50% redirectable to a single node
When and how to use server-assisted client redirect
You should consider using server-assisted client redirect if your application is a Java-based OLTP application under WebSphere Application Server 6.0 or later and if you are using the DB2 Universal JDBC Driver Provider, either Type 2 or Type 4.
Using SACR requires two steps. First, you must configure your databases to use DB2 DPF partitioning with two or more nodes. When setting up DB2 DPF, you should select a partitioning key that meets the following requirements:
- The partitioning key should include the columns that are used most often as predicates in queries.
- The partitioning key should include the most frequently joined columns.
- The partitioning key should be made up of columns that often participate in a GROUP BY clause.
- Any unique key or primary key must contain all of the partitioning key columns.
- The partitioning key should not include a column with unevenly distributed data or with a small number of distinct values.
Second, you must enable SACR using the WebSphere administration console. Because SACR is application transparent, no changes are needed to applications in order for them to be used with SACR. Under the custom properties setting for each data source for which you want to enable SACR, add a Boolean property named ‘useTransactionRedirect’ with the value ‘true’.
Figure 7. WAS administration console showing that SACR is enabled
Learn
-
DB2 Administration Guide: Planning: Learn more about data partitioning with DB2.
-
DB2 Infocenter: Database partition groups: Find additional information about designing database partition groups.
-
WebSphere Application Server, Version 6.0 Infocenter: Get general information about configuring and using WebSphere.
-
developerWorks DB2 zone: Learn more about DB2. Find technical documentation, how-to articles, education, downloads, product information, and more.
Discuss
-
developerWorks
blogs: Get involved in the developerWorks community.




