An introduction to IBM DB2 Connect: It's more than meets the eye: The basics of DB2 Connect

Part 1 of 2

IBM® DB2® Connect™ has become the method of choice for opening up DB2 for z/OS® databases, and all of the traditionally recognized benefits of the zSeries® hardware platform, to the world of applications that run off the mainframe -- namely, distributed applications. This is the first article in a two-part series that introduces the main features of DB2 Connect that enhance your ability to deliver on demand solutions.


Leon Katsnelson, DB2 Development, IBM Canada Ltd.

Leon Katsnelson works in the IBM Toronto Lab where he manages a team of product managers responsible for the IBM DB2 UDB products. Leon is a recognized expert in the areas of DB2 application development and DB2 Connect. Leon has 20 years of experience with complex database and network systems. Leon has worked on DB2 Connect from the very beginning, serving as the product planner, development manager, and product manager. He is often referred to as "the father of DB2 Connect".



Paul Zikopoulos (paulz_ibm@msn.com), DB2 Competitive Technologies Team, IBM Canada Ltd.

Paul C. Zikopoulos, BA, MBA, is an award-winning writer and speaker with the IBM Database Competitive Technologies team. He has more than nine years of experience with DB2 and has written numerous magazine articles and books about it. Paul has co-authored the following books: DB2 Version 8: The Official Guide, DB2: The Complete Reference, DB2 Fundamentals Certification for Dummies, DB2 for Dummies, and A DBA's Guide to Databases on Linux. Paul is a DB2 Certified Advanced Technical Expert (DRDA and Cluster/EEE) and a DB2 Certified Solutions Expert (Business Intelligence and Database Administration). Currently he is writing a book on the Apache Derby/IBM Derby database. You can reach him at: paulz_ibm@msn.com.



10 March 2005

Introduction

In 1993, pundits in the computer industry predicted the imminent demise of the mainframe. The computing infrastructure of the future, they pronounced, would be a highly distributed and loosely connected collection of personal computers and client-server systems. IBM was all but written off as a relevant player in the industry.

Well, we all know what happened. IBM managed to reinvent itself in distributed markets and put the "main" back in mainframe technology. From a pricing perspective, IBM reduced mainframe prices dramatically. On the technology side, IBM moved away from the bipolar technology that powered its mainframes and placed a huge bet on CMOS chip technology as a way to deliver mainframe-class computing at drastically reduced costs. More important, it directly addressed the myth of the mainframe as an outdated technology whose time had passed.

Today, more than ever, businesses place the mainframe as the foundation of their computing infrastructure. At the same time, Linux™, UNIX®, Windows®, and other client-server systems (herein referred to as distributed platforms) have not gone away just because mainframes regained their place in the enterprise.

Instead, these distributed computing infrastructures have evolved; the end result is that clients want to marry the ease and strength of distributed platforms with the unmatched strength of mainframe technology. If there is one area of Information Technology (IT) where this synergy has significant payoffs, it's in the area of database applications.

IBM DB2 Universal Database™ for z/OS (DB2 for z/OS) has transformed itself from a mainframe database into the world's premier database server for Web and client-server applications. In today's data centers, you're as likely (some would say more likely) to encounter DB2 for z/OS serving as the database server for applications running on Windows, UNIX, and Linux application servers as you are to come across CICS or COBOL applications running on the mainframe itself.

It is in this environment that we find the IBM DB2 Connect (DB2 Connect) product playing a central role. Today, DB2 Connect has become the de facto choice for opening up DB2 for z/OS databases, and all of the traditionally recognized benefits of the zSeries hardware platform, to the world of applications that run off the mainframe -- namely, distributed applications.

Why has DB2 Connect succeeded where so many others have failed? This series of articles on DB2 Connect attempts to describe the key features that we believe separate DB2 Connect from competing solutions. There are undoubtedly many non-technical factors that set DB2 Connect apart from other solutions. For example, most customers prefer a single vendor solution because it provides them with a single point of contact for support: "one neck to choke" so to speak. Others like the fact that this solution comes from a large established provider like IBM that they can count on being around for a while (in this connectivity space, many vendors have come and gone or play the rename game). In this series of articles we will focus on technical aspects of a DB2 Connect solution and will leave the non-technical side to the IBM sales folks to address.

This is the first article in a two-part series that introduces the main features of DB2 Connect that you can leverage to greatly enhance your ability to deliver on demand solutions. This article does not discuss DB2 Connect packaging and licensing. For that information, refer to Which Edition of DB2 Connect is Right for You? by Paul C. Zikopoulos and Leon Katsnelson.


What is DB2 Connect? An overview of this series

To really understand what DB2 Connect is, it's likely easier to tell you what it isn't. In the early days of Distributed Database Connection Server (DDCS), the precursor to DB2 Connect, many people described the product as the way to connect DB2 databases on distributed platforms to DB2 databases on the mainframe. So let's clear up one thing right off the bat: DB2 Connect does no such thing, nor are we aware of any product that does what one would describe as "connect databases together". DB2 Connect's purpose in life is to connect applications to data, not one database to another database.

So, in the context of DB2 Connect, what is a database and what is an application? In the IT world when we talk about applications at a high level, we typically mean a system or a collection of systems that help us deliver business solutions. For example, we talk about CRM or ERP applications. For the purposes of a DB2 Connect discussion, we need to use the word application in a much narrower definition.

For this series of articles, when we refer to applications, we are really talking about just a portion of the overall application. More specifically, we mean the application code that implements the user interface (UI) and business logic (and how to get access to the data). In other words, when we say application in this series, we are talking about the actual code that issues queries against databases, processes results, and invokes transactions that change data in the database. For example, a Microsoft® Excel spreadsheet connected to a Siebel CRM application running on a Windows server and connecting to DB2, or a collection of JSPs and Java™ Beans running on a Linux WebSphere® server are examples of applications that we will be discussing in the context of these articles on DB2 Connect.

Notice that the examples above refer to computer code that does not run under control of the z/OS operating system. Instead, we are talking about computer code that runs under the control of Windows, Linux, and so on. This is really the essence of DB2 Connect. It allows access to DB2 servers running under the control of OS/390®, z/OS, i5/OS™, OS/400®, VSE, and VM from computer code that is running on other platforms (Linux, UNIX, and Windows). While DB2 Connect can connect to database servers on all of these platforms, we will be talking mostly about connecting applications to DB2 running under control of z/OS. Figure 1 provides a depiction of what DB2 Connect is really designed to do.

Figure 1. DB2 Connect provides applications running on distributed platforms with a mechanism to work with mainframe-managed data.
High level view of DB2 Connect.

Connecting your code to data with DB2 Connect

So, DB2 Connect is a piece of middleware that connects application code to data. How exactly does it do that?

When you write computer code that interacts with relational databases such as DB2, you have to consider two key items. First, you need to select a language that you will use to interact with the database. This will typically be SQL. Second, you need to select an application programming interface (API) that speaks the language and sends your SQL to the database for processing.

Anybody who has dealt with DB2 knows that DB2 supports a rich and powerful flavor of the SQL language. When using SQL with DB2 Connect, you are using this same SQL that you likely use for other applications. DB2 Connect simply delivers the SQL from the application to the DB2 database for processing. It does not rewrite it or alter it in any way. In other words, if you are skilled in building SQL applications, you're ready to write SQL applications on distributed platforms and have these applications work with DB2 on the mainframe.

Because the implementation of SQL is very compatible across the entire DB2 family of database servers, these applications can also work with DB2 servers on other platforms. But how does an application programmer writing an application in C, Java or Visual Basic.NET submit SQL for processing to DB2? After all, none of these programming languages know anything about SQL. This is where APIs come in. All modern programming languages support APIs for submitting SQL to relational databases.

Programmers familiar with mainframe SQL programming are used to embedded SQL. Embedded SQL is one example of an API that DB2 Connect supports. Other more popular SQL-based APIs found in the distributed computing world include: DB2 CLI, ODBC, OLE DB, ADO, ADO.NET, JDBC, and SQLJ. DB2 Connect supports all of these APIs. This means that an application programmer writing code for Linux, UNIX, Windows, and other distributed platforms (for example, Mac OS, more on this later when we discuss Java) has a means of submitting SQL to DB2 using the API that they choose to work with. Furthermore, they can develop their application code in the programming language of their choice, be that Java, Visual Basic.NET, C#, COBOL, C++, etc.

Most of us have heard of ODBC, JDBC and other such interfaces referred to as drivers. While it may be common to hear the phrase ODBC driver, it is not exactly the same as the term ODBC API (and shouldn't be construed as such). The same can be said about other interfaces.

The ODBC API, for instance, is a specification of a large collection of function calls that programmers can make to databases. A C programmer would call the SQLExecDirect function to send an SQL statement for execution to a relational database management system (RDBMS). SQLExecDirect and the other ODBC functions are implemented by the ODBC driver.
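To make the distinction between an API and a driver concrete, here is a minimal sketch in Python. It is an analogy rather than DB2 Connect code: Python's DB-API (PEP 249) plays the role of the API specification, and the standard library's sqlite3 module stands in for a driver that implements it. The helper names (fetch_scalar, demo_connect), the customers table, and the in-memory SQLite database are all assumptions made purely for illustration.

```python
import sqlite3


def fetch_scalar(connect, sql):
    """Application code written only against the API.

    `connect` is whatever the chosen driver provides. The application uses
    the calls the API specifies (cursor, execute, fetchone) and does not
    care which driver sits behind them.
    """
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(sql)          # the SQL text is handed over unchanged
        return cur.fetchone()[0]
    finally:
        conn.close()


def demo_connect():
    """Stand-in 'driver': an in-memory SQLite database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers (name) VALUES (?)",
        [("Ann",), ("Bob",), ("Cho",)],
    )
    conn.commit()
    return conn


customer_count = fetch_scalar(demo_connect, "SELECT COUNT(*) FROM customers")
print(customer_count)
```

The point of the sketch is the split of responsibilities: fetch_scalar would not need to change if demo_connect were replaced by a connection function from a different driver that follows the same API.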

DB2 Connect drivers

DB2 Connect delivers the drivers that implement all the APIs mentioned in the previous section. However, not all APIs are implemented using a driver. For example, embedded SQL is implemented via a precompiler that reads C or COBOL source code and substitutes SQL calls with C or COBOL function calls that C or COBOL compilers understand. These functions are implemented as part of the libraries provided by DB2 Connect. Some APIs are implemented by the same driver and for several APIs there are multiple drivers available (for example, JDBC). Table 1 summarizes the different APIs and the drivers that come with DB2 Connect.

Table 1. APIs and drivers that come with DB2 Connect
API            Driver
Embedded SQL   Language-specific precompiler
DB2 CLI        DB2 CLI driver
ODBC           DB2 CLI driver
OLE DB         DB2 OLE DB provider
ADO            DB2 OLE DB provider
ADO.NET        DB2 .NET Data Provider
JDBC           DB2 JDBC driver (several types are offered)
SQLJ           DB2 JDBC driver (several types are offered)

The fact that DB2 Connect delivers this very comprehensive collection of drivers leads to likely the second biggest misconception of what DB2 Connect is all about: people often believe DB2 Connect is just a bag of drivers and that is where this product's function ends. The basis for this belief likely comes from the fact that the API drivers, and their associated functionality, performance, and robustness, are so central to developing applications that work with databases, that product investigation sometimes ends there.

Indeed it doesn't help that this is the kind of message that some competing vendors like to propagate about DB2 Connect for the simple reason that their solutions happen to be just a bag of drivers.

Let's talk a bit about the drivers in DB2 Connect since they are so important to your applications. It is important to note that DB2 Connect delivers the broadest collection of drivers in the industry. We believe these drivers provide the highest levels of standards compliance, performance, and robustness in the industry. In fact, the DB2 Connect drivers have become the gold standard by which all other DB2 drivers in the marketplace are measured.

It seems that every year or so, there is some new (or renamed) vendor that makes wild claims of being two, three, even ten times faster than the drivers that come with DB2 Connect. Invariably, these claims are proven to be false and these vendors simply disappear or are relegated to minor roles in the overall market. The richness of the DB2 Connect drivers comes from the fact that they have been around since 1993 and continually prove their mettle in the world's most demanding applications.


An efficient architecture for the DB2 Connect drivers

To implement its drivers, DB2 Connect has a unique infrastructure that adds to their reliability, scalability, and robustness -- we'll talk about that infrastructure in this section.

DB2 Connect and the zSeries Workload Manager

As important as the drivers are, the way that application connectivity to the mainframe is implemented is even more vital. Unlike traditional client-server systems utilizing distributed servers, mainframe servers are not designed for single applications. In fact, one of the key benefits of a mainframe like the zSeries is that it is an inherently multi-workload system. So when working with a shared system such as a mainframe, it is important that distributed applications that use the resources of your system participate in the management of the workload and "play nice" with the other applications that are also using the system. The unfortunate fact (for customers that buy some of the alternative products available in the market today) is that other vendors pay little attention to this requirement.

DB2 Connect, on the other hand, has the functions for minimizing resource usage and participating in the mainframe's workload manager built right into the product. For example, DB2 Connect is fully integrated with the z/OS Workload Manager (WLM) and marshals work to different LPARs on the mainframe based on the directions that it receives from the WLM. As a result, every transaction that comes through DB2 Connect is assigned mainframe resources based on a global system view of resource availability and utilization as opposed to just forcing the work on a random LPAR that may already be overloaded or may have other higher priority work already assigned to it.
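The routing idea can be sketched as follows. This is an illustration only: the member names, the weights, and the pick_member and route helpers are hypothetical, and the actual algorithm DB2 Connect applies to the directions it receives from WLM is not described in this article. The sketch assumes the workload manager reports a relative capacity weight for each member and that each new transaction goes to the member with the lowest load relative to that weight.

```python
def pick_member(weights, active):
    """Return the member with the lowest load relative to its weight.

    weights: dict of member name -> relative capacity reported by the
             workload manager (hypothetical values, each must be > 0)
    active:  dict of member name -> transactions already routed there
    Ties are broken by member name so the result is deterministic.
    """
    return min(weights, key=lambda m: (active.get(m, 0) / weights[m], m))


def route(weights, n_transactions):
    """Route transactions one at a time and return the per-member counts."""
    active = {member: 0 for member in weights}
    for _ in range(n_transactions):
        active[pick_member(weights, active)] += 1
    return active


# LPAR_A is reported as having three times the spare capacity of LPAR_B.
placement = route({"LPAR_A": 3, "LPAR_B": 1}, 8)
print(placement)
```

With these made-up weights, the eight transactions are split 6 to 2, in proportion to the reported capacity, instead of landing on a member chosen at random.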

DB2 Connect and the connection concentrator

All SQL is processed in DB2 in the context of a DB2 thread. Threads are a finite and limited resource. In traditional mainframe applications, thread usage is not a big concern as work is initiated by a handful of applications using a handful of threads. When DB2 for z/OS is used as a server for applications running on distributed platforms, however, thread utilization becomes a very important concern. Imagine thousands of desktop PCs running applications with one or two connections to the database. If you take into consideration that most connectivity methods will assign a dedicated thread to each connection, you don't have to be a math genius to very quickly realize that this is not a sustainable architecture.

DB2 Connect offers a connection concentration feature that multiplexes many database connections on a much smaller set of mainframe threads that handle the work coming from these connections. An administrator decides how much resource (for example, how many threads) they want to allocate to an application. This DB2 Connect feature leads not only to predictable resource consumption that administrators can plan for in advance (this is a good thing, these folks don't like surprises) but also to much better utilization of existing capacity even when the workload is low.

For example, concentrating or multiplexing multiple connections on a single thread allows for a reduction in the number of overall threads. This frees mainframe resources that may have otherwise been used for threads handling barely active client connections. Now you can take advantage of this newly available resource and put it to good use such as increasing the EDM pool size, thereby increasing EDM pool hit ratios and reducing the number of statement prepares. The end result is improved application response times.
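A small simulation shows why multiplexing pays off. It is a sizing sketch under stated assumptions, not a model of DB2 Connect internals: the threads_needed function, the transaction timings, and the idea that a thread is held only for the duration of a transaction are our simplifications for illustration.

```python
def threads_needed(transactions):
    """Compare thread demand with and without connection concentration.

    transactions: list of (connection_id, start, end) tuples, one per
    transaction, with start < end in arbitrary time units.
    Returns (dedicated, concentrated).
    """
    # Dedicated model: one thread per connection, busy or not.
    dedicated = len({conn for conn, _, _ in transactions})

    # Concentrated model: a thread is held only while a transaction runs,
    # so the demand is the peak number of overlapping transactions.
    events = []
    for _, start, end in transactions:
        events.append((start, 1))     # transaction begins: take a thread
        events.append((end, -1))      # transaction ends: give it back
    # At the same instant, process releases (-1) before acquisitions (+1).
    events.sort()

    in_use = peak = 0
    for _, delta in events:
        in_use += delta
        peak = max(peak, in_use)
    return dedicated, peak


# 1000 connections, each running one 5-unit transaction, staggered by 1 unit.
transactions = [(conn, conn, conn + 5) for conn in range(1000)]
dedicated, concentrated = threads_needed(transactions)
print(dedicated, concentrated)
```

In this made-up workload, a thread per connection would need 1000 threads, while at most 5 transactions are ever active at once, so 5 threads could carry the same work.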

The connection concentrator is a feature that is unique to DB2 Connect. It should not be confused with "connection pooling", which DB2 Connect also provides. The key difference is that connection pooling lets an application reuse connection resources only after another application disconnects. The connection concentrator function in DB2 Connect, on the other hand, does not require an application to disconnect for the resources to be reused. In today's world of application servers that establish long-running connections (sometimes measured in weeks or months), connection pooling has become almost irrelevant and connection concentration is the must-have function, especially in high-volume environments. DB2 Connect stands alone in delivering this function to the market.

DB2 Connect and data sharing

DB2 for z/OS has become a benchmark for high availability databases -- even database vendor competitors try to claim that they have mainframe levels of availability -- that's quite the compliment for the mainframe. The mainframe's reputation for availability has been largely achieved due to the availability of the data sharing support on DB2 for z/OS.

Data sharing allows for higher availability. If one DB2 subsystem goes down, or is brought offline for maintenance, other DB2 subsystems in the data sharing group take on the workload of the data sharing member that is now offline. Having a database available twenty-four by seven without applications being able to capitalize on this availability is of no use. Oddly enough, this is what you would get with many other driver vendors in the market today. On the other hand, DB2 Connect is constantly aware of the health and the status of a data sharing group and can provide transparent routing around failures.

For example, when a DB2 subsystem is down, DB2 Connect knows that its resources are unavailable and it will redirect transactions to surviving members of the data sharing group. The relocation of this work is completely transparent to any new transactions that are submitted to the server. Transactions that were in flight at the time of failure get cancelled and an application receives an SQLCODE -904 (resource is not available) error code. Once the application resubmits the cancelled transaction (which you can programmatically handle in your application), it is routed to a functioning DB2 subsystem.
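The resubmission logic an application might use can be sketched like this. The Db2Error class is a hypothetical stand-in for whatever exception your driver raises (real drivers expose the SQLCODE in their own ways), and run_with_retry, flaky_transfer, and the retry limit are illustrative names and values, not part of DB2 Connect.

```python
class Db2Error(Exception):
    """Hypothetical stand-in for a driver exception that carries an SQLCODE."""

    def __init__(self, sqlcode, message=""):
        super().__init__(f"SQLCODE {sqlcode}: {message}")
        self.sqlcode = sqlcode


RESOURCE_UNAVAILABLE = -904  # the SQLCODE mentioned in the text above


def run_with_retry(transaction, max_attempts=3):
    """Run `transaction` (a callable), resubmitting it if it fails with -904."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except Db2Error as err:
            # Only the "resource unavailable" case is retried. Any other
            # error, or running out of attempts, is passed to the caller.
            if err.sqlcode != RESOURCE_UNAVAILABLE or attempt == max_attempts:
                raise


# Simulated transaction: fails once as if a member went down, then succeeds.
calls = {"count": 0}


def flaky_transfer():
    calls["count"] += 1
    if calls["count"] == 1:
        raise Db2Error(-904, "member went offline mid-transaction")
    return "committed"


result = run_with_retry(flaky_transfer)
print(result, calls["count"])
```

The transaction passed in should contain the whole unit of work, since the cancelled transaction has to be resubmitted from its beginning.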

DB2 Connect and the network

The network connecting application systems with the mainframe is another important piece of the infrastructure that should be considered. Proper utilization of this pipe can have a profound effect on the response time, throughput, and reliability of the application and the system overall. DB2 Connect has many features, though invisible, that make a big difference in how network connectivity is used.

For example, DB2 Connect uses result set blocking. When results of a query are sent back to a requesting application, they can be sent a-row-at-a-time every time a requesting application issues a FETCH request. Alternatively, a number of rows can be sent back together in anticipation of the application fetching additional rows. DB2 Connect sends back a block of rows (32 KB) whenever possible. This greatly minimizes the number of times requests and responses have to be sent over the network. More importantly, it greatly reduces the time it takes to get the next row.
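A bit of arithmetic shows the size of the saving. The sketch below is a back-of-the-envelope estimate: the round_trips function and the 100-byte row size are our own illustrative choices, and protocol overhead inside each block is ignored.

```python
import math

BLOCK_BYTES = 32 * 1024  # block size quoted in the text


def round_trips(total_rows, row_bytes, blocking=True):
    """Estimate the request/response exchanges needed to fetch a result set."""
    if total_rows == 0:
        return 0
    if not blocking:
        return total_rows                          # one exchange per FETCH
    rows_per_block = max(1, BLOCK_BYTES // row_bytes)
    return math.ceil(total_rows / rows_per_block)


without_blocking = round_trips(10_000, 100, blocking=False)
with_blocking = round_trips(10_000, 100, blocking=True)
print(without_blocking, with_blocking)
```

For 10,000 rows of 100 bytes each, row-at-a-time fetching needs 10,000 exchanges, while 32 KB blocks hold 327 rows apiece and bring that down to 31.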

Another feature that helps improve response times is next block pre-fetch. Not only does the DB2 Connect server receive rows in blocks, it also requests multiple blocks if it can. When DB2 Connect receives a 32 KB block of data and it knows that there is more data to follow, it will automatically request the next block. This block will arrive as the application is fetching the rows from the first block. Once the first block is exhausted, the application is immediately given the rows from the second block, which were pre-fetched, as DB2 Connect moves ahead and requests the next block. The end result is that the application is never waiting for data as it is always available in local memory to be fetched.
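The effect of overlapping the network transfer with application processing can be estimated with a simple timing model. This is a sketch under our own assumptions: a single block of look-ahead, fixed per-block timings, and a hypothetical fetch_time function. It is not a measurement of DB2 Connect.

```python
def fetch_time(blocks, network_ms, process_ms, prefetch):
    """Elapsed time for the application to consume `blocks` blocks.

    network_ms: time to request and receive one block
    process_ms: time the application spends fetching the rows of one block
    """
    if blocks == 0:
        return 0
    if not prefetch:
        # Each block is requested only after the previous one is used up.
        return blocks * (network_ms + process_ms)

    clock = network_ms            # the first block always has to be waited for
    for _ in range(blocks - 1):
        # While the application works through the current block, the next
        # one is already in flight. The application waits only if the
        # network is slower than its own processing.
        clock += max(process_ms, network_ms)
    return clock + process_ms     # process the final block


serial = fetch_time(10, network_ms=5, process_ms=20, prefetch=False)
overlapped = fetch_time(10, network_ms=5, process_ms=20, prefetch=True)
print(serial, overlapped)
```

With these illustrative numbers the elapsed time drops from 250 ms to 205 ms: after the first block, every network transfer is hidden behind the application's own processing. If the network were slower than the application, some waiting would remain in this model.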

DB2 Connect's optimized network communications infrastructure includes the ability to minimize network flows related to cursor (result set) management. When DB2 Connect detects that there are no more rows in the result set (for example, it received a partially filled block), it will close the cursor on the DB2 for z/OS server. When an application instructs DB2 Connect to close its cursor, DB2 Connect does not have to send any commands to the mainframe server. Rather, it immediately responds to the application that this request has been completed.
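The following toy model counts the messages involved. Everything in it is hypothetical: the BlockedCursor class, its attributes, and in particular the assumption that the early close costs no extra network message, which this article does not specify.

```python
class BlockedCursor:
    """Toy client-side cursor that counts requests sent to the server."""

    def __init__(self, blocks, rows_per_block):
        self._blocks = list(blocks)          # each block is a list of rows
        self._rows_per_block = rows_per_block
        self.server_closed = False           # is the server-side cursor closed?
        self.messages_sent = 0               # requests sent over the network

    def fetch_block(self):
        """Request the next block; a short block means the result set is done."""
        self.messages_sent += 1
        block = self._blocks.pop(0) if self._blocks else []
        if len(block) < self._rows_per_block:
            # Partially filled (or empty) block: no more rows can follow, so
            # the server-side cursor is treated as closed from here on.
            self.server_closed = True
        return block

    def close(self):
        """Application-level close; contacts the server only if still open."""
        if not self.server_closed:
            self.messages_sent += 1
            self.server_closed = True


cursor = BlockedCursor([[1, 2, 3], [4, 5, 6], [7]], rows_per_block=3)
rows = []
while not cursor.server_closed:
    rows.extend(cursor.fetch_block())
cursor.close()                               # answered locally, no message
print(rows, cursor.messages_sent)
```

Three fetch requests retrieve all seven rows, and the application's close adds nothing to the message count because the end of the result set was already detected from the short final block.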

All of the DB2 Connect network optimizations outlined in this section are vitally important to a healthy and scalable system. Communication networks in general are much better suited for sending larger amounts of data in a single network flow rather than sending many smaller transmissions. With the advent of Gigabit Ethernet, TCP/IP networks can now send very large frames (called jumbo frames). DB2 Connect fully utilizes this capability (if turned on) by automatically adjusting the size of the packets it sends or is prepared to receive until it arrives at the optimal size. This feature is particularly useful in environments such as replication and ETL where large data sets are transferred over the network. Moreover, it can coordinate this activity with the DB2 for z/OS continuous block streaming support to ensure data is continually transferred with minimal churning around the telecommunication line.


Wrapping it up

In the first part of the article, we touched upon the different programming interfaces that DB2 Connect provides and the drivers that implement these interfaces. In the last few pages we scratched the surface of the communication infrastructure that DB2 Connect delivers and we saw how this infrastructure greatly reduces the usage of mainframe resources and allows distributed applications to fully exploit the strengths of the mainframe platform (for example, the ability to easily manage mixed workloads and provide continuous application availability).

What you've learned so far starts our journey to understanding just what DB2 Connect really is. After reading this article, it should be clear that DB2 Connect is a potent combination of application enablement APIs and a robust communication infrastructure for enabling distributed applications to use mainframe DB2 servers, as shown in Figure 2.

Figure 2. DB2 Connect provides an infrastructure for connecting applications to data.

The APIs and connectivity infrastructure on their own form a powerful foundation for mainframe database solutions. However, these features alone don't even begin to describe the capabilities that the DB2 Connect product can provide your business.
