Introduction
Batch processing is an important part of business systems and is used in areas such as billing, report generation, and end-of-day settlement. Because business systems now run worldwide around the clock, batch windows are getting narrower, which makes an efficient batch processing system a real necessity. WebSphere Extended Deployment Compute Grid (hereafter referred to as Compute Grid) is a complete out-of-the-box batch processing platform that provides an efficient, reliable, scalable, highly available, and secure batch execution environment.
This article is based on WebSphere Compute Grid V8. We use the batch job development feature of Rational Application Developer V8 to construct a simple transactional batch application, and then modify it to use the Parallel Job Manager (hereafter called PJM) facility provided by Compute Grid. The article describes, step by step, how to develop a batch application from scratch and how to achieve parallelism in a batch job using PJM.
The sample batch application, named EmployeeBatchV8, takes employee
data from the EMPLOYEE table, does some processing, and then inserts
the updated information into the EMPLOYEEOUTPUT table. We will have
about 10,000 employee input records belonging to employees living in
different states in the United States, with state abbreviations
ranging from AL to WY. Using the Parallel Job Manager
facility, the main job is split into subordinate jobs
(AL-MO and MT-WY) that are processed separately. We override the
Parameterizer system programming interface
(SPI) to provide an independent set of inputs to each subordinate job
so that they can run in parallel on different grid endpoints (GEE) in a
clustered environment. See Figure 1.
Figure 1. Application Overview
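The split into AL-MO and MT-WY can be illustrated with a small, self-contained Java sketch. It does not use the Parameterizer SPI or any other Compute Grid API; the class name StateRangeSketch and its methods are hypothetical and exist only for this illustration. It assumes the ranges follow state-name order (Alabama through Wyoming), which is the ordering in which AL comes first and MO closes the first half.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: plain Java, no Compute Grid classes.
public class StateRangeSketch {

    // The 50 state abbreviations, ordered by state name (Alabama ... Wyoming).
    public static final List<String> STATES = Arrays.asList(
        "AL", "AK", "AZ", "AR", "CA", "CO", "CT", "DE", "FL", "GA",
        "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MD",
        "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ",
        "NM", "NY", "NC", "ND", "OH", "OK", "OR", "PA", "RI", "SC",
        "SD", "TN", "TX", "UT", "VT", "VA", "WA", "WV", "WI", "WY");

    // Splits STATES into 'count' contiguous, non-overlapping slices,
    // one slice per subordinate job.
    public static List<List<String>> split(int count) {
        if (count < 1 || count > STATES.size()) {
            throw new IllegalArgumentException("count must be between 1 and " + STATES.size());
        }
        List<List<String>> slices = new ArrayList<List<String>>();
        int size = STATES.size();
        for (int i = 0; i < count; i++) {
            int from = i * size / count;       // inclusive start index
            int to = (i + 1) * size / count;   // exclusive end index
            slices.add(STATES.subList(from, to));
        }
        return slices;
    }

    // Builds a label such as "AL-MO" from the first and last state of a slice.
    public static String label(List<String> slice) {
        return slice.get(0) + "-" + slice.get(slice.size() - 1);
    }

    public static void main(String[] args) {
        for (List<String> slice : split(2)) {
            System.out.println(label(slice) + " (" + slice.size() + " states)");
        }
    }
}
```

Splitting by position in an ordered list, rather than comparing abbreviation strings, matters here: in plain string order AK sorts before AL and MS falls between MO and MT, so a simple BETWEEN-style comparison on the abbreviations would leave Alaska and Mississippi out of both ranges.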
In this tutorial, you'll learn how to:
- Develop batch applications using Rational Application Developer 8 and the Compute Grid APIs
- Use the Parallel Job Manager facility of Compute Grid
- Deploy the application on a WebSphere Application Server Network Deployment cluster and monitor the jobs
- Use the WSBatchPackager utility to create a batch application from POJO classes
You must be familiar with developing Java applications using an Eclipse-based IDE and with application deployment in WebSphere Application Server Network Deployment (hereafter Application Server).
To run the examples in this tutorial, you need WebSphere Application Server V7.0.0.17 or above (preferably ND) and WebSphere Compute Grid 8.0.0.1 installed in any supported environment. See Figure 2. The environment used for this tutorial is set up as follows:
- Windows XP machine
- WebSphere Application Server 7.0.0.19 and Compute Grid 8.0.0.1 installed
- Network Deployment Manager profile created.
  - profile name: Dmgr01
  - node name: ${shorthostname}CellManager01
  - server: dmgr
- Managed node profile created. (profile name: AppSrv01)
- Managed node federated into the Network Deployment (ND) cell
- Cluster created (cluster name: CGCluster) with 2 servers (server1, server2)
- One additional server named SchedulerClone created in the same cell to act as the scheduler
- DB2 UDB V9.7 installed and the Employee database created. Run the DDL file provided as a download with this tutorial to create the EMPLOYEE and EMPLOYEEOUTPUT tables.
- Compute Grid data sources migrated from the default Derby database to DB2
- XA data source configured in the Application Server console, pointing to the Employee database (JNDI name: jdbc/employeedbxa)
- Rational Application Developer 8.0.2 or later installed, with the Compute Grid Tools for Modern Batch feature turned on
Figure 2. Sample infrastructure diagram





