J2EE architecture emphasizes centralizing core business logic into reusable applications. These applications become easier to maintain when centralized and deployed to middleware such as WebSphere. Modern middleware addresses many significant problems that had plagued traditional IT infrastructures: ensuring that the application is highly available, scalable, and secure, and that data integrity is robust. One key feature that J2EE application servers do not directly address, however, is the ability to access business logic asynchronously in a batch processing environment.
Asynchronous batch processing is an integral part of any complete IT solution and has existed on the mainframe for decades. Banking solutions use batch extensively, for example, to assess interest on accounts or compute credit ratings. As J2EE and open standards become more pervasive in the marketplace, business applications will migrate from native technology to these open standards and run on middleware such as WebSphere. In this evolution, businesses will lose the ability to efficiently access those business applications asynchronously. WebSphere Extended Deployment (hereafter called Extended Deployment) delivers Business Grid technology to help address this problem. Business Grid is the overall execution environment for two types of workloads: Compute Intensive and Batch. While Business Grid encompasses both workload types, this article focuses on batch processing and how it is managed by the Business Grid.
Batch jobs consist of units-of-work that must be executed on one or more records. These records are stored in some type of file system or database and are fed to this unit-of-work in a loop. Distinct tasks of a batch application can be divided into batch steps. The entire batch job is complete when each batch step has been processed. The Business Grid is a long-running execution environment that allows you to access business applications in a batch-like, asynchronous manner. It defines a J2EE-based programming model to which you can write batch applications. We discuss the details of the programming model later, but generally you must implement a processJobStep() method in your Java implementation class. This processJobStep() method is synonymous with the unit-of-work previously described. The Extended Deployment long-running execution environment calls this method repeatedly in a loop as shown in Figure 1.
For more detailed information on batch jobs, see the WebSphere Extended Deployment Information Center.
Figure 1. The execution environment invokes the processJobStep() method defined on the batch step in a batch loop
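The batch loop in Figure 1 can be sketched in plain Java. This is only an illustration of the calling pattern: the `SketchBatchStep` interface, its return codes, and the `SketchBatchLoop` driver are hypothetical names invented here, not the Extended Deployment programming model itself.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a batch step; not the product interface. */
interface SketchBatchStep {
    int STEP_CONTINUE = 0;   // illustrative return codes, chosen for this sketch
    int STEP_COMPLETE = 1;

    /** Processes one unit of work and reports whether more work remains. */
    int processJobStep();
}

/** A step that upper-cases one input record per invocation. */
class UpperCaseStep implements SketchBatchStep {
    private final Iterator<String> input;
    final List<String> output = new ArrayList<>();

    UpperCaseStep(List<String> records) {
        this.input = records.iterator();
    }

    @Override
    public int processJobStep() {
        if (!input.hasNext()) {
            return STEP_COMPLETE;
        }
        output.add(input.next().toUpperCase());
        return input.hasNext() ? STEP_CONTINUE : STEP_COMPLETE;
    }
}

/** Plays the role of the execution environment's batch loop. */
class SketchBatchLoop {
    /** Calls processJobStep() until the step reports completion; returns the call count. */
    static int run(SketchBatchStep step) {
        int calls = 0;
        int rc;
        do {
            rc = step.processJobStep();
            calls++;
        } while (rc == SketchBatchStep.STEP_CONTINUE);
        return calls;
    }
}
```

Calling `SketchBatchLoop.run(new UpperCaseStep(records))` drives the step once per record, which mirrors how the application supplies only the unit-of-work while the environment owns the loop.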
Two runtime components make up the Business Grid: one Long-Running Scheduler (LRS) and one or more Long-Running Execution Environments (LREE), as shown in Figures 2a and 2b. The LRS is primarily a dispatcher:
- A Web service or IIOP request submits a batch job to the LRS using metadata described in an XML dialect known as xJCL
- The LRS uses placement and workload algorithms to determine which LREE should execute this job
- The LRS dispatches the job to that LREE
- The LREE executes the job
- The LRS is notified of state changes to the job itself
- State changes are persisted in database tables specific to the LRS
Figure 2a. Overview of WebSphere Extended Deployment Business Grid's LRS and LREE components
Figure 2b. End-to-end view of how batch jobs are dispatched to the Business Grid
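The dispatch sequence above can be modeled with a small Java sketch. The article does not describe the actual placement and workload algorithms, so the "fewest running jobs" rule below is purely an assumption, and `SketchLree`, `SketchScheduler`, and the in-memory `stateLog` (standing in for the LRS database tables) are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of one execution environment (LREE). */
class SketchLree {
    final String name;
    final List<String> runningJobs = new ArrayList<>();

    SketchLree(String name) {
        this.name = name;
    }
}

/** Hypothetical dispatcher standing in for the LRS. */
class SketchScheduler {
    private final List<SketchLree> lrees;
    // Stands in for the LRS database tables that persist job state changes.
    final List<String> stateLog = new ArrayList<>();

    SketchScheduler(List<SketchLree> lrees) {
        this.lrees = lrees;
    }

    /** Assumed placement rule: pick the LREE with the fewest running jobs. */
    SketchLree submit(String jobName) {
        if (lrees.isEmpty()) {
            throw new IllegalStateException("no LREE available");
        }
        SketchLree target = lrees.get(0);
        for (SketchLree candidate : lrees) {
            if (candidate.runningJobs.size() < target.runningJobs.size()) {
                target = candidate;
            }
        }
        stateLog.add(jobName + ":submitted");
        target.runningJobs.add(jobName);
        stateLog.add(jobName + ":executing@" + target.name);
        return target;
    }

    /** Called when an LREE reports that a job has ended. */
    void jobEnded(SketchLree lree, String jobName) {
        if (lree.runningJobs.remove(jobName)) {
            stateLog.add(jobName + ":ended");
        }
    }
}
```

The point of the sketch is the division of labor: the scheduler only chooses a target and records state changes, while the work itself happens in the execution environment.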
A typical batch application must address the following questions:
- What specific steps must be executed to complete this batch job?
- What is the source of the input data?
- What is the destination of the output data?
- How frequently should the state of the overall job be saved (checkpointed) and therefore be restorable?
- How should the return codes for each job step be handled?
- Is there a conditional flow between the steps in the job?
The Business Grid defines a programming model that specifies how to form the answers to each of the questions. It manages several key pieces of a batch job:
- Positioning and repositioning data streams
- Checkpointing the job at some predefined interval
- Processing the return codes from steps and/or the overall job
- Passing each record to the unit-of-work defined in the application
- Prioritizing the dispatching of jobs based on the service policy applied to them
The Business Grid introduces the concept of a Batch Data Stream (BDS). A BDS is a positionable data source. Some typical BDS types include:
- Service Data Objects (SDO)
- MVS datasets (using jZOS, see http://jzos.com/)
- Relational databases
- Any other positional resources
The batch step processes an input BDS containing the records or pieces of records. The batch step writes the results of that processing to an output BDS. The Business Grid manages the checkpointing of these BDSs so that they can be repositioned at the last checkpoint if the batch job is restarted (after a failure, for example).
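To make "positionable" concrete, here is a minimal Java sketch of a stream that can save its position at a checkpoint and return to it later. The interface and method names (`SketchPositionableStream`, `externalizePosition`, `restorePosition`) are assumptions made for illustration and are deliberately simpler than whatever the product's BDS contract requires.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified view of a positionable data source. */
interface SketchPositionableStream {
    /** Returns the current position in a form that can be saved at a checkpoint. */
    String externalizePosition();

    /** Moves the stream back to a previously saved position. */
    void restorePosition(String saved);
}

/** In-memory input stream over a list of records. */
class SketchListInputStream implements SketchPositionableStream {
    private final List<String> records;
    private int next = 0;

    SketchListInputStream(List<String> records) {
        this.records = new ArrayList<>(records);
    }

    boolean hasNext() {
        return next < records.size();
    }

    String read() {
        if (!hasNext()) {
            throw new IllegalStateException("end of stream");
        }
        return records.get(next++);
    }

    @Override
    public String externalizePosition() {
        return Integer.toString(next);
    }

    @Override
    public void restorePosition(String saved) {
        int position = Integer.parseInt(saved);
        if (position < 0 || position > records.size()) {
            throw new IllegalArgumentException("position out of range: " + saved);
        }
        next = position;
    }
}
```

If a job fails after reading past a checkpoint, restoring the saved position makes the next `read()` return the first record after that checkpoint, so no record is skipped on restart.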
Listing 1 describes a BDS declaration within the xJCL.
Listing 1. A BDS declaration within the xJCL
<batch-data-streams>
  <bds>
    <logical-name>myoutput</logical-name>
    <impl-class>com.ibm.websphere.samples.PostingOutputStream</impl-class>
    <props>
      <prop name="FILENAME" value="somefile" />
    </props>
  </bds>
</batch-data-streams>
The Business Grid supports checkpointing batch jobs, that is, saving the state of the job at some selected interval. This lets you restart the job at its previous checkpoint on any other LREE if necessary. Checkpointing is important in batch processing because it allows you to recover more efficiently from failures, instead of starting the entire job over from the beginning. Business Grid has two predefined checkpoint algorithms: Timer-Based and Record-Based.
- Timer-Based algorithm: checkpoints occur after some specific amount of time has passed, (e.g., every fifteen seconds).
- Record-Based algorithm: checkpoints occur after some specified number of records has been processed (e.g., every one hundred records).
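The two policies can be sketched as small Java classes. This is not the product's checkpoint algorithm SPI; `SketchCheckpointPolicy` and both implementations are hypothetical, and the timer-based version takes an injected clock (an assumption made so the behavior can be exercised without waiting).

```java
import java.util.function.LongSupplier;

/** Hypothetical policy interface; not the product's checkpoint SPI. */
interface SketchCheckpointPolicy {
    /** Called after each unit of work; returns true when a checkpoint is due. */
    boolean shouldCheckpoint();
}

/** Checkpoints after every N processed records. */
class SketchRecordBasedPolicy implements SketchCheckpointPolicy {
    private final int recordsPerCheckpoint;
    private int sinceLast = 0;

    SketchRecordBasedPolicy(int recordsPerCheckpoint) {
        if (recordsPerCheckpoint <= 0) {
            throw new IllegalArgumentException("recordsPerCheckpoint must be positive");
        }
        this.recordsPerCheckpoint = recordsPerCheckpoint;
    }

    @Override
    public boolean shouldCheckpoint() {
        sinceLast++;
        if (sinceLast >= recordsPerCheckpoint) {
            sinceLast = 0;
            return true;
        }
        return false;
    }
}

/** Checkpoints once at least intervalMillis have elapsed since the last checkpoint. */
class SketchTimerBasedPolicy implements SketchCheckpointPolicy {
    private final long intervalMillis;
    private final LongSupplier clock;   // injected so the policy can be tested
    private long lastCheckpoint;

    SketchTimerBasedPolicy(long intervalMillis, LongSupplier clock) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.lastCheckpoint = clock.getAsLong();
    }

    @Override
    public boolean shouldCheckpoint() {
        long now = clock.getAsLong();
        if (now - lastCheckpoint >= intervalMillis) {
            lastCheckpoint = now;
            return true;
        }
        return false;
    }
}
```

In production use the clock would typically be `System::currentTimeMillis`. Because the policy is consulted between units of work, a timer-based checkpoint fires at the first unit-of-work boundary after the interval elapses rather than at the exact instant.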
Checkpoint algorithms directly affect the transactional behavior of a batch application. The Business Grid begins a global transaction at the start of a checkpoint and commits that global transaction once the checkpoint has completed successfully. This allows the runtime to roll back any work that took place within the checkpoint when a failure occurs. Resource considerations, such as database locks, open file descriptors, and memory, must play an integral role in deciding how frequently to checkpoint. Listing 2 shows a checkpoint algorithm within the xJCL.
Listing 2. Checkpoint algorithm declaration within the xJCL
<checkpoint-algorithm name="timebased">
  <classname>com.ibm.wsspi.batch.checkpointalgorithms.timebased</classname>
  <props>
    <prop name="interval" value="15" />
  </props>
</checkpoint-algorithm>
The Business Grid also provides callbacks for results algorithms. Results algorithms are points within the application that can process job step and overall job return codes. These results algorithms examine the return codes and can use any J2EE service provided by the middleware to process them. For example, a results algorithm can place a message on a JMS queue or invoke a web service to indicate failure or success. Listing 3 describes the results algorithm within the xJCL.
Listing 3. Results algorithm declaration within the xJCL
<results-algorithms>
  <results-algorithm name="jobsum">
    <classname>com.ibm.wsspi.batch.resultsalgorithms.jobsum</classname>
  </results-algorithm>
</results-algorithms>
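As an illustration of what a results callback might do, the Java sketch below folds step return codes into a job return code and records a notification for any non-zero step. Everything here is hypothetical: the `SketchResultsAlgorithm` interface is not the product SPI, and the "highest return code wins" rule is an assumption that is not claimed to match what the `jobsum` class does.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback; the real results-algorithm SPI may look different. */
interface SketchResultsAlgorithm {
    /** Folds one step's return code into the job's return code so far. */
    int onStepEnd(String stepName, int stepRc, int jobRcSoFar);
}

/** Assumed policy: the job return code is the highest step return code seen. */
class SketchHighestRcAlgorithm implements SketchResultsAlgorithm {
    final List<String> notifications = new ArrayList<>();

    @Override
    public int onStepEnd(String stepName, int stepRc, int jobRcSoFar) {
        if (stepRc != 0) {
            // A real implementation could place a JMS message on a queue or
            // invoke a web service here; this sketch only records the event.
            notifications.add(stepName + " failed with rc=" + stepRc);
        }
        return Math.max(stepRc, jobRcSoFar);
    }
}
```

The environment would call `onStepEnd` once per step, threading the running job return code through each call.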
Each component of a batch application is defined by metadata in the xJCL XML file. The Extended Deployment Business Grid Programming Guide describes each element of the xJCL and provides samples of how they should be defined. The metadata generally follows the structure shown in Listing 4.
Listing 4. General description of xJCL
<?xml version="1.0" encoding="UTF-8"?>
<job name="PostingsSampleEar" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <jndi-name>ejb/com/ibm/websphere/samples/PostingsJob</jndi-name>
  <step-scheduling-criteria>
    .......................
  </step-scheduling-criteria>
  <checkpoint-algorithm name="timebased">
    ...............
  </checkpoint-algorithm>
  <results-algorithms>
    ...............
  </results-algorithms>
  <job-step name="Step1">
    <jndi-name>ejb/DataCreationBean</jndi-name>
    <checkpoint-algorithm-ref name="timebased" />
    <results-ref name="jobsum" />
    <batch-data-streams>
      <bds>
        <logical-name>myoutput</logical-name>
        <impl-class>com.ibm.websphere.samples.PostingOutputStream</impl-class>
        <props>
          <prop name="FILENAME" value="c:\temp\batchjoboutput\bjo.zout1.out" />
        </props>
      </bds>
    </batch-data-streams>
  </job-step>
</job>
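Because xJCL is ordinary XML, its structure can be inspected with the standard JAXP DOM API. The sketch below, a rough illustration rather than anything the Business Grid itself does, lists each job-step and the logical name of its first BDS; the `SketchXjclReader` class is hypothetical, and the element names are taken from Listing 4.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Reads step names and BDS logical names from an xJCL-like document. */
class SketchXjclReader {
    /** Maps each job-step name to the logical name of its first BDS ("" if it has none). */
    static Map<String, String> stepsToFirstBds(String xjcl) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xjcl.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse xJCL", e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        NodeList steps = doc.getElementsByTagName("job-step");
        for (int i = 0; i < steps.getLength(); i++) {
            Element step = (Element) steps.item(i);
            NodeList names = step.getElementsByTagName("logical-name");
            String bds = names.getLength() > 0
                    ? names.item(0).getTextContent().trim()
                    : "";
            result.put(step.getAttribute("name"), bds);
        }
        return result;
    }
}
```

A check like this can be handy for validating job definitions before submission, for example to confirm that every step declares the data streams its bean expects.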
Figure 3 shows a sequence diagram for how the LREE drives a batch step. If there are multiple steps in the xJCL, the LREE drives this sequence for each step sequentially. Important: The LREE calls all methods in the diagram below under a global transaction context managed by the LREE.
Figure 3. Sequence diagram for how the LREE drives a batch step
In this section, we solve two sample problems using Business Grid technology.
Problem: Several types of applications (retirement modeling, student loan forecasting, etc.) access a financial calculator. This financial calculator application is their "kernel" application, which must be available to numerous other banking applications, including asynchronous batch-type applications that execute tasks such as calculating interest and credit scores.
The data that the financial calculator must process asynchronously resides in EBCDIC in an IMS database on the mainframe. This data must remain on the mainframe to take advantage of the robust security, high-availability, and scalability features of z/OS and WebSphere on z/OS. How do we asynchronously process this native data while reusing our financial calculator? Figure 4 shows a visual representation of the problem.
Figure 4. Problem: A financial calculator module needs to be re-used by multiple applications in an asynchronous manner
Solution: Invoke financial calculator business logic from a batch step and use Batch Data Streams to read native data sets with jZOS APIs. This is shown in Figure 5.
Figure 5. Solution: Use an Extended Deployment batch application to re-use core business logic
The steps the batch application in Figure 5 takes are:
- Create two jZOS Batch Data Streams: one for input, one for output.
- Define a batch step bean with a processJobStep() method. This bean contains logic to retrieve a record from the jZOS BDS and to invoke the financial calculator under the covers with the retrieved record.
- Repeatedly invoke the processJobStep() method of the batch step bean in a batch loop.
- Upon each iteration of processJobStep(), the batch step bean:
  - Creates a Java object that represents the data retrieved from the input jZOS BDS and passes that object to the financial calculator for processing.
  - Takes the output of the processing and persists the data to the output MVS dataset via the output jZOS BDS.
- Continue iterating until each input record has been processed and its output persisted.
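The steps above can be sketched in Java with in-memory lists standing in for the jZOS Batch Data Streams. This is a simplified illustration: the record layout ("accountId,balanceInCents"), the `SketchAccount`, `SketchFinancialCalculator`, and `SketchInterestStep` names, and the boolean return value are all assumptions, and no jZOS or EBCDIC handling is shown.

```java
import java.util.Iterator;
import java.util.List;

/** Hypothetical record parsed from one input line: "accountId,balanceInCents". */
class SketchAccount {
    final String id;
    final long balanceCents;

    SketchAccount(String id, long balanceCents) {
        this.id = id;
        this.balanceCents = balanceCents;
    }

    static SketchAccount parse(String line) {
        String[] parts = line.split(",");
        return new SketchAccount(parts[0].trim(), Long.parseLong(parts[1].trim()));
    }
}

/** Stand-in for the shared "kernel" business logic. */
interface SketchFinancialCalculator {
    long interestCents(SketchAccount account);
}

/** Batch step that feeds one record per call to the calculator. */
class SketchInterestStep {
    private final Iterator<String> input;   // stands in for the input BDS
    private final List<String> output;      // stands in for the output BDS
    private final SketchFinancialCalculator calculator;

    SketchInterestStep(List<String> inputRecords, List<String> output,
                       SketchFinancialCalculator calculator) {
        this.input = inputRecords.iterator();
        this.output = output;
        this.calculator = calculator;
    }

    /** Returns true while more records remain, false when the step is complete. */
    boolean processJobStep() {
        if (!input.hasNext()) {
            return false;
        }
        // Build a Java object from the raw record and hand it to the calculator.
        SketchAccount account = SketchAccount.parse(input.next());
        long interest = calculator.interestCents(account);
        // Persist the result through the output stream.
        output.add(account.id + "," + interest);
        return input.hasNext();
    }
}
```

The calculator is passed in as an interface so that the same business logic can be shared with online callers; the batch step only adapts records to and from it.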
Although you could solve this problem without using the Business Grid, the solution is not as efficient or as elegant. A WebSphere MQ solution is shown in Figure 6.
Figure 6. A WebSphere MQ solution for the financial calculator problem
If you opted to use a WebSphere MQ solution, it would take the following steps:
- From a native COBOL application:
  - Dispatch the IMS data to WebSphere using message-driven beans and WebSphere MQ messaging
  - Convert each record to ASCII XML and place it on the message queue
- When the record is on the message queue, the WebSphere framework would:
  - Notify a message-driven bean within WebSphere
  - Retrieve that message from the queue with the bean
  - Pass that message on to the business logic
- The business logic would then:
  - Convert the XML into a Java object
  - Pass the Java object to a kernel bean and perform the financial calculations on it
  - Convert the Java object back to XML
  - Push the XML back onto the message queue
- WebSphere would then notify the native application and persist the message back into the IMS database
Why is the Business Grid solution better? It provides:
- Flexibility of adjusting the checkpoint algorithm without having to modify any application code. This in turn affects the transaction scope and resource locking schemes.
- Flexibility of changing data sources by defining new Batch Data Streams and isolating the application changes to specific sections of the code, namely the BDS definitions.
- Eliminating the need to massage the data format via intermediaries. Business Grid is able to parse the raw data via the BDS definition and convert that raw data directly into Java objects, as opposed to converting data to XML, to ASCII, and so on.
- Ability to recover from the last checkpointed position in case of system failures, for example, a temporary network or database failure.
- Ability to temporarily suspend or cancel jobs in case of resource contention. For example, if an unexpected peak in online workload occurs, Business Grid can suspend work and use the server for online transaction processing.
- Ability to assign service policies to jobs, for example, giving higher priority to platinum customer batch workloads.
- Simplified system administration of the batch environment, since Business Grid integrates with the WebSphere Admin Console to provide a view of the batch jobs in the system.
- Ability to quickly create a conditional flow of job steps using xJCL.
- Ability to use health management features of WebSphere Extended Deployment to proactively manage failures, such as memory leaks, by enabling automatic restart of jobs after recovering from a server failure.
Problem: Imagine that you are designing a batch application to process purchase orders. The batch application receives purchase orders over a WebSphere MQ queue. The application processes the purchase orders in a batch window at night, when online application activity is not at its peak. To get through a day's batch of purchase orders, the application must be extremely optimized and hence written by highly skilled programmers who are familiar with all the eccentricities of the underlying technologies and platforms on which the application runs. In worst-case scenarios, some purchase orders do not make the batch window and customers have to wait another day for the order to be processed, which hurts customer satisfaction. Furthermore, analysis of the servers indicates that most machines are not fully utilized during the day, while purchase orders await their turn on the MQ queue.
Solution: Make the purchase order application a WebSphere Extended Deployment batch application and let the Business Grid run the batch job whenever idle servers are available in the WebSphere cell. Figure 7 outlines this solution.
Figure 7. Batch processing of JMS-based purchase orders
The flow in Figure 7 runs from left to right, starting with the submission of xJCL to the LRS and ending with the LREE driving the Purchase Order batch application to process records on the MQ queue.
The steps in the scheduler flow are:
- xJCL representing job metadata is stored in the LRS database. The LRS has an xJCL Repository feature that allows xJCL to be stored in and invoked from the LRS database.
- In this scenario, you have two options for triggering a submit request to the LRS for the purchase order job:
  - Use the LRS recurring jobs feature to submit the purchase order job at certain intervals: for example, submit the purchase order job every three hours or at 3 pm every day.
  - Write a custom trigger that monitors the queue for purchase orders. The trigger uses the LRS Web services interface to submit the job when the queue reaches a certain threshold of purchase orders: for example, submit a job when at least 10 purchase orders are pending.
- Upon receiving the job request, the LRS finds an idle application server in the WebSphere Extended Deployment cell to which to dispatch the request and asynchronously invokes the LREE on that application server to start the batch job.
- The LREE invokes the batch application flow and sends status to the LRS about the progress of the job.
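The custom trigger option can be sketched as a small polling class. The names here are hypothetical: `SketchJobSubmitter` stands in for a client of the LRS Web services interface (whose actual operations are not shown in this article), and the queue depth is supplied by the caller rather than read from WebSphere MQ.

```java
import java.util.function.IntSupplier;

/** Hypothetical client for the scheduler's job-submission interface. */
interface SketchJobSubmitter {
    void submit(String jobName);
}

/** Submits a job whenever the pending-order count reaches a threshold. */
class SketchQueueTrigger {
    private final IntSupplier queueDepth;   // supplied by the caller, e.g. a queue-depth query
    private final SketchJobSubmitter submitter;
    private final int threshold;
    private final String jobName;

    SketchQueueTrigger(IntSupplier queueDepth, SketchJobSubmitter submitter,
                       int threshold, String jobName) {
        this.queueDepth = queueDepth;
        this.submitter = submitter;
        this.threshold = threshold;
        this.jobName = jobName;
    }

    /** Intended to be called periodically; returns true if a job was submitted. */
    boolean poll() {
        if (queueDepth.getAsInt() >= threshold) {
            submitter.submit(jobName);
            return true;
        }
        return false;
    }
}
```

With a threshold of 10, `poll()` submits the job once at least 10 orders are pending. A real trigger would also need to avoid resubmitting while a previously submitted job is still draining the queue; that bookkeeping is omitted here.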
The steps in the batch application flow are:
- The LREE calls the processJobStep method of the batch step.
- The batch step uses a JMS-based input Batch Data Stream (BDS) to read records from the purchase order queue. Since processJobStep is called under a global transaction, BDS access to the queue is transactional.
- The batch step invokes business logic to process the purchase order(s).
- The batch step uses an output JMS BDS to write a confirmation or completed purchase order object to WebSphere MQ. Since processJobStep is called under a global transaction, the BDS access to the queue is transactional.
- The processJobStep method returns and the LREE consults the checkpoint policies to determine if it's time for a checkpoint. If a checkpoint is taken, the current transaction is committed; the JMS provider then permanently removes the processed input messages from the input queue and commits the output messages to the output queue. A new transaction is then started.
- processJobStep is called again to process the next record(s).
- This cycle continues until all records on the queue are processed. If a failure occurs or if the job is cancelled during processing, it can be restarted later from the last checkpointed position.
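The commit and rollback behavior in this flow can be simulated with an in-memory queue. This is a toy model under stated assumptions: `SketchTxQueue` and `SketchOrderJob` are hypothetical, no JMS or JTA API is used, and the two queues are committed one after the other, whereas a real global transaction would commit them atomically.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** In-memory stand-in for a transactional message queue. */
class SketchTxQueue {
    private final Deque<String> committed = new ArrayDeque<>();
    private final List<String> takenInTx = new ArrayList<>();  // received, not yet committed
    private final List<String> putInTx = new ArrayList<>();    // sent, not yet committed

    SketchTxQueue(List<String> initial) {
        committed.addAll(initial);
    }

    String receive() {
        String message = committed.pollFirst();
        if (message != null) {
            takenInTx.add(message);
        }
        return message;
    }

    void send(String message) {
        putInTx.add(message);
    }

    /** Makes receives permanent and sends visible. */
    void commit() {
        takenInTx.clear();
        committed.addAll(putInTx);
        putInTx.clear();
    }

    /** Returns uncommitted receives to the head of the queue and drops uncommitted sends. */
    void rollback() {
        for (int i = takenInTx.size() - 1; i >= 0; i--) {
            committed.addFirst(takenInTx.get(i));
        }
        takenInTx.clear();
        putInTx.clear();
    }

    int depth() {
        return committed.size();
    }
}

class SketchOrderJob {
    /**
     * Processes orders, checkpointing every recordsPerCheckpoint records.
     * If failAt >= 0, a failure is simulated at that 0-based record index.
     * Returns the number of records whose processing was committed.
     */
    static int run(SketchTxQueue in, SketchTxQueue out, int recordsPerCheckpoint, int failAt) {
        int processed = 0;
        int committedCount = 0;
        int sinceCheckpoint = 0;
        try {
            String order;
            while ((order = in.receive()) != null) {
                if (processed == failAt) {
                    throw new IllegalStateException("simulated failure");
                }
                out.send("confirmed:" + order);
                processed++;
                if (++sinceCheckpoint == recordsPerCheckpoint) {
                    in.commit();
                    out.commit();
                    committedCount = processed;
                    sinceCheckpoint = 0;
                }
            }
            in.commit();    // final checkpoint at end of input
            out.commit();
            committedCount = processed;
        } catch (IllegalStateException failure) {
            in.rollback();  // uncommitted orders go back on the input queue
            out.rollback(); // uncommitted confirmations are discarded
        }
        return committedCount;
    }
}
```

With five orders, a checkpoint every two records, and a failure at the fourth record, the first two confirmations stay committed and the remaining three orders are back on the input queue. Rerunning the job without the failure then processes only those three.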
In addition to the advantages listed in the previous scenario, WebSphere Extended Deployment has the following advantages:
- Faster purchase order processing through the application placement capabilities of the long-running scheduler.
- Full utilization of idle servers during the daytime, which means a better return on hardware investment.
- Recoverable message processing, since JMS is transaction aware and positional BDSs can be processed from saved checkpoints.
- Flexible job scheduling choices: you can use the interval-based scheduling features of the LRS or write custom triggers to kick off jobs.
- Integrated management of an organization's online and batch applications from one product, which leads to cost savings derived from not having to maintain two separate workload infrastructures and two distinct highly specialized skill sets in the organization.
WebSphere Extended Deployment provides a robust infrastructure and modern programming model for recoverable batch processing that lets you re-use the existing services of an enterprise in a batch application.
Extended Deployment Business Grid Programming Guide
WebSphere Extended Deployment Information Center
Snehal is a consultant for IBM Software Services (ISSW), supporting large scale enterprise customers with WebSphere Application Server and WebSphere Extended Deployment projects for distributed and z/OS systems. Snehal is based out of the IBM Poughkeepsie lab and his prior experience includes development for Web services, Application Versioning, Security, EJB Container, and Performance for WebSphere Application Server for z/OS and WebSphere Extended Deployment.