Modernized Java-based batch processing in WebSphere Application Server, Part 3: Enterprise batch processing

Previous installments of this series introduced the Modern Batch feature of IBM® WebSphere® Application Server and described the compute-intensive and transactional batch programming models. The latest release of WebSphere Application Server, Version 8.5, builds on these Modern Batch features to provide a comprehensive batch platform by including all the features that had previously been part of IBM WebSphere Compute Grid. These enhancements enable you to build an enterprise batch infrastructure without adding any other software. This article details advanced WebSphere Application Server batch features, including parallel processing, skip record processing, and retry step processing, that provide resilient and optimized enterprise batch solutions. This content is part of the IBM WebSphere Developer Technical Journal.

Shishir Narain (nshishir@in.ibm.com), IT Architect, IBM

Shishir Narain is an Open Group certified Master IT Specialist with mature skills in IBM middleware products. He works in IBM Lab Services for WebSphere at India Software Labs, Gurgaon. He has 14 years of experience developing solutions for multiple clients. He has a Master of Technology degree from the Indian Institute of Technology, Kanpur.



Shashi B. Pahwa, IT Architect

Shashi Pahwa is a Sun Certified J2EE Architect with 11 years of experience using IBM Rational Software Architect, UML 2, BIRT, and plug-in development. He holds a Master of Computer Science degree from Maharishi Dayanand University, Rohtak, India.



Deepak Sumani (deepak.sumani@in.ibm.com), IT Architect, IBM China

Deepak Sumani works for IBM Global Services, India as an IT Architect. He has over 12 years of IT experience in designing and developing SOMA-based solutions. His expertise is primarily in SOA and Java/J2EE-based technologies using Rational and WebSphere products, specializing in WebSphere BPM Solutions and Practice.



24 October 2012


Introduction

IBM WebSphere Application Server V8 added a new container for batch processing that provides an environment for the execution of Java™ EE-based batch applications. This new batch container offers comprehensive features that make it suitable as an enterprise batch infrastructure provider. The Feature Pack for Modern Batch, available with WebSphere Application Server V7, was a good starting point because it provided consistent programming models and tools, but it did not contain the advanced batch capabilities required in enterprise settings. WebSphere Application Server V8.5 fills these gaps by offering:

  • Parallel processing of the batch load across the enterprise infrastructure. An enterprise application server environment consists of multiple servers working together to provide a high performance and highly available infrastructure. Simple batch processing runs on a single server, so it is unable to optimally utilize the available application server infrastructure. WebSphere Application Server’s Modern Batch feature provides a parallel job manager that enables container-managed parallel processing, thus enabling the batch job to be divided across the various servers for efficiency. Running the job as “one job” from a batch perspective provides operational control, while splitting the task internally across the servers ensures optimal utilization of the hardware resources and also reduces the time taken to complete the task.
  • Skip record processing enables a batch program to omit “bad” records during processing. This feature is essential in processing; bad data can occur in real scenarios, and failing the entire batch due to a few bad records can have a chain reaction across the batch program schedule. Failing a batch also forces manual intervention before any further processing can be done. In most scenarios, the more desirable action is for the application to log the incident and continue processing the “good” records after the “bad” ones have been skipped.
  • Retry step processing can be used to re-run a failed job step within the same job run. This is useful for handling transient exceptions raised while a job step is being processed. If an exception occurs while a job step is executing and retry step processing is enabled, the job step is ended and all the resources are returned to the state they were in when the job step started. The job step is then tried again.
  • Integration support with a variety of enterprise schedulers, such as IBM Tivoli® Workload Scheduler.
  • COBOL support, which enables you to reuse COBOL modules in WebSphere batch applications.

This article discusses some of these key capabilities, which enable batch developers to focus on tackling business problems rather than having to build a custom batch infrastructure.


Technical concepts

Before jumping into the sample implementation, there are some details you need to know about these key capabilities.

Skip record processing

The skip record processing feature can be used while reading or writing. It is defined for a batch data stream and requires these additional constructs:

  • xJCL properties
    • com.ibm.batch.bds.skip.count: A non-zero value for this property tells the batch container to skip “bad” records. The count specifies the number of records that can be skipped, after which the processing is interrupted.
    • com.ibm.batch.bds.skip.include.exception.class.<n>: The “bad” records generate exceptions. This property is used to define the exception classes that can be skipped.
    • com.ibm.batch.bds.skip.exclude.exception.class.<n>: Used to define the name of exception classes that cannot be skipped. Be aware that this property and the previous one, com.ibm.batch.bds.skip.include.exception.class.<n>, are mutually exclusive and should not be defined together.
  • Skip listeners: These are used to trigger actions every time a record is skipped.
  • Record metrics: The number of skipped records and the record processing rate are kept by the batch framework, and they can be used to report the metrics for the batch job execution.

In the exercise presented here, you will implement skip record processing while reading the data from the input file. Figure 1 shows a simplified flowchart on how skip record processing happens while reading.

Figure 1. Skip record processing flow (while reading)

See Resources for more about skip record processing.
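The decision flow in Figure 1 can be approximated with a small simulation that runs outside the batch container. This is only a sketch under stated assumptions, not the container's implementation: the class SkipPolicySketch and its methods are hypothetical names, the skip limit stands in for com.ibm.batch.bds.skip.count, and the list of classes stands in for the com.ibm.batch.bds.skip.include.exception.class.<n> properties.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, container-independent model of the skip decision (not an IBM API).
class SkipPolicySketch {
    private final int skipLimit;          // stands in for com.ibm.batch.bds.skip.count
    private final Class<?>[] skippable;   // stands in for ...skip.include.exception.class.<n>
    private int skipped = 0;

    SkipPolicySketch(int skipLimit, Class<?>... skippable) {
        this.skipLimit = skipLimit;
        this.skippable = skippable;
    }

    // Returns true if the record that raised 'e' may be skipped.
    // Returns false when the limit is used up or the exception is not on the include list.
    boolean trySkip(Exception e) {
        if (skipped >= skipLimit) {
            return false;
        }
        for (Class<?> c : skippable) {
            if (c.isInstance(e)) {
                skipped++;
                return true;
            }
        }
        return false;
    }

    int skippedCount() {
        return skipped;
    }

    // Reads integer transaction IDs and skips unparsable ones while the policy allows it.
    static List<Integer> readAll(List<String> rawIds, SkipPolicySketch policy) {
        List<Integer> good = new ArrayList<>();
        for (String raw : rawIds) {
            try {
                good.add(Integer.parseInt(raw));
            } catch (NumberFormatException e) {
                if (!policy.trySkip(e)) {
                    throw e; // limit exceeded: the read fails instead of skipping
                }
            }
        }
        return good;
    }
}
```

With a limit of 2, the sketch drops the first two unparsable IDs and rethrows the exception on the third, which mirrors the "skip up to the count, then interrupt processing" behavior described for the skip count property.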

Retry step processing

The retry step processing feature is defined for a job step that has its own retry-step policy configuration. You enable retry-step processing by specifying a non-zero value for the com.ibm.batch.step.retry.count job step property in the xJCL.

Retry step processing requires these additional constructs:

  • xJCL properties
    • com.ibm.batch.step.retry.count: A non-zero value for this property tells the batch container to retry the job step in case of failure.
    • com.ibm.batch.step.retry.delay.time: This value specifies the number of milliseconds to wait before trying the step again.
    • com.ibm.batch.step.retry.include.exception.class.<n>: This property is used to define the exceptions that can be tried again when a step fails.
    • com.ibm.batch.step.retry.exclude.exception.class.<n>: Used to define what exceptions cannot be tried again when a step fails. Be aware that this property and previous one, com.ibm.batch.step.retry.include.exception.class.<n>, are mutually exclusive and should not be defined together.
  • Retry listeners

    The retry listener can be registered with the JobStepContext method to listen for exceptions that should be tried again. The retry listener receives control whenever an exception that can be tried again occurs and the step is tried again.
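To make the interaction of the retry count, the delay, and the include list concrete, here is a minimal sketch of a retry loop. It is an illustration only: RetrySketch and runWithRetry are hypothetical names rather than container APIs, and the sketch does not roll resources back to their state at the start of the step, which the batch container does before each retry.

```java
import java.util.concurrent.Callable;

// Hypothetical model of retry-step semantics (not an IBM API).
class RetrySketch {
    // Runs 'step'. If it throws an exception that is an instance of one of the
    // 'retryable' classes, waits delayMillis and runs it again, allowing at most
    // retryCount retries. Any other exception, or one more failure after the
    // retries are used up, is rethrown to the caller.
    static <T> T runWithRetry(Callable<T> step, int retryCount, long delayMillis,
                              Class<?>... retryable) throws Exception {
        int retriesUsed = 0;
        while (true) {
            try {
                return step.call();
            } catch (Exception e) {
                boolean canRetry = false;
                for (Class<?> c : retryable) {
                    if (c.isInstance(e)) {
                        canRetry = true;
                        break;
                    }
                }
                if (!canRetry || retriesUsed >= retryCount) {
                    throw e;
                }
                retriesUsed++;
                Thread.sleep(delayMillis); // stands in for com.ibm.batch.step.retry.delay.time
            }
        }
    }
}
```

In this model a retry count of 2 permits up to three executions of the step in total: the original attempt plus two retries.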

Parallel job management

The batch container determines that the job is to be run in parallel mode through the run property of the xJCL. The parallel job manager (PJM) component of the batch container is used for creating and managing the subordinate jobs. PJM uses these APIs:

  • Parameterizer: Used to divide the job into subordinate jobs. This enables you to add a new property to the subordinate job xJCL that can be used to partition the workload. This ensures that the subordinate jobs work on different records.
  • LogicalTX.Synchronization: Called to control the job during the lifecycle of the parallel job.
  • SubJobCollector: Collects the state information about the individual subordinate job during checkpoints.
  • SubJobAnalyzer: Used to determine the overall return code of the job based on the data from SubJobCollector.

The rest of the programming model is similar to the transactional batch programming model, and uses the input reader, batch processor, and the output writer.

Figure 2 captures how the job is run in parallel across servers using the parallel job manager and the scheduler.

Figure 2. Parallel job management with WebSphere Application Server

See Resources to learn more about parallel job management.
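The division of labor among the parameterizer, the subordinate jobs, the collector, and the analyzer can be pictured with a small thread-based model. Everything in this sketch is hypothetical: the class and method names are not the WebSphere interfaces, threads stand in for servers, the return codes 0 and 8 are illustrative, and the "highest return code wins" rule in analyze is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical model of the parallel job manager roles (not the WebSphere API).
class ParallelJobSketch {
    // Parameterizer role: one Properties object per subordinate job.
    static Properties[] parameterize(int jobCount) {
        Properties[] props = new Properties[jobCount];
        for (int i = 0; i < jobCount; i++) {
            props[i] = new Properties();
            props[i].setProperty("JOB", String.valueOf(i + 1));
        }
        return props;
    }

    // Stand-in for a subordinate job: 0 means success, 8 means failure (illustrative codes).
    static int runSubJob(Properties p) {
        return p.getProperty("JOB") != null ? 0 : 8;
    }

    // Analyzer role (assumed rule): the highest subordinate return code becomes the overall code.
    static int analyze(List<Integer> returnCodes) {
        int overall = 0;
        for (int rc : returnCodes) {
            overall = Math.max(overall, rc);
        }
        return overall;
    }

    // Runs all subordinate jobs concurrently, collects their return codes, and analyzes them.
    static int runParallel(int jobCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(jobCount);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (Properties p : parameterize(jobCount)) {
                Callable<Integer> task = () -> runSubJob(p);
                futures.add(pool.submit(task));
            }
            List<Integer> collected = new ArrayList<>(); // collector role
            for (Future<Integer> f : futures) {
                collected.add(f.get());
            }
            return analyze(collected);
        } finally {
            pool.shutdown();
        }
    }
}
```

From the operator's point of view there is still one job with one return code, even though the work ran as several subordinate jobs.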

With this background, you will create a batch job that enables skip record processing and parallel job processing.


Sample business scenario

Any realistic scenario might involve the processing of record data, but to maintain focus on the Modern Batch feature the simplified example presented here does not deal with any data processing.

Let’s consider a simple case in which you need a batch program that scans record data in plain text from a file, and then inserts these records into a database.

Your batch program should:

  • Allow for taking intermediate checkpoints.
  • Split the processing into smaller parts and run on multiple servers concurrently, so that it utilizes the available hardware and minimizes the batch run time.
  • Skip “non-viable” records without affecting the processing, both while reading and writing.
  • Retry the processing, in case you get a transient error.

The sample input file, included with this article, has records containing transaction ID, customer ID, and the amount paid by the customer for an item or service purchased. All the data values are integers separated by the “|” character. For the purpose of this example, the batch program should accommodate these rules:

  • If the input transaction ID is not an integer, or if it is more than 20000, the record is to be skipped.
  • If more than five bad records are encountered, the batch program should abort.
  • The output data has the restriction that the length of the customer ID cannot be greater than 30 characters.
  • The program should check for the existence of a lock file. If the file is not present, the job step should be retried up to two times, with a 30-second wait before each retry.

You will develop a batch program that meets all the above objectives.
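The record rules above translate into a few lines of validation code. The sketch below is not the code from the download: PaymentRecordRules is a hypothetical class, its nested InvalidDataException is a local stand-in for the sample's com.ibm.dw.payment.batch.error.InvalidDataException, and the check for exactly three fields is an added assumption.

```java
// Hypothetical validation helper illustrating the rules listed above.
class PaymentRecordRules {
    // Local stand-in for the sample's InvalidDataException (a runtime exception).
    static class InvalidDataException extends RuntimeException {
        InvalidDataException(String message) {
            super(message);
        }
    }

    static final int MAX_TRANSACTION_ID = 20000;
    static final int MAX_CUSTOMER_ID_LENGTH = 30;

    // Parses "transactionId|customerId|amount" and applies the read-side rules.
    // A non-integer transaction ID surfaces as java.lang.NumberFormatException.
    static String[] parse(String line) {
        String[] fields = line.split("\\|");
        if (fields.length != 3) {
            // Assumption for this sketch: a record has exactly three fields.
            throw new InvalidDataException("expected 3 fields but found " + fields.length);
        }
        int transactionId = Integer.parseInt(fields[0].trim());
        if (transactionId > MAX_TRANSACTION_ID) {
            throw new InvalidDataException("trans id is greater than " + MAX_TRANSACTION_ID);
        }
        return fields;
    }

    // Write-side rule: the customer ID must fit in 30 characters.
    static void checkCustomerId(String customerId) {
        if (customerId.length() > MAX_CUSTOMER_ID_LENGTH) {
            throw new InvalidDataException("Cust Id length is greater than " + MAX_CUSTOMER_ID_LENGTH);
        }
    }
}
```

Both exception types are unchecked, so either one can propagate out of a reader or writer and be matched against the skip include list by class name.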

This exercise assumes the following prerequisite software is installed and available:

  • IBM WebSphere Application Server V8.5 (or later) Network Deployment
  • IBM Rational® Application Developer 8.5 (or later)
  • IBM DB2® 9 (or later)

Follow these steps to create the environment for the exercise:

  1. Create a database called LRSCHED for the job scheduler and the grid endpoint on DB2. Create the data source for this database on the cell level from the admin console of the deployment manager. (See the WebSphere Application Server V8.5 Information Center and search for the topic "Creating the job scheduler and grid endpoint database" to learn more.)
  2. Use the WebSphere Customization Toolbox to create a profile for the deployment manager and the job scheduler. On the Environment Selection panel, select the Cell (Deployment manager and a federated application server) option to create a deployment manager and application server, server1, which will serve as the job scheduler. (To keep the configuration simple, do not enable security on any of the servers.)
  3. Start the deployment manager and the node agent of the application server, server1.
  4. Use the WebSphere Customization Toolbox to create a profile for the grid endpoints. On the Environment Selection screen, select the Custom option and provide details of the deployment manager created in step 2 to ensure federation of this profile.
  5. Create a cluster in the custom profile and then add the two servers, server2 and server3, to this cluster. Start the node agent for this profile.
  6. Configure the job scheduler on server1 with a DB2 data source. Search the WebSphere Application Server V8.5 Information Center for the topic "Configuring the job scheduler."
  7. Configure the grid endpoints so that they use the DB2 data source. Search the WebSphere Application Server V8.5 Information Center for the topic "Configuring WebSphere grid endpoints."
  8. Start server1, server2, and server3 from the deployment manager admin console.
  9. Create the database PAYMENTS, which will be used to store the payment data from the file. The command can be found in the create_database file, available in the download materials.

The above configurations will result in the layout shown in Figure 3.

Figure 3. Infrastructure layout used for sample

Building the batch program

As you did in Part 2 of this series, you will use Rational Application Developer to create a batch project and add a batch job to it:

  1. Use the Create a new batch project wizard to develop the sample job. As you proceed through the wizard steps, enter or select the values for fields and properties listed below on the corresponding dialogs.
    1. Enter information about the new batch job, as shown in Figure 4:
      • Batch Project = dwSamplePartThree
      • Job Name = SkipAndParallelJob
      Figure 4. Creating the batch job
      Click Next.
    2. Enter job step details, as shown in Figure 5:
      • Name = SkipAndParallelStep
      • Pattern = Custom
      • Implementation Class = com.ibm.dw.payment.batch.processor.PaymentJobStep
      • Checkpoint Algorithm = chkpt
        Pattern = Record based
        TransactionTimeOut=120
        recordcount = ${checkpoint}
      • Result Algorithm = jobsum
        Pattern = Job Sum

      Add these properties:

      • Property=BATCHRECORDPROCESSOR, value=com.ibm.dw.payment.batch.processor.PaymentProcessor
      • Property=max_record_count, value=${numberRecords}
      • Property=EnablePerformanceMeasurement, value=true
      • Property=debug, value=${debugEnabled}
      Figure 5. Creating the batch job step
      Click Next.
    3. Provide input data details, as shown in Figure 6:
      • Name = inputStream
      • Select Pattern = Text File Reader
      • Implementation Class = com.ibm.dw.payment.batch.bds.PaymentFileReader
      • FILENAME=${inputPaymentFile}
      Figure 6. Providing details for input
      Click Next.
    4. Provide output details, as shown in Figure 7:
      • Name = outputStream
      • Select Pattern = JDBC Writer
      • Implementation Class = com.ibm.dw.payment.batch.bds.PaymentDBWriter

      Add these optional properties:

      • Property=ds_jndi_name, value=jdbc/dw
      • Property=debug, value=${debugEnabled}
      • Property=batch_interval, value=${numberRecords}
      Figure 7. Providing details for output
      Click Finish to complete the batch job creation using the wizard.
    5. Next, you need to add a few substitution variables. Open the xJCL in the xJCL editor and add these substitution properties, as shown in Figure 8:
      • Property=inputPaymentFile, value=C:/batch/data/InputData.txt (or the location where you have kept the data file)
      • Property=numberRecords, value=20
      • Property=checkpoint, value=50
      • Property=debugEnabled, value=false
      Figure 8. Adding substitution variables
  2. To enable skip record processing while reading the input file, add the lines in Listing 1 to the input batch data stream section of the xJCL. The first line of the listing, the FILENAME property, is already in the file; add the remaining lines after it.
    Listing 1
    <prop name="FILENAME" value="${inputPaymentFile}"/>
    <prop name="com.ibm.batch.bds.skip.count" value="5" />
    <prop name="com.ibm.batch.bds.skip.include.exception.class.1"
        value="java.lang.NumberFormatException" />
    <prop name="com.ibm.batch.bds.skip.include.exception.class.2"
        value="com.ibm.dw.payment.batch.error.InvalidDataException" />

    If a record raises one of the listed exception classes, the record is discarded and normal processing continues, for up to five such records. Once the limit of five skips is exceeded, the processing stops, signaling that the batch has exceeded the maximum number of allowed skips. The NumberFormatException is raised when the transaction ID is not an integer. To enforce a custom validation, you have to throw a runtime exception. The sample does this with InvalidDataException, which is thrown when the transaction ID is greater than 20,000.

    The skip listener class, com.ibm.dw.payment.batch.process.PaymentSkipListener, simply logs the skip event in the log file.

  3. To enable skip record processing while writing the output to the database, add the lines in Listing 2 to the output batch data stream section of the xJCL. The first two lines of the listing, the ds_jndi_name and debug properties, are already in the file; add the remaining lines after them.
    Listing 2
    <prop name="ds_jndi_name" value="jdbc/dw"/>
    <prop name="debug" value="${debugEnabled}"/>
    <prop name="com.ibm.batch.bds.skip.count" value="2" />
    <prop name="com.ibm.batch.bds.skip.include.exception.class.1"
        value="com.ibm.dw.payment.batch.error.InvalidDataException" />

    If the customer ID is longer than 30 characters, the writer throws the InvalidDataException runtime exception so that the record is skipped. The skip listener class, com.ibm.dw.payment.batch.process.PaymentSkipListener, simply logs the skip event in the log file.

  4. Implementing parallel job management requires a few more changes. These changes indicate to the batch container that the job needs to be run in parallel mode, and provide the classes that the parallel job manager requires. Add the code shown in Listing 3 below the job step class name.
    Listing 3
    <classname>com.ibm.dw.payment.batch.processor.PaymentJobStep</classname>
    <run instances="multiple" jvm="multiple">
        <props>
            <prop name="com.ibm.websphere.batch.parallel.parameterizer"
                value="com.ibm.dw.payment.batch.parallel.PaymentParameterizer"/>
            <prop name="com.ibm.websphere.batch.parallel.synchronization"
                value="com.ibm.dw.payment.batch.parallel.PaymentTXSynchronization"/>
            <prop name="com.ibm.websphere.batch.parallel.subjobanalyzer"
                value="com.ibm.dw.payment.batch.parallel.PaymentSubJobAnalyzer"/>
            <prop name="com.ibm.websphere.batch.parallel.subjobcollector"
                value="com.ibm.dw.payment.batch.parallel.PaymentSubJobCollector"/>
            <prop name="com.ibm.wsspi.batch.parallel.subjob.name"
                value="PaymentSubJob" />
            <prop name="parallel.jobcount" value="2" />
        </props>
    </run>
  5. Add the three step level properties shown in Listing 4 to the record processor step of the batch job. These properties are modified at job submission time by the ParallelJobManager.
    Listing 4
    <props>
        <prop name="BATCHRECORDPROCESSOR"
            value="com.ibm.dw.payment.batch.processor.PaymentProcessor"/>
        ...
        ...
        <prop name="com.ibm.wsspi.batch.parallel.jobname"
            value="${parallel.jobname}"/>
        <prop name="com.ibm.wsspi.batch.parallel.logicalTXID"
            value="${logicalTXID}"/>
        <prop name="com.ibm.wsspi.batch.parallel.jobmanager"
            value="${parallel.jobmanager}"/>
        <prop name="job" value="${JOB}"/>
    </props>
    <results-ref name="jobsum"/>
  6. To add the retry capability, add the properties shown in Listing 5. The job step checks for the existence of a lock file; with these properties, if the file is missing, the step is retried up to two times, with a 30-second wait before each retry.
    Listing 5
    <prop name="job" value="${JOB}"/>
    <prop name="filepath" value="C:/Shishir/check.txt"/>
    <prop name="com.ibm.batch.step.retry.count" value="2" />
    <prop name="com.ibm.batch.step.retry.delay.time" value="30000" />
    <prop name="com.ibm.batch.step.retry.include.exception.class.1"
        value="com.ibm.dw.payment.batch.error.MissingFileException" />
  7. Replace the implementation classes with the ones included with this article for download. While most of the code is straightforward, let’s take a closer look at these sections:
    • PaymentParameterizer.java: (Listing 6) Based on the subordinate job count, the payment parameterizer helps in partitioning the data so that the different servers can process the records in parallel. For this example, you have two endpoint servers, so the job is split in two. The parameterization strategy should be chosen to suit the actual scenario.
      Listing 6
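      // One Properties object per subordinate job
      Properties[] newprops = new Properties[jobcount];
      for (int i = 0; i < jobcount; i++) {
          newprops[i] = new Properties();
          if (i == 0) {
              // First subordinate job: JOB=1 (odd transaction IDs, see Listing 7)
              newprops[i].put("JOB", "1");
          } else {
              // Any other subordinate job: JOB=2 (even transaction IDs, see Listing 7)
              newprops[i].put("JOB", "2");
          }
      }
      // Hand the per-job properties back to the parallel job manager
      parms.setSubJobProperties(newprops);
      return parms;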
    • PaymentJobStep.java: (Listing 7) The processJobStep method ensures that each subordinate job processes only the records whose transaction ID is even (job 2) or odd (job 1), so the two jobs work on disjoint sets of records in parallel.
      Listing 7
      if (jobNumber == 2) {
          if (Integer.parseInt(paymentRecord.get_transactionId()) % 2 == 0) {
              paymentRecord.set_jobName(getJobID());
              _outputDBBDS.write(paymentRecord);
              logger.fine("read record from inputPaymentBDS: "
                      + paymentRecord + " job step id"
                      + getJobID());
              _recordCountJob2++;
          }
      }

      if (jobNumber == 1) {
          if (Integer.parseInt(paymentRecord.get_transactionId()) % 2 != 0) {
              paymentRecord.set_jobName(getJobID());
              _outputDBBDS.write(paymentRecord);
              logger.fine("read record from inputPaymentBDS: "
                      + paymentRecord + " job step id"
                      + getJobID());
              _recordCountJob1++;
          }
      }
  8. Run the create_table.sql script (available in the download materials) against the database to create the tables where the payment data will be kept.

This completes the development of the sample batch job.
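Listings 6 and 7 hard-code two subordinate jobs. If you later raise parallel.jobcount, one possible way to generalize the even/odd split is to assign each record to a job by the remainder of its transaction ID. The following is only an illustrative sketch: PartitionSketch, ownerOf, and shouldProcess are hypothetical names that do not appear in the download, and the mapping is chosen so that it reproduces Listing 7 when the job count is 2.

```java
// Hypothetical generalization of the even/odd partitioning in Listing 7.
class PartitionSketch {
    // Maps a transaction ID to a job number in the range 1..jobCount.
    // For jobCount == 2 this gives job 1 for odd IDs and job 2 for even IDs.
    static int ownerOf(int transactionId, int jobCount) {
        int remainder = Math.floorMod(transactionId, jobCount); // never negative
        return remainder == 0 ? jobCount : remainder;
    }

    // True if the subordinate job with this number should process the record.
    static boolean shouldProcess(int transactionId, int jobNumber, int jobCount) {
        return ownerOf(transactionId, jobCount) == jobNumber;
    }
}
```

Because every ID maps to exactly one job number, no record is processed twice or left out, whatever the job count. A remainder-based split balances the load only when IDs are spread evenly; a range-based split may suit sorted input better.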


Running the sample

To run the sample, submit the job control xJCL file to the job scheduler using the job management console.

Testing the retry capability

To verify the retry capability, make sure that the file named in <prop name="filepath" value="C:/Shishir/check.txt"/> is absent when the job starts, and then create it about 10 seconds later. The first attempt fails because the file is missing, and the step is retried because you have enabled the retry capability. When the retry begins after the 30-second delay, the step finds the file and the job continues normally. The log (Listing 8) shows the retry starting 30 seconds after the failure.

Listing 8
System.out: [09/12/12 21:04:22:116 EDT] com.ibm.dw.payment.batch.processor.
PaymentJobStep:caught exception in processJobStep; com.ibm.dw.payment.batch.
error.MissingFileException: file not found
file not found
CWLRB2280E: [Long Running Job Container step execution failed]
[Job SkipAndParallelJob:00237:00239] [Step SkipAndParallelStep]:
com.ibm.dw.payment.batch.error.MissingFileException: file not found
CWLRB2460I: [09/12/12 21:04:22:118 EDT]  [09/12/12 21:04:22:118 EDT]
Job [SkipAndParallelJob:00237:00239] Step [SkipAndParallelStep] is in step breakdown.
CWLRB5606I: [09/12/12 21:04:22:118 EDT] Destroying job step: SkipAndParallelStep
.................................................
CWLRB5853I: [09/12/12 21:04:52:186 EDT] Retry started for job
SkipAndParallelJob:00237:00239 step SkipAndParallelStep due to error
com.ibm.dw.payment.batch.error.MissingFileException: file not found
CWLRB1860I: [09/12/12 21:04:52:186 EDT] Dispatching Job
[SkipAndParallelJob:00237:00239] Step [SkipAndParallelStep]
CWLRB2420I: [09/12/12 21:04:52:203 EDT]  [09/12/12 21:04:52:203 EDT]
Job [SkipAndParallelJob:00237:00239] Step [SkipAndParallelStep] is in step setup.
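
The MissingFileException in Listing 8 is raised by the job step when the lock file is absent. A check of that kind could look like the sketch below. It is an assumption about the shape of the code, not an excerpt from the download: LockFileCheck and requireFile are hypothetical names, and the nested MissingFileException is a local stand-in for com.ibm.dw.payment.batch.error.MissingFileException, modeled here as a runtime exception.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical lock-file check of the kind that triggers retry-step processing.
class LockFileCheck {
    // Local stand-in for the sample's MissingFileException.
    static class MissingFileException extends RuntimeException {
        MissingFileException(String message) {
            super(message);
        }
    }

    // Throws MissingFileException when no file exists at the given path.
    // In the sample, the path would come from the "filepath" step property.
    static void requireFile(String filepath) {
        Path path = Paths.get(filepath);
        if (!Files.exists(path)) {
            throw new MissingFileException("file not found");
        }
    }
}
```

Because the exception class is named in the retry include list, letting it propagate out of the step is what hands control to the retry logic; catching and swallowing it inside the step would prevent the retry.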

Testing the skip record capability

The input data file has 10000 records and contains the following errors:

  • The customer ID in record 9998 is more than 30 characters.
  • The transaction ID for the 9999th record is not numeric.
  • The transaction ID for the 10000th record is greater than 20000.

You should notice that these records were skipped (Listing 9) and the same should be reported in the logs (Listing 10).

Listing 9
System.out: [09/12/12 21:05:03:342 EDT] record is9998|99099980000000000000000000000000|788
System.out: [09/12/12 21:05:03:342 EDT] com.ibm.dw.payment.batch.processor.
PaymentJobStep:input :processRecord
System.out: [09/12/12 21:05:03:342 EDT] com.ibm.dw.payment.batch.processor.
PaymentProcessor:record is :PaymentRecord:9998|99099980000000000000000000000000|788|null
System.err: [09/12/12 21:05:03:342 EDT] com.ibm.dw.payment.batch.error.
InvalidDataException: Cust Id length is greater than 30System.err: [09/12/12
21:05:03:342 EDT]
System.err: [09/12/12 21:05:03:342 EDT] 	at com.ibm.dw.payment.batch.bds.
PaymentDBWriter.writeRecord(PaymentDBWriter.java:55)System.err: [09/12/12
21:05:03:342 EDT]
.......................
CWLRB5852I: [09/12/12 21:05:03:343 EDT] Record skipped by batch data stream
outputStream in job SkipAndParallelJob:00237:00239 step SkipAndParallelStep due to
error com.ibm.dw.payment.batch.error.InvalidDataException: Cust Id length is greater
than 30
System.out: [09/12/12 21:05:03:343 EDT] com.ibm.dw.payment.batch.processor.
PaymentJobStep:read record from inputPaymentBDS: PaymentRecord:9998|
99099980000000000000000000000000|788|SkipAndParallelJob:00237:00239 job step
idSkipAndParallelJob:00237:00239
System.out: [09/12/12 21:05:03:343 EDT] com.ibm.dw.payment.batch.processor.
PaymentJobStep:RecordMetric data for inputPayment BDS.  Number of skipped
records = 0. Records/second = 7475
System.out: [09/12/12 21:05:03:343 EDT] record is9999A|9909999|842
CWLRB5852I: [09/12/12 21:05:03:343 EDT] Record skipped by batch data stream
inputStream in job SkipAndParallelJob:00237:00239 step SkipAndParallelStep due to error
java.lang.NumberFormatException: For input string: "9999A"
System.out: [09/12/12 21:05:03:344 EDT] record is30000|99010000|542
CWLRB5852I: [09/12/12 21:05:03:344 EDT] Record skipped by batch data stream
inputStream in job SkipAndParallelJob:00237:00239 step SkipAndParallelStep due
to error com.ibm.dw.payment.batch.error.InvalidDataException: trans id is greater
than 20000
.............................
Listing 10
CWLRB5854I: [09/12/12 21:05:03:420 EDT] Job Step [SkipAndParallelJob:00237:00239,
SkipAndParallelStep]: Metric = clock  Value = 00:00:11:098
CWLRB5854I: [09/12/12 21:05:03:421 EDT] Job Step [SkipAndParallelJob:00237:00239,
SkipAndParallelStep]: Metric = retry  Value = 1
CWLRB5844I: [09/12/12 21:05:03:421 EDT] Job Step Batch Data Stream
[SkipAndParallelJob:00237:00239,SkipAndParallelStep,outputStream]: Metric = skip
Value = 1
CWLRB5844I: [09/12/12 21:05:03:423 EDT] Job Step Batch Data Stream
[SkipAndParallelJob:00237:00239,SkipAndParallelStep,outputStream]: Metric = rps
Value = 8,186
CWLRB5844I: [09/12/12 21:05:03:424 EDT] Job Step Batch Data Stream
[SkipAndParallelJob:00237:00239,SkipAndParallelStep,inputStream]: Metric = skip
Value = 2
CWLRB5844I: [09/12/12 21:05:03:424 EDT] Job Step Batch Data Stream
[SkipAndParallelJob:00237:00239,SkipAndParallelStep,inputStream]: Metric = rps
Value = 7,476

The processed records should be inserted in the database. If you query the database, you should notice that all the records except the three “bad” records were inserted in the database. Figure 9 shows the number of records in the table.

Figure 9. Verifying records in database

Testing the parallel job capability

Upon submission, you should notice that two subordinate jobs were created and processed in parallel (Figure 10).

Figure 10. Subordinate jobs

Figure 9 also shows the count of the records processed by each parallel subordinate job.


Conclusion

The Modern Batch feature of WebSphere Application Server provides a robust enterprise batch programming model that enables you to develop batch programs with minimum effort. Because Modern Batch is a part of WebSphere Application Server, the batch infrastructure can be built using the same platform as all other applications, thus enabling the re-use of applications, skills, and tools. This article described advanced features of Modern Batch, such as parallel processing, skip record processing, and retry step processing, and developed a sample application that uses them. The next installment of this series, Part 4, will discuss grouping batch jobs, integration with enterprise schedulers, and other advanced features.


Acknowledgements

The authors thank Sajan Sankaran for reviewing this article and providing invaluable input. The authors would also like to acknowledge that the earlier works of Chris Vignola, Snehal Antani, and Don Bagwell through presentations, forums, and articles helped shape their understanding of Modern Batch.


Download

Description                        Name                          Size
Sample application project files   1210_narain_attachment.zip    122 KB

Resources

Learn

Get products and technologies
