PI81527: IN WEBSPHERE V8.5.5, AFTER A LOST DATABASE CONNECTION, WSGRID HANGS AND JOBS REMAIN IN SUBMITTED STATE

Fixes are available

APAR status

Closed as program error.

Error description

In WebSphere v8.5.5, when a database connection is lost and
jobs are submitted via WSGrid, some of the jobs execute
successfully but others remain in submitted state and WSGrid
hangs. After a ripple start of the endpoint cluster, the stuck
jobs run and complete successfully.
The following NullPointerException is reported via an FFDC:
FFDC Exception:java.lang.NullPointerException SourceId:com.ibm.ws.asynchbeans.J2EEContext.run ProbeId:394 Reporter:com.ibm.ws.asynchbeans.J2EEContext@3d62bfb5
java.lang.NullPointerException
 at com.ibm.ws.batch.GridDispatcher._issueRuntimeException(GridDispatcher.java:153)
 at com.ibm.ws.batch.GridDispatcher.dispatch(GridDispatcher.java:143)
 at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl._dispatchGridWork(PortableGridKernelImpl.java:334)
 at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl._dispatchWork(PortableGridKernelImpl.java:314)
 at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl.scheduleJob(PortableGridKernelImpl.java:83)
 at com.ibm.ws.gridcontainer.services.impl.GridWork.run(PGCControllerImpl.java:657)
 at com.ibm.ws.asynchbeans.J2EEContext$RunProxy.run(J2EEContext.java:271)
 at java.security.AccessController.doPrivileged(AccessController.java:399)
 at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:797)
 at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:222)
 at com.ibm.ws.asynchbeans.ABWorkItemImpl.run(ABWorkItemImpl.java:206)
 at java.lang.Thread.run(Thread.java:790)

Local fix

```
Ripple start the endpoint cluster
```

Problem summary

****************************************************************
* USERS AFFECTED:  All users of IBM WebSphere Application      *
*                  Server Java Batch                           *
****************************************************************
* PROBLEM DESCRIPTION: With the loss of a job scheduler        *
*                      database and while Java Batch jobs are  *
*                      dispatched to a Java Batch endpoint,    *
*                      the Java Batch job state remains        *
*                      stuck in SUBMITTED state.               *
****************************************************************
* RECOMMENDATION:                                              *
****************************************************************
When the job scheduler database is lost while Java Batch jobs
are dispatched to a Java Batch endpoint, the Java Batch runtime
causes the Java Batch jobs running at that time to fail and
transitions the failed jobs to RESTARTABLE state. Once the job
scheduler database has been restarted, and while it is
transitioning the jobs to RESTARTABLE state, the Java Batch
runtime encounters stale database connection objects left over
from the pool associated with the failed job scheduler
database. This causes a NullPointerException in
the Java Batch runtime and leaves the jobs stuck in the
SUBMITTED state, unable to be restarted.

Problem conclusion

The Java Batch runtime code was updated to anticipate a
NullPointerException in this case and to take the appropriate
action to ensure jobs in flight are transitioned to
RESTARTABLE state.
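
The general pattern described above can be illustrated with a minimal
sketch. This is not the Java Batch runtime code: every class, method,
and variable name below is hypothetical, and the recovery step (one
retry against a fresh store) is an assumption, since the APAR does not
say what the "appropriate action" is.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative sketch only: all names are hypothetical and are not
// taken from the WebSphere Java Batch runtime.
class RestartableSketch {

    enum JobState { SUBMITTED, RESTARTABLE }

    /** Stand-in for the code that persists a job's state. */
    interface JobStateStore {
        void save(String jobId, JobState state);
    }

    /** In-memory store that keeps the sketch self-contained. */
    static final class InMemoryStore implements JobStateStore {
        final Map<String, JobState> states = new HashMap<>();

        @Override
        public void save(String jobId, JobState state) {
            states.put(jobId, state);
        }
    }

    /** Simulates a store backed by a stale pooled connection. */
    static final class StaleStore implements JobStateStore {
        @Override
        public void save(String jobId, JobState state) {
            throw new NullPointerException("stale connection object");
        }
    }

    /**
     * Tries to record RESTARTABLE through the current store. If that
     * fails with a NullPointerException, retries once through a store
     * obtained from freshStore (assumed recovery action).
     */
    static JobState markRestartable(String jobId,
                                    JobStateStore current,
                                    Supplier<JobStateStore> freshStore) {
        try {
            current.save(jobId, JobState.RESTARTABLE);
        } catch (NullPointerException stale) {
            // Anticipated failure: do not leave the job in SUBMITTED.
            freshStore.get().save(jobId, JobState.RESTARTABLE);
        }
        return JobState.RESTARTABLE;
    }

    public static void main(String[] args) {
        InMemoryStore fresh = new InMemoryStore();
        fresh.states.put("job-1", JobState.SUBMITTED);
        JobState result = markRestartable("job-1", new StaleStore(), () -> fresh);
        System.out.println(result + " / stored: " + fresh.states.get("job-1"));
    }
}
```

In the sketch, the job ends up recorded as RESTARTABLE even though the
first attempt hits the simulated stale connection, which is the
outcome the fix is described as ensuring.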

The fix for this APAR is currently targeted for inclusion in
fix packs 8.5.5.13 and 9.0.0.6.  Please refer to the
Recommended Updates page for delivery information:
http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980

Temporary fix

Comments

APAR Information

APAR number
PI81527
Reported component name
WEBS APP SERV N
Reported component ID
5724H8800
Reported release
850
Status
CLOSED PER
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
2017-05-12
Closed date
2017-10-05
Last modified date
2017-10-05

APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:

Fix information

Fixed component name
WEBS APP SERV N
Fixed component ID
5724H8800

Applicable component levels

R850 PSY
UP
R900 PSY
UP

Document Information

Modified date:
03 May 2022
