IBM Support

PI81527: IN WEBSPHERE V8.5.5, AFTER A LOST DATABASE CONNECTION, WSGRID HANGS AND JOBS REMAIN IN SUBMITTED STATE

Fixes are available

9.0.0.6: WebSphere Application Server traditional V9.0 Fix Pack 6
8.5.5.13: WebSphere Application Server V8.5.5 Fix Pack 13
9.0.0.7: WebSphere Application Server traditional V9.0 Fix Pack 7
9.0.0.8: WebSphere Application Server traditional V9.0 Fix Pack 8
8.5.5.14: WebSphere Application Server V8.5.5 Fix Pack 14
9.0.0.9: WebSphere Application Server traditional V9.0 Fix Pack 9
9.0.0.10: WebSphere Application Server traditional V9.0 Fix Pack 10
8.5.5.15: WebSphere Application Server V8.5.5 Fix Pack 15
9.0.0.11: WebSphere Application Server traditional V9.0 Fix Pack 11
9.0.5.0: WebSphere Application Server traditional Version 9.0.5 Refresh Pack
9.0.5.1: WebSphere Application Server traditional Version 9.0.5 Fix Pack 1
9.0.5.2: WebSphere Application Server traditional Version 9.0.5 Fix Pack 2
8.5.5.17: WebSphere Application Server V8.5.5 Fix Pack 17
9.0.5.3: WebSphere Application Server traditional Version 9.0.5 Fix Pack 3
9.0.5.4: WebSphere Application Server traditional Version 9.0.5 Fix Pack 4
9.0.5.5: WebSphere Application Server traditional Version 9.0.5 Fix Pack 5
9.0.5.6: WebSphere Application Server traditional Version 9.0.5 Fix Pack 6
9.0.5.7: WebSphere Application Server traditional Version 9.0.5 Fix Pack 7
9.0.5.8: WebSphere Application Server traditional Version 9.0.5.8
8.5.5.20: WebSphere Application Server V8.5.5.20
8.5.5.18: WebSphere Application Server V8.5.5 Fix Pack 18
8.5.5.19: WebSphere Application Server V8.5.5 Fix Pack 19
9.0.5.9: WebSphere Application Server traditional Version 9.0.5.9
9.0.5.10: WebSphere Application Server traditional Version 9.0.5.10
8.5.5.16: WebSphere Application Server V8.5.5 Fix Pack 16
8.5.5.21: WebSphere Application Server V8.5.5.21
9.0.5.11: WebSphere Application Server traditional Version 9.0.5.11

APAR status

  • Closed as program error.

Error description

  • In WebSphere V8.5.5, when a database connection is lost while
    jobs are being submitted through WSGrid, some of the jobs run
    successfully, but others remain in the submitted state and
    WSGrid hangs. After a ripple start of the endpoint cluster, the
    stuck jobs run and complete successfully.
    The following NullPointerException is reported via an FFDC:
    FFDC Exception:java.lang.NullPointerException
    SourceId:com.ibm.ws.asynchbeans.J2EEContext.run ProbeId:394
    Reporter:com.ibm.ws.asynchbeans.J2EEContext@3d62bfb5
    java.lang.NullPointerException
     at com.ibm.ws.batch.GridDispatcher._issueRuntimeException(GridDispatcher.java:153)
     at com.ibm.ws.batch.GridDispatcher.dispatch(GridDispatcher.java:143)
     at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl._dispatchGridWork(PortableGridKernelImpl.java:334)
     at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl._dispatchWork(PortableGridKernelImpl.java:314)
     at com.ibm.ws.gridcontainer.impl.PortableGridKernelImpl.scheduleJob(PortableGridKernelImpl.java:83)
     at com.ibm.ws.gridcontainer.services.impl.GridWork.run(PGCControllerImpl.java:657)
     at com.ibm.ws.asynchbeans.J2EEContext$RunProxy.run(J2EEContext.java:271)
     at java.security.AccessController.doPrivileged(AccessController.java:399)
     at com.ibm.ws.asynchbeans.J2EEContext.run(J2EEContext.java:797)
     at com.ibm.ws.asynchbeans.WorkWithExecutionContextImpl.go(WorkWithExecutionContextImpl.java:222)
     at com.ibm.ws.asynchbeans.ABWorkItemImpl.run(ABWorkItemImpl.java:206)
     at java.lang.Thread.run(Thread.java:790)
    

Local fix

  • Ripple start the endpoint cluster to release the stuck jobs.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:  All users of IBM WebSphere Application      *
    *                  Server Java Batch                           *
    ****************************************************************
    * PROBLEM DESCRIPTION: With the loss of a job scheduler        *
    *                      database and while Java Batch jobs are  *
    *                      dispatched to a Java Batch endpoint,    *
    *                      the Java Batch job state remains        *
    *                      stuck in SUBMITTED state.               *
    ****************************************************************
    * RECOMMENDATION:                                              *
    ****************************************************************
    When the job scheduler database is lost while Java Batch jobs
    are being dispatched to a Java Batch endpoint, the Java Batch
    runtime causes the jobs running at that time to fail and
    transitions the failed jobs to RESTARTABLE state. After the job
    scheduler database is restarted, and while the jobs are being
    transitioned to RESTARTABLE state, the Java Batch runtime
    encounters stale database connection objects left over from the
    pool associated with the failed job scheduler database. This
    causes a NullPointerException in the Java Batch runtime and
    leaves the jobs stuck in the SUBMITTED state, unable to be
    restarted.
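
    The stale-connection condition described above can be illustrated
    with plain JDBC. The following is a minimal sketch, not the Java
    Batch runtime's or the WebSphere connection pool's actual logic:
    the class name ConnectionCheck and the method isUsable are
    hypothetical, and only the standard java.sql.Connection methods
    isClosed() and isValid(int) are relied on.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper (not an IBM class) that screens out stale connections. */
final class ConnectionCheck {

    /**
     * Returns true only if the connection is non-null, not closed, and
     * answers the driver's validity probe within timeoutSeconds.
     * Any SQLException raised while probing is treated as "not usable".
     */
    static boolean isUsable(Connection conn, int timeoutSeconds) {
        if (conn == null) {
            return false; // nothing to probe; caller should get a fresh connection
        }
        try {
            return !conn.isClosed() && conn.isValid(timeoutSeconds);
        } catch (SQLException e) {
            return false; // a failing probe is itself a sign of a stale connection
        }
    }
}
```

    A caller would typically discard a connection for which isUsable
    returns false and request a new one from the data source, rather
    than dereferencing results obtained through it.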
    

Problem conclusion

  • The Java Batch runtime code was updated to anticipate a
    NullPointerException in this case and to take the appropriate
    action to ensure jobs in flight are transitioned to
    RESTARTABLE state.
    
    The fix for this APAR is currently targeted for inclusion in
    fix packs 8.5.5.13 and 9.0.0.6.  Please refer to the
    Recommended Updates page for delivery information:
    http://www.ibm.com/support/docview.wss?rs=180&uid=swg27004980
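
    The kind of defensive handling described in this conclusion can be
    sketched as follows. This is an illustration under stated
    assumptions, not the code shipped in the fix: JobState, BatchJob,
    and SafeDispatcher are hypothetical names, and only the SUBMITTED
    and RESTARTABLE state names are taken from this APAR.

```java
/** Hypothetical job states; EXECUTING is an assumed name for a running job. */
enum JobState { SUBMITTED, EXECUTING, RESTARTABLE }

/** Hypothetical stand-in for a dispatched job; not an IBM class. */
final class BatchJob {
    final String id;
    JobState state = JobState.SUBMITTED;

    BatchJob(String id) {
        this.id = id;
    }
}

/** Hypothetical dispatcher wrapper showing the guard around the dispatch step. */
final class SafeDispatcher {

    /**
     * Runs the dispatch step. If it fails with a NullPointerException
     * (for example, because a stale connection yielded a null result),
     * the job is marked RESTARTABLE instead of being left in SUBMITTED.
     */
    static JobState dispatch(BatchJob job, Runnable dispatchStep) {
        try {
            dispatchStep.run();
            job.state = JobState.EXECUTING;
        } catch (NullPointerException e) {
            // Do not let the failure escape and strand the job in SUBMITTED.
            job.state = JobState.RESTARTABLE;
        }
        return job.state;
    }
}
```

    The point of the sketch is that every exit path from the dispatch
    step assigns a definite state, so a job cannot remain in SUBMITTED
    after a failed dispatch.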
    

Temporary fix

Comments

APAR Information

  • APAR number

    PI81527

  • Reported component name

    WEBS APP SERV N

  • Reported component ID

    5724H8800

  • Reported release

    850

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2017-05-12

  • Closed date

    2017-10-05

  • Last modified date

    2017-10-05

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

  • Fixed component name

    WEBS APP SERV N

  • Fixed component ID

    5724H8800

Applicable component levels

  • R850 PSY

       UP

  • R900 PSY

       UP

Document Information

Modified date:
03 May 2022