IBM Support

IT07799: OLDER UNPROCESSED SRA JOB RESULTS ARE CAUSING "JSS0051L: SYSTEM ERROR OCCURRED" FOR NEW SRA JOBS.


APAR status

  • Closed as program error.

Error description

  • Web-based GUI "servers" panels are slow to load.  This is caused
    by data server sluggishness brought about by older, unprocessed
    Storage Resource Agent (SRA) job results which still reside on
    the TPC server.
    SRA job logs created on the server contain the error
    "JSS0051L: System error occurred".
    The Data server Schedulerxxxx.log contains the following sequence
    of messages:
    2015-03-08 01:41:01.329-0600 NAD0049I: Running probe on agent
    <agent host name>
    2015-03-08 01:41:02.702-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:41:03.419-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:41:04.121-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:41:04.839-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:41:39.394-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:41:40.221-0600 JSS0021E: Unable to process
    returned job number <old job run id>
    2015-03-08 01:47:01.943-0600 NAD0048E: Probe did not start
    successfully on agent <agent host name>, error code returned =
    -16.
    2015-03-08 01:47:01.943-0600 JSS0046E: the job for computer
    "<agent host name>" in run 31 of Probe
    "user.probename_<new job run id>" could not be started due to
    an agent error
    RECREATE STEPS:  N/A
    ________________________________________________________________
    DB2 Version used for Server:    N/A
    The defect is against component:   5608TPC00
    Server/Manager build/release (TPC):   5.2.4
    
    ________________________________________________________________
    Problem as described by customer:    WebGUI servers panels slow
    to load
    Initial customer impact (low/med/high):   med
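
The message sequence above can be checked for mechanically. The following is a minimal Python sketch, not an IBM tool. It assumes the scheduler log uses the timestamp and message-ID layout shown in the excerpt, that wrapped lines belong to the preceding record, and that probe attempts do not interleave in the log. The function names (parse_records, summarize_probes) are hypothetical.

```python
import re
from datetime import datetime

# Start of a record, e.g. "2015-03-08 01:41:01.329-0600 NAD0049I: Running ..."
# (layout assumed from the excerpt in this APAR).
RECORD = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4}) "
    r"([A-Z]{3}\d{4}[A-Z]):?\s*(.*)$"
)


def parse_records(lines):
    """Yield (timestamp, message_id, text), joining wrapped continuation lines."""
    current = None
    for line in lines:
        match = RECORD.match(line)
        if match:
            if current:
                yield tuple(current)
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f%z")
            current = [stamp, match.group(2), match.group(3).strip()]
        elif current and line.strip():
            current[2] += " " + line.strip()
    if current:
        yield tuple(current)


def summarize_probes(lines):
    """Return one dict per probe attempt (NAD0049I) found in the log lines."""
    attempts = []
    open_attempt = None
    for stamp, msg_id, _text in parse_records(lines):
        if msg_id == "NAD0049I":            # probe started on an agent
            open_attempt = {"started": stamp, "stale_results": 0,
                            "failed": False, "delay_s": None}
            attempts.append(open_attempt)
        elif open_attempt is None:
            continue                        # record outside any probe attempt
        elif msg_id == "JSS0021E":          # old job result could not be processed
            open_attempt["stale_results"] += 1
        elif msg_id == "NAD0048E":          # probe did not start
            open_attempt["failed"] = True
            open_attempt["delay_s"] = (stamp - open_attempt["started"]).total_seconds()
            open_attempt = None
    return attempts


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for attempt in summarize_probes(log):
            print(attempt)
```

Applied to the excerpt above, this reports one probe attempt with six JSS0021E messages and a delay of about 360.6 seconds between NAD0049I and NAD0048E. A failed attempt preceded by JSS0021E messages matches the pattern this APAR describes.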
    

Local fix

  • A patch is available from TPC support upon request.
    

Problem summary

  • ****************************************************************
    * USERS AFFECTED:                                              *
    * Users of TPC Storage Resource Agents.                        *
    ****************************************************************
    * PROBLEM DESCRIPTION:                                         *
    * The Data Server log directory contains SRA job logs          *
    * (normally located only on the SRA side) that contain the     *
    * following error:                                             *
    *                                                              *
    * JSS0051L: System error occurred                              *
    *                                                              *
    * Data Server scheduler log contains multiple errors like the  *
    * ones below:                                                  *
    *                                                              *
    * NAD0006E Exception thrown for method receiveAcknowledge:     *
    * java.net.SocketTimeoutException: Read timed out.             *
    * NAD0048E: Probe did not start successfully on agent <agent   *
    * name>, error code returned = -16.                            *
    * JSS0046E: the job for computer <agent name> in run 69 of     *
    *                          Probe <probe name> could not        *
    *                          be started due to an agent error    *
    ****************************************************************
    * RECOMMENDATION:                                              *
    * Apply fix maintenance when available.                        *
    ****************************************************************
    

Problem conclusion

  • The SRA sent older job results to the Data Server when a new job
    request came in.
    The SRA must acknowledge that it has started processing a new job
    request, but sending those older results delayed the
    acknowledgment, so the Data Server issued the errors listed in
    the Problem Description.
    
    This APAR moves the sending of older job results to a place
    where it does not interfere with new job requests.
    
    The fix for this APAR is targeted for the following maintenance
    package:
    
    | refresh pack | 5.2-TIV-TPC-RP0007 - target August 2015
    
    http://www-01.ibm.com/support/docview.wss?&uid=swg21320822
    
    The target dates for future refresh packs do not represent a
    formal commitment by IBM. The dates are subject to change
    without notice.
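
The ordering problem described in this conclusion can be illustrated with a small simulation. This is a sketch under stated assumptions, not the SRA implementation: the function name handle_job_request, the job ids, the per-result cost, and the acknowledgment timeout are all hypothetical values chosen only to show why acknowledging first avoids the timeout.

```python
def handle_job_request(stale_job_ids, send_cost_s, ack_timeout_s, ack_first):
    """Simulate one new job request arriving at an agent with a backlog.

    Returns (acknowledged_in_time, events), where events is a list of
    (simulated_seconds, description) tuples in the order they happen.
    No real time passes; the clock is simulated.
    """
    clock = 0.0
    events = []

    def send_stale_results():
        nonlocal clock
        for job_id in stale_job_ids:
            clock += send_cost_s  # each old result costs simulated time
            events.append((clock, f"sent old result for job {job_id}"))

    if not ack_first:
        send_stale_results()      # ordering described as the defect
    ack_time = clock
    events.append((ack_time, "acknowledged new job request"))
    if ack_first:
        send_stale_results()      # ordering described as the fix
    return ack_time <= ack_timeout_s, events


if __name__ == "__main__":
    backlog = [101, 102, 103, 104, 105, 106]  # hypothetical old job run ids
    for ack_first in (False, True):
        in_time, events = handle_job_request(backlog, 60.0, 300.0, ack_first)
        print("ack_first =", ack_first, "-> acknowledged in time:", in_time)
        for when, what in events:
            print(f"  t={when:6.1f}s  {what}")
```

With the backlog sent first, the acknowledgment in this model arrives only after every old result has gone out, which is the delay the Data Server reports as a read timeout. With the acknowledgment sent first, the same backlog is still delivered but no longer sits in front of the new request.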
    

Temporary fix

Comments

APAR Information

  • APAR number

    IT07799

  • Reported component name

    TPC

  • Reported component ID

    5608TPC00

  • Reported release

    524

  • Status

    CLOSED PER

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    2015-03-19

  • Closed date

    2015-05-18

  • Last modified date

    2015-05-18

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Modules/Macros

  • SRA
    

Fix information

  • Fixed component name

    TPC

  • Fixed component ID

    5608TPC00

Applicable component levels

  • Product

    IBM Spectrum Control Standard Edition (SSWFB4)

  • Version

    524

  • Platform

    Platform Independent (PF025)

  • Business Unit

    IBM Infrastructure w/TPS (BU058)

  • Line of Business

    Storage (LOB26)

Document Information

Modified date:
05 October 2023