IBM Support

II10181: INFO/MANAGEMENT V6 MULTISYSTEM DATABASE ACCESS (MSDA) USERS HANG

APAR status

  • Closed as canceled.

Error description

  • +--------------------------------------------------------------+
    | Error description abstract:                                  |
    |                                                              |
    | Info/Management V6 Multisystem Database Access (MSDA) users  |
    | hang.  Last updated:  01/27/97 PJL                           |
    +--------------------------------------------------------------+
    | Detailed description of APAR:                                |
    |                                                              |
    | Info/Management V6 MSDA user address spaces can hang due to  |
    | recalling of migrated data sets, SVC dumps being taken by    |
    | the BLX-SP, problems in Info, VSAM, APPC, or VTAM code, or   |
    | MVS performance problems.  Often the operation personnel     |
    | attempt to correct the problem by canceling users or using   |
    | the FORCE command.  This often leads to more problems.  This |
    | Informational APAR explains why some hangs occur, what you   |
    | should do to prevent the hangs, and what documentation Level |
    | 2 will need to diagnose the hang if the reason for the hang  |
    | is not apparent.  Only hangs specific to MSDA are discussed  |
    | here; see Informational APAR II08029 for a discussion of     |
    | hangs that may affect any V6 user, regardless of whether or  |
    | not MSDA is being run.                                       |
    |                                                              |
    +--------------------------------------------------------------+
    | Technical information:                                       |
    |                                                              |
    | What does the BLX-SP do?                                     |
    |                                                              |
    | The BLX-SP handles allocations, de-allocations, physical     |
    | opens and closes for VSAM data sets, operator commands and   |
    | VSAM requests that must be performed in task mode.  The TCBs |
    | that handle these functions are discussed in II08029.  For   |
    | MSDA, there are three more TCBs which handle VSAM buffer     |
    | invalidation and control the APPC conversations used to send |
    | buffer invalidation requests between the sharing systems:    |
    |                                                              |
    | BLXCWTSK Waits for a reply to buffer invalidation requests   |
    |          that were generated by the retry task (BLXCRTSK).   |
    |                                                              |
    | BLXCBIST Sends buffer invalidation requests to partner       |
    |          BLX-SPs, and receives replies when the buffer       |
    |          invalidation is complete.                           |
    |                                                              |
    | BLXCBIRP Receives buffer invalidation requests from partner  |
    |          BLX-SPs, performs buffer invalidation via the VSAM  |
    |          MRKBFR macro and handles VSI updates, and sends     |
    |          replies when these requests have been completed.    |
    |                                                              |
    | All the above tasks have a queue on which requests can be    |
    | placed.  The tasks all run concurrently, but each task is    |
    | synchronous.  Each task processes one request at a time,     |
    | although it can have many requests waiting on its queue.     |
    |                                                              |
    +--------------------------------------------------------------+
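The task model described above (one queue per task, tasks running
concurrently, each task handling one request at a time) can be sketched
roughly as follows.  This is only an illustration in Python, not Info
code: the task names come from the text, while the FIFO queue, the STOP
sentinel, and the run_task and simulate helpers are assumptions made
for the example.

```python
import queue
import threading

# Task names come from the APAR text; everything else is illustrative.
TASK_NAMES = ("BLXCWTSK", "BLXCBIST", "BLXCBIRP")
STOP = None  # hypothetical sentinel that tells a task to end


def run_task(name, requests, handled):
    """Serve one task's queue, strictly one request at a time."""
    while True:
        request = requests.get()  # blocks until a request is queued
        if request is STOP:
            return
        handled.append((name, request))  # stand-in for the real work


def simulate(work):
    """work maps a task name to the list of requests queued for it."""
    queues = {name: queue.Queue() for name in TASK_NAMES}
    handled = []
    threads = [
        threading.Thread(target=run_task, args=(name, queues[name], handled))
        for name in TASK_NAMES
    ]
    for thread in threads:
        thread.start()  # the tasks run concurrently
    for name, requests in work.items():
        for request in requests:
            queues[name].put(request)  # many requests may wait on a queue
    for q in queues.values():
        q.put(STOP)
    for thread in threads:
        thread.join()
    return handled
```

Because each task takes the next request only after finishing the
current one, a task that stalls on one request leaves everything
behind it on that queue waiting as well.

    +--------------------------------------------------------------+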
    | How hangs happen:                                            |
    |                                                              |
    | When running MSDA, a hang most commonly occurs when the      |
    | BLX-SP is waiting on a response from APPC/MVS, which is not  |
    | responding for some reason.  When this happens, BLXCBIST is  |
    | waiting for APPC, and in turn the user address spaces are    |
    | waiting either for an ECB to be directly posted by BLXCBIST  |
    | or for completion of a request that's been queued up for     |
    | BLXCWTSK (which is itself waiting for a post from BLXCBIST). |
    | A user waits on these ECBs just prior to releasing an        |
    | exclusive enqueue on the BLXDASDS resource, and so by        |
    | hanging when this enqueue is still held this user can cause  |
    | all subsequent requests for access to the dataset to hang    |
    | waiting for an enqueue.  Such a hang may be due to a         |
    | temporary network problem on one or more of the sharing      |
    | systems, an APPC or MVS performance problem on one or more   |
    | of the systems, or problems in the APPC or VTAM code (thus   |
    | far there are no known INFO problems that could cause such a |
    | hang).                                                       |
    |                                                              |
    +--------------------------------------------------------------+
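The chain of waits described above can be written down as a simple
"waits on" table and followed to its root.  The sketch below is a
Python illustration only: USER1 and USER2 are hypothetical user
address spaces, and trace_wait_chain is a helper invented for this
example, not an Info or MVS facility.

```python
def trace_wait_chain(waits_on, start):
    """Follow 'waits on' links from start until nothing further is known."""
    chain = [start]
    seen = {start}
    while chain[-1] in waits_on:
        blocker = waits_on[chain[-1]]
        if blocker in seen:  # guard against a circular wait
            break
        chain.append(blocker)
        seen.add(blocker)
    return chain


# Hypothetical snapshot of the hang described in the text.
WAITS_ON = {
    "USER2": "USER1",        # wants the BLXDASDS enqueue that USER1 holds
    "USER1": "BLXCWTSK",     # waits for its queued request to complete
    "BLXCWTSK": "BLXCBIST",  # waits for a post from BLXCBIST
    "BLXCBIST": "APPC/MVS",  # waits for a response from APPC/MVS
}

chain = trace_wait_chain(WAITS_ON, "USER2")
print(" -> ".join(chain))
```

The last entry in the chain is the component to investigate first; in
this snapshot every waiting user traces back to APPC/MVS.

    +--------------------------------------------------------------+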
    | How to prevent the hangs:                                    |
    |                                                              |
    | The following PTFs for Info V6 must be installed to prevent  |
    | known problems that result in users hanging:                 |
    |                                                              |
    |     V6.1  UW16593 UW29712                                    |
    |     V6.2  UW29713                                            |
    |                                                              |
    | In addition, all HIPER maintenance for the following         |
    | products must be installed:                                  |
    |                                                              |
    |     APPC/MVS                                                 |
    |     VTAM                                                     |
    |     MVS/ESA                                                  |
    |     DFSMS                                                    |
    |                                                              |
    | The following actions should be taken:                       |
    |                                                              |
    | 1.  Ensure that the APPC and ASCH started tasks are assigned |
    |     to a performance group that gives them sufficient        |
    |     priority to complete their work.  In general, they       |
    |     should have a priority equivalent to the other started   |
    |     tasks on the system.                                     |
    | 2.  Ensure that the BIAS TP has sufficient priority to       |
    |     accomplish its task by setting up a performance group    |
    |     for it as documented in the APPC Planning book in the    |
    |     "Fine Tuning through SRM Parameters" section of the      |
    |     "APPC/MVS Measurement and Tuning" chapter.               |
    | 3.  Consider making the BLXCBIAS program nonswappable as     |
    |     documented in the MSDA setup chapter of the Info V6      |
    |     Planning and Install Guide.                              |
    +--------------------------------------------------------------+
    | What to do if a hang occurs:                                 |
    |                                                              |
    | Canceling or forcing the BLX-SP and then restarting it       |
    | is often the only relief for a hang.  This should be done    |
    | only after the appropriate documentation has been gathered   |
    | for problem determination, as discussed below.               |
    |                                                              |
    | Sometimes simply stopping and restarting either the APPC     |
    | started task or perhaps a malfunctioning APPC LU on one or   |
    | all systems can clear up temporary network/APPC glitches.    |
    | If this doesn't help, continue with the steps that follow.   |
    |                                                              |
    | If a hang occurs, it is necessary to determine why it        |
    | occurred.                                                    |
    |                                                              |
    | If dumping or recall processing is not the cause, locate a   |
    | user that is waiting, holding an exclusive ENQ.  Attempt to  |
    | cancel that user.  Do not force that user or any user.  If   |
    | that user is in 'must complete' status, then this is not an  |
    | APPC hang; refer to II08029 instead.  If the hang does       |
    | not clear after a period of time, do the following:          |
    |                                                              |
    | 1.  If you have the recommended PTFs installed, then use the |
    |     operator dump command and dump the server, and as best   |
    |     you can tell, the user who is holding the ENQ (not       |
    |     waiting for the ENQ).                                    |
    | 2.  Also dump each BLX-SP that shares datasets with the      |
    |     BLX-SP that the user is hung on.                         |
    | 3.  Dump all of the BIAS jobs on each of the shared systems. |
    | 4.  Issue the "D APPC,TP,ALL" command on each system and     |
    |     save the output.  This is needed to sort out which BIAS  |
    |     jobs are talking to which BLX-SPs.                       |
    | 5.  An APPC component trace should be obtained on each of    |
    |     the shared systems.  To get the trace, do the following  |
    |     on each system in the complex:                           |
    |     a.  On the system console:                               |
    |             TRACE CT,ON,COMP=SYSAPPC                         |
    |     b.  Then you'll get the following message:               |
    |             xx ITT006A SPECIFY OPERAND(S) FOR TRACE CT       |
    |         COMMAND.                                             |
    |     c.  Enter:                                               |
    |             R xx,OPTIONS=(GLOBAL),END                        |
    |     d.  Your trace is now running.  Recreate the hang, and   |
    |         then enter the following:                            |
    |             TRACE CT,OFF,COMP=SYSAPPC                        |
    |     e.  This will stop the trace and cause APPC to issue an  |
    |         SVC Dump.                                            |
    | 6.  Try canceling the BLX-SP.                                |
    | 7.  If the BLX-SP does not shut down after the SHUTDOWN wait |
    |     time, use the FORCE command on the BLX-SP.  Understand   |
    |     before doing so that you may have to IPL.                |
    |                                                              |
    |     NOTE:  There is a chance that data base corruption could |
    |     occur if the BLX-SP is forced.  However, this is a very  |
    |     small risk, and generally the database can be restored   |
    |     successfully by going to the most recent backup and then |
    |     running BLGUT3 to restore the latest updates from the    |
    |     data in the SDLDS.                                       |
    | 8.  After the BLX-SP address space is down, verify that any  |
    |     user address spaces that were hung are no longer hung.   |
    |     Have those users log off TSO.  If there are still some   |
    |     hung users, then force those users off.                  |
    | 9.  Start the BLX-SP back up.  Users can now get back into   |
    |     Info.                                                    |
    |                                                              |
    | If you have the PTFs installed, contact the Support Center   |
    | in your country and open a problem record to have the dumps  |
    | reviewed.  If you do not have the PTFs on, put them on.  If  |
    | you have not done the recommended performance tuning, do it. |
    | If you don't have the dumps then contacting the Support      |
    | Center will only result in a request that you get the dumps. |
    +--------------------------------------------------------------+
    | Search keywords:  INFO RALINFO INFOMAN INFOSYS INFO/MAN      |
    | INFORMATION MANAGEMENT V61 569517100 INFOINFO GT0405         |
    | LVLS/101 LVLS/201 LVLS/301 V62 V63                           |
    +--------------------------------------------------------------+
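
Steps 4 and 5 under "What to do if a hang occurs" have to be repeated
on every sharing system, so it can help to lay them out as a
per-system checklist.  The Python sketch below only assembles the
command text quoted in this APAR; it does not issue any commands.  The
system names, the reply ids, and the appc_doc_commands and checklist
helpers are hypothetical.  The reply id is the "xx" shown in message
ITT006A and is known only after the TRACE CT,ON command is entered.

```python
def appc_doc_commands(reply_id):
    """Console commands, in order, for one system (text from the APAR)."""
    return [
        "D APPC,TP,ALL",                       # step 4: save the output
        "TRACE CT,ON,COMP=SYSAPPC",            # step 5a: start the trace
        f"R {reply_id},OPTIONS=(GLOBAL),END",  # step 5c: reply to ITT006A
        # step 5d: recreate the hang before entering the next command
        "TRACE CT,OFF,COMP=SYSAPPC",           # stops the trace, SVC dump
    ]


def checklist(reply_ids):
    """reply_ids maps a system name to the reply id from ITT006A."""
    return {
        system: appc_doc_commands(reply_id)
        for system, reply_id in reply_ids.items()
    }


# Hypothetical two-system complex with made-up reply ids.
for system, commands in checklist({"SYSA": "05", "SYSB": "12"}).items():
    print(system)
    for command in commands:
        print("   ", command)
```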
    

Local fix

Problem summary

Problem conclusion

Temporary fix

Comments

  • ralinfoapar
    

APAR Information

  • APAR number

    II10181

  • Reported component name

    V2 LIB INFO ITE

  • Reported component ID

    INFOV2LIB

  • Reported release

    001

  • Status

    CLOSED CAN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt

  • Submitted date

    1997-01-27

  • Closed date

    1997-01-27

  • Last modified date

    1997-01-27

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels

Document Information

Modified date:
27 January 1997