IBM Support

II08029: INFO/MANAGEMENT BLXSP AND USER HANGS - WHY THEY HAPPEN


APAR status

  • Closed as canceled.

Error description

  • +--------------------------------------------------------------+
    | Error description abstract:                                  |
    |                                                              |
    | Info/Management V6 users hang.  Last updated:  01/27/97 PJL  |
    +--------------------------------------------------------------+
    | Detailed description of APAR:                                |
    |                                                              |
    | Info/Management V6 user address spaces can hang due to       |
    | recalling of migrated data sets, SVC dumps being taken by    |
    | the BLX-SP, problems in Info, VSAM, APPC, or VTAM code, or   |
    | MVS performance problems.  Often operations personnel        |
    | attempt to correct the problem by canceling users or using   |
    | the FORCE command.  This often leads to more problems.  This |
    | Informational APAR explains why some hangs occur, what you   |
    | should do to prevent the hangs, and what documentation Level |
    | 2 will need to diagnose the hang if the reason for the hang  |
    | is not apparent.  This APAR discusses only non-MSDA hangs;   |
    | if you are running with Multisystem Database Access (aka     |
    | MSDA or shared DASD), please refer to Informational APAR     |
    | II10181.                                                     |
    |                                                              |
    +--------------------------------------------------------------+
    | Technical information:                                       |
    |                                                              |
    | What does the BLX-SP do?                                     |
    |                                                              |
    | The BLX-SP handles allocations, de-allocations, physical     |
    | opens and closes for VSAM data sets, operator commands and   |
    | VSAM requests that must be performed in task mode.           |
    |                                                              |
    | The BLX-SP has several tasks (TCBs).  They are:              |
    |                                                              |
    | BLXUCONN Handles all connects and disconnects.  A user in    |
    |          this case is a batch job, interactive user, API or  |
    |          BLG utility.                                        |
    |                                                              |
    | BLXCVTSK Handles all allocation and de-allocation requests   |
    |          for any VSAM data set (SDIDS, SDDS, SDLDS, panels   |
    |          and dictionary) that a user requested.  The VSAM    |
    |          data sets can be LSR or NSR.  The request could     |
    |          also be from an operator command (REALLOC, FREE or  |
    |          ADDVDEF).  BLXCVTSK does all physical opens and     |
    |          closes of VSAM data sets.                           |
    |                                                              |
    | BLXCRTSK Handles all TASK mode RPLs (CA splits and EOV       |
    |          processing).                                        |
    |                                                              |
    | BLXSIN01 Does initialization of the BLX-SP, and runs under   |
    |          control of the JOBSTEP TCB.  It monitors the status |
    |          of all of the BLX-SP sub-tasks and receives         |
    |          operator commands as they are entered.              |
    |                                                              |
    | BLXSTLW  Serializes writing of messages to the Trace/Log.    |
    |                                                              |
    | Three additional TCBs exist when running MSDA which won't be |
    | discussed here.                                              |
    |                                                              |
    | All the above tasks except BLXSIN01 have a queue on which    |
    | requests can be placed.  The tasks all run concurrently, but |
    | each task is synchronous.  Each task will process one        |
    | request at a time; however, they can have many requests      |
    | waiting on their queues.  BLXSIN01 does not have a queue.    |
    | It can be processing one command, and MVS can hold one       |
    | additional command in the command stack.  A third command    |
    | entered before the first command has completed is rejected.  |
    | The operator will get 'Task Busy - Modify Rejected' from     |
    | MVS.                                                         |
    |                                                              |
    +--------------------------------------------------------------+
    | How hangs happen:                                            |
    |                                                              |
    | Assume that a user address space is connected to the BLX-SP  |
    | and needs a record from one of the Info VSAM data sets.      |
    | That user will obtain an ENQ on the data set, issue an MVS   |
    | Status macro to set 'must complete' mode, then do a PC       |
    | instruction to go cross memory to the BLX-SP address space.  |
    | Then the VSAM RPL-based request is processed under the cross |
    | memory user's TCB.  The BLX-SP tasks are unaware of these    |
    | requests.  If the request is to GET a record, and the record |
    | is in an LSR buffer, the user will never actually wait.  If  |
    | the record is not available, VSAM puts the user into a wait. |
    | When the request is processed, VSAM calls a UPAD exit (Info  |
    | supplied).  The UPAD exit posts the user.  The user returns  |
    | back to their home address space and issues another Status   |
    | macro to turn off 'must complete'.  Then the user DEQs from  |
    | the data set.  This is normal processing, and the time the   |
    | user was in 'must complete' mode is very short.              |
    |                                                              |
    | Keep in mind that the entire time a user is in cross memory  |
    | mode they are holding an ENQ and are in MUST COMPLETE mode.  |
    | This means if they hang (are placed into a wait) they cannot |
    | be canceled.  Also, since they hold the ENQ, other (not yet  |
    | in cross memory) users can be put into a wait for that same  |
    | resource (Info uses the WAIT option on the ENQ macro).       |
    | Whether a user waits depends on whether they want shared or  |
    | exclusive use, and on what other ENQs are active for that    |
    | resource.                                                    |
    |                                                              |
    | Example:                                                     |
    |                                                              |
    | 1.  Assume user A has a shared ENQ on the SDDS and is in a   |
    |     cross memory wait for VSAM to get a requested record.    |
    | 2.  User B also wants to get a record.  User B requests and  |
    |     receives a shared ENQ on the SDDS.  User B then goes     |
    |     cross memory and waits for VSAM to complete the request. |
    | 3.  User C requests an exclusive ENQ on the SDDS so that a   |
    |     SDDS record can be updated.  User C waits in his home    |
    |     address space since he must get the exclusive ENQ before |
    |     he can go cross memory.                                  |
    | 4.  User B's request is posted by VSAM (using Info's UPAD    |
    |     exit), and he returns to his home address space and DEQs |
    |     from the SDDS.                                           |
    | 5.  User B does another request that needs a shared ENQ.     |
    |     This time he waits because user C wants the SDDS         |
    |     exclusive. No new shares will be allowed until C's       |
    |     exclusive is processed.  User B is waiting in his home   |
    |     address space.                                           |
    | 6.  All other users, share or exclusive, will wait in their  |
    |     home address space when they attempt to ENQ on the SDDS. |
    |     Any user waiting in their address space for an ENQ can   |
    |     be canceled since they do not enter 'must complete'      |
    |     status until they have the ENQ.                          |
    |                                                              |
    | If user A's request is never posted complete, soon every     |
    | user on the BLX-SP will hang when they attempt to read or    |
    | write to the SDDS.  User C can be canceled, but that will    |
    | only free things up until another user wants the SDDS        |
    | exclusive.  User A can't be canceled, since he is in 'must   |
    | complete' status.  Even if User A is forced, that will       |
    | likely not cure the cause of user A's hang.  VSAM will have  |
    | control blocks pointing to user A, and user A will not be    |
    | removed from any queue in the BLX-SP that he is on.  Also    |
    | the user has obtained various BLX-SP control blocks from a   |
    | pool of control blocks that exists in the BLX-SP address     |
    | space.  Unless the user completes, these control blocks are  |
    | not returned to the pool.                                    |
    |                                                              |
    | Most hangs of this type were fixed by V5 PTFs which are in   |
    | the V6 base code, with two exceptions which are noted below. |
    | Generally hangs of this type were related to the wrong ECB   |
    | being posted or incorrect processing by the UPAD exit.       |
    | Canceling or forcing the BLX-SP and then restarting the      |
    | BLX-SP is often the only relief for a hang of this type.     |
    |                                                              |
    | The BLX-SP can hang for several minutes when dumps are being |
    | taken.  If a cross memory user ABENDs while in cross memory  |
    | mode in the BLX-SP address space, the BLX-SP will be         |
    | non-dispatchable while the dump is taken.  Often, several    |
    | users will ABEND back-to-back, so the hang taking dumps can  |
    | be very long (20 minutes or more).  SYSLOG should always be  |
    | reviewed to see if this is the case.  There is nothing you   |
    | can do except let the dumps complete, and correct the reason |
    | that caused the abend.                                       |
    |                                                              |
    | Another hang condition that has been seen is related to a    |
    | data set being recalled.                                     |
    |                                                              |
    | 1.  User A attempts to access a data set that has been       |
    |     migrated (for example, a test SDDS or SDIDS).            |
    | 2.  BLXCVTSK allocates the data set with an SVC99.           |
    | 3.  BLXCVTSK issues the OPEN which causes DFHSM to recall    |
    |     the data set.  BLXCVTSK is placed into a wait until the  |
    |     recall completes.                                        |
    | 4.  Some other user quits out of Info normally.  This causes |
    |     BLXUCONN to place a request on BLXCVTSK's queue to free  |
    |     the user from any of the Info VSAM data sets the user    |
    |     had.  Had the user ABENDed out of Info, recovery         |
    |     routines would place a disconnect request on BLXUCONN    |
    |     for the user.                                            |
    | 5.  BLXUCONN waits for BLXCVTSK to complete.  Now, any new   |
    |     users attempting to connect or disconnect from Info will |
    |     be queued to BLXUCONN and will wait until BLXUCONN       |
    |     completes the disconnect for the prior user.  This is    |
    |     why users cannot get into or out of Info when a hang     |
    |     occurs.                                                  |
    |                                                              |
    | If there are not enough strings (placeholders, STRNO)        |
    | specified on the BLDVRP MACRO in the BLXVDEF VSAM resource   |
    | macro for a resource pool (share pool, shrpool), a hang may  |
    | result.                                                      |
    |                                                              |
    | You can determine if your hang is related to the number of   |
    | strings specified by issuing the following operator command: |
    |      /F blx_name,QUERY                                       |
    |                                                              |
    | Review the output from the operator command in the BLX-SP    |
    | log or the system log.  You will see the following group of  |
    | messages for each data set defined in the BLXVDEF:           |
    |      BLX03209I File: BLXL0007 Data set: BLM.V6R1M0.IBMPNLS   |
    |      BLX03221I      USERS ACTIVE RESPOOL PHMAXW PHMAXU       |
    |      BLX03222I          0      0       5       0     1       |
    |                                                              |
    | If PHMAXW is nonzero for any data set, then users            |
    | have waited for strings.  Users are hung while they are      |
    | waiting for strings.  The higher the number of waits         |
    | (PHMAXW), the more noticeable the INFO hang will be.  As     |
    | strings become available, users who are waiting will         |
    | eventually receive a string and therefore no longer be hung. |
    | The only way to really fix the problem is to assign more     |
    | strings to the affected resource pools, relinkedit the       |
    | BLXVDEF, and stop and start the BLX-SP to pick up the        |
    | changes to the BLXVDEF.  Make sure that the number of        |
    | strings specified for a resource pool is at least as large   |
    | as the number of data sets in the resource pool plus 2       |
    | (two), or plus 3 (three) when running MSDA.                  |
    |                                                              |
    | All users that end up being placed on any of the queues for  |
    | one of the BLX-SP's tasks are placed into a wait.  They are  |
    | in 'must complete' status and cannot be canceled.  If they   |
    | are forced, this will get rid of the user's address space,   |
    | but it will not remove the user from any BLX-SP queue they   |
    | are on.  If the BLX-SP task never completes, then everything |
    | will remain hung no matter how many user address spaces are  |
    | forced.  If in due time the BLX-SP task does complete, then  |
    | the BLX-SP task will ABEND attempting to post a user address |
    | space that is no longer there.  This will result in an SVC   |
    | dump being taken, which as stated above, causes the hang to  |
    | be even longer.                                              |
    |                                                              |
    +--------------------------------------------------------------+
    | How to prevent the hangs:                                    |
    |                                                              |
    | The following PTFs for Info V6 must be installed to prevent  |
    | known problems that result in users hanging:                 |
    |                                                              |
    |     V6.1  UW16593 UW29712                                    |
    |     V6.2  UW29713                                            |
    |                                                              |
    | In addition, all HIPER maintenance for both MVS/ESA and      |
    | DFSMS/MVS must be installed.                                 |
    |                                                              |
    +--------------------------------------------------------------+
    | What to do if a hang occurs:                                 |
    |                                                              |
    | Canceling or forcing the BLX-SP, and restarting the BLX-SP   |
    | is often the only relief for a hang.  This should be done    |
    | only after the appropriate documentation has been gathered   |
    | for problem determination, as discussed below.               |
    |                                                              |
    | If a hang occurs, it is necessary to determine why it        |
    | occurred.  If it is due to a dump or recalling of a data     |
    | set, then SYSLOG should indicate this.  Take no action in    |
    | these cases.  Riding out the hang is probably the best       |
    | course of action.                                            |
    |                                                              |
    | If dumping or recall processing is not the cause, locate a   |
    | user that is waiting, holding an exclusive ENQ.  Attempt to  |
    | cancel that user.  Do not force that user or any user.  If   |
    | that user is in 'must complete' status, wait awhile or look  |
    | for another user with an exclusive ENQ.  Cancel that user if |
    | possible.  If the hang does not clear after a period of      |
    | time, do the following:                                      |
    |                                                              |
    | 1.  If you have the recommended PTFs installed, then use the |
    |     operator dump command and dump the server, and as best   |
    |     you can tell, the user who is holding the ENQ (not       |
    |     waiting for the ENQ).                                    |
    | 2.  Try canceling the BLX-SP.                                |
    | 3.  If the BLX-SP does not shut down after the SHUTDOWN wait |
    |     time, use the FORCE command on the BLX-SP.  Understand   |
    |     before doing so that you may have to IPL.                |
    |                                                              |
    |     NOTE:  There is a chance that database corruption could  |
    |     occur if the BLX-SP is forced.  However, this is a very  |
    |     small risk, and generally the database can be restored   |
    |     successfully by going to the most recent backup and then |
    |     running BLGUT3 to restore the latest updates from the    |
    |     data in the SDLDS.                                       |
    | 4.  After the BLX-SP address space is down, verify that any  |
    |     user address spaces that were hung are no longer hung.   |
    |     Have those users log off TSO.  If there are still some   |
    |     hung users, then force those users off.                  |
    | 5.  Start the BLX-SP back up.  Users can now get back into   |
    |     Info.                                                    |
    |                                                              |
    | If you have the PTFs installed, contact the Support Center   |
    | in your country and open a problem record to have the dumps  |
    | reviewed.  If you do not have the PTFs, install them.  If    |
    | you do not have the dumps, contacting the Support Center     |
    | will only result in a request that you get the dumps.        |
    +--------------------------------------------------------------+
    | Search keywords:  INFO RALINFO INFOMAN INFOSYS INFO/MAN      |
    | INFORMATION MANAGEMENT V51 V61 569517100 569506500 INFOINFO  |
    | GT0405 LVLS/101 LVLS/102 LVLS/201 LVLS/301 V62 V63           |
    | V11 V12 V71 564814200 5697SD900                              |
    +--------------------------------------------------------------+
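
The operator command behavior described for BLXSIN01 in the technical information above (one command being processed, one more held by MVS, a third rejected) can be pictured with a small Python model.  This is only an illustrative sketch: the class and method names are hypothetical and do not correspond to any Info/Management or MVS interface.

```python
class CommandStack:
    """Toy model (hypothetical) of the BLXSIN01 command behavior."""

    def __init__(self):
        self.active = None  # command currently being processed
        self.held = None    # the one additional command MVS can hold

    def submit(self, command):
        """Accept a command if there is room, otherwise reject it."""
        if self.active is None:
            self.active = command
            return "PROCESSING"
        if self.held is None:
            self.held = command
            return "HELD"
        # A third command entered before the first completes is rejected.
        return "Task Busy - Modify Rejected"

    def complete(self):
        """Finish the active command and promote the held one, if any."""
        finished = self.active
        self.active, self.held = self.held, None
        return finished


stack = CommandStack()
for cmd in ("QUERY", "REALLOC", "FREE"):
    print(cmd, "->", stack.submit(cmd))
```

Under this model, a long-running first command is enough to cause later MODIFY commands to be rejected, which is why repeated operator commands during a hang may appear to be ignored.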
    
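
The shared/exclusive ENQ example under "How hangs happen" (users A, B, and C on the SDDS) can be replayed with a simplified Python model.  The sketch assumes only what the text states: shared requests can coexist, an exclusive request needs the resource to itself, and no new shared request is granted while an earlier request is waiting.  The class is hypothetical and is not the MVS ENQ/DEQ implementation.

```python
class EnqResource:
    """Simplified, hypothetical model of one ENQ resource (e.g. the SDDS)."""

    def __init__(self):
        self.holders = {}  # user -> "SHR" or "EXC"
        self.waiters = []  # FIFO list of (user, mode)

    def _grantable(self, mode):
        if mode == "EXC":
            return not self.holders                # exclusive needs no holders
        return "EXC" not in self.holders.values()  # shared needs no exclusive

    def _grant_waiters(self):
        # Grant strictly in arrival order; stop at the first blocked waiter.
        while self.waiters and self._grantable(self.waiters[0][1]):
            user, mode = self.waiters.pop(0)
            self.holders[user] = mode

    def enq(self, user, mode):
        """Return True if granted at once, False if the user must wait."""
        if not self.waiters and self._grantable(mode):
            self.holders[user] = mode
            return True
        self.waiters.append((user, mode))
        return False

    def deq(self, user):
        del self.holders[user]
        self._grant_waiters()

    def cancel(self, user):
        """Cancel a user who is still waiting in the home address space."""
        self.waiters = [w for w in self.waiters if w[0] != user]
        self._grant_waiters()


sdds = EnqResource()
sdds.enq("A", "SHR")   # step 1: A holds shared and never completes
sdds.enq("B", "SHR")   # step 2: B also gets shared
sdds.enq("C", "EXC")   # step 3: C waits for exclusive
sdds.deq("B")          # step 4: B finishes and DEQs
sdds.enq("B", "SHR")   # step 5: B now waits behind C
print("holders:", sdds.holders)
print("waiters:", sdds.waiters)
```

Calling sdds.cancel("C") in this model lets B's shared request through, but the next exclusive request blocks everyone again as long as A never issues its DEQ, which matches the behavior described above.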

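
The string check described above (review the QUERY output for a nonzero PHMAXW, and size each resource pool at the number of data sets plus 2, or plus 3 with MSDA) could be automated along the following lines.  This is a sketch only: the function names are hypothetical, and it assumes the BLX03222I values appear in the same order as the BLX03221I column headings, as in the sample messages shown in this APAR.

```python
def min_strings(data_sets_in_pool, msda=False):
    """Minimum strings for a pool: data sets + 2, or + 3 when running MSDA."""
    return data_sets_in_pool + (3 if msda else 2)


def data_sets_with_string_waits(log_text):
    """Return {data set name: PHMAXW} for entries with a nonzero PHMAXW."""
    waits = {}
    data_set = None
    header = None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "BLX03209I" and "set:" in fields:
            # The data set name follows the "Data set:" label.
            data_set = fields[fields.index("set:") + 1]
        elif fields[0] == "BLX03221I":
            header = fields[1:]                  # column headings
        elif fields[0] == "BLX03222I" and header and data_set:
            values = dict(zip(header, (int(v) for v in fields[1:])))
            if values.get("PHMAXW", 0) != 0:
                waits[data_set] = values["PHMAXW"]
    return waits


# Sample taken from the messages quoted in this APAR.
sample = """
     BLX03209I File: BLXL0007 Data set: BLM.V6R1M0.IBMPNLS
     BLX03221I      USERS ACTIVE RESPOOL PHMAXW PHMAXU
     BLX03222I          0      0       5       0     1
"""
print(data_sets_with_string_waits(sample))
print(min_strings(5), min_strings(5, msda=True))
```

An empty result means no user has waited for a string; any entry returned points at a data set whose resource pool is a candidate for a larger STRNO value.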
Local fix

Problem summary

Problem conclusion

Temporary fix

Comments

  • Caryinfoapar
    

APAR Information

  • APAR number

    II08029

  • Reported component name

    V2 LIB INFO ITE

  • Reported component ID

    INFOV2LIB

  • Reported release

    001

  • Status

    CLOSED CAN

  • PE

    NoPE

  • HIPER

    NoHIPER

  • Special Attention

    NoSpecatt / Xsystem

  • Submitted date

    1994-07-01

  • Closed date

    1994-07-07

  • Last modified date

    2003-07-10

  • APAR is sysrouted FROM one or more of the following:

  • APAR is sysrouted TO one or more of the following:

Fix information

Applicable component levels

Document Information

Modified date:
14 December 2020