APAR status
Closed as canceled.
Error description
Error description abstract:

Info/Management V6 users hang.  Last updated: 01/27/97 PJL

Detailed description of APAR:

Info/Management V6 user address spaces can hang due to recalling of
migrated data sets, SVC dumps being taken by the BLX-SP, problems in
Info, VSAM, APPC, or VTAM code, or MVS performance problems. Often the
operations personnel attempt to correct the problem by canceling users
or using the FORCE command. This often leads to more problems. This
Informational APAR explains why some hangs occur, what you should do
to prevent the hangs, and what documentation Level 2 will need to
diagnose the hang if the reason for the hang is not apparent. This
APAR discusses only non-MSDA hangs; if you are running with
Multisystem Database Access (aka MSDA or shared DASD), please refer to
Informational APAR II10181.

Technical information:

What does the BLX-SP do?

The BLX-SP handles allocations, de-allocations, physical opens and
closes for VSAM data sets, operator commands and VSAM requests that
must be performed in task mode.

The BLX-SP has several tasks (TCBs). They are:

BLXUCONN  Handles all connects and disconnects. A user in this case
          is a batch job, interactive user, API or BLG utility.

BLXCVTSK  Handles all allocation and de-allocation requests for any
          VSAM data set (SDIDS, SDDS, SDLDS, panels and dictionary)
          that a user requested. The VSAM data sets can be LSR or
          NSR. The request could also be from an operator command
          (REALLOC, FREE or ADDVDEF). BLXCVTSK does all physical
          opens and closes of VSAM data sets.

BLXCRTSK  Handles all TASK mode RPLs (CA splits and EOV processing).

BLXSIN01  Does initialization of the BLX-SP, and runs under control
          of the JOBSTEP TCB. It monitors the status of all of the
          BLX-SP sub-tasks and receives operator commands as they
          are entered.

BLXSTLW   Serializes writing of messages to the Trace/Log.

Three additional TCBs exist when running MSDA; they are not discussed
here.

All the above tasks except BLXSIN01 have a queue on which requests can
be placed. The tasks all run concurrently, but each task is
synchronous. Each task processes one request at a time; however, it
can have many requests waiting on its queue. BLXSIN01 does not have a
queue. It can be processing one command, and MVS can hold one
additional command in the command stack. A third command entered
before the first command has completed is rejected. The operator will
get 'Task Busy - Modify Rejected' from MVS.

How hangs happen:

Assume that a user address space is connected to the BLX-SP and needs
a record from one of the Info VSAM data sets. That user will obtain an
ENQ on the data set, issue an MVS STATUS macro to set 'must complete'
mode, then do a PC instruction to go cross memory to the BLX-SP
address space. Then the VSAM RPL based request is processed under the
cross memory user's TCB. The BLX-SP tasks are unaware of these
requests. If the request is to GET a record, and the record is in an
LSR buffer, the user will never actually wait. If the record is not
available, VSAM puts the user into a wait. When the request is
processed, VSAM calls a UPAD exit (Info supplied). The UPAD exit posts
the user. The user returns to their home address space and issues
another STATUS macro to turn off 'must complete'. Then the user DEQs
from the data set.
This is normal processing, and the time the user was in 'must
complete' mode is very short.

Keep in mind that the entire time a user is in cross memory mode they
are holding an ENQ and are in MUST COMPLETE mode. This means if they
hang (are placed into a wait) they cannot be canceled. Also, since
they hold the ENQ, other (not yet in cross memory) users can be put
into a wait for that same resource (Info uses the WAIT option on the
ENQ macro). Whether the user will wait depends on whether they want
shared or exclusive use, and what other ENQs are active for that
resource.

Example:

1. Assume user A has a shared ENQ on the SDDS and is in a cross
   memory wait for VSAM to get a requested record.
2. User B also wants to get a record. User B requests and receives a
   shared ENQ on the SDDS. User B then goes cross memory and waits
   for VSAM to complete the request.
3. User C requests an exclusive ENQ on the SDDS so that an SDDS
   record can be updated. User C waits in his home address space
   since he must get the exclusive ENQ before he can go cross memory.
4. User B's request is posted by VSAM (using Info's UPAD exit), and
   he returns to his home address space and DEQs from the SDDS.
5. User B does another request that needs a shared ENQ. This time he
   waits because user C wants the SDDS exclusive. No new shares will
   be allowed until C's exclusive is processed. User B is waiting in
   his home address space.
6. All other users, shared or exclusive, will wait in their home
   address space when they attempt to ENQ on the SDDS. Any user
   waiting in their address space for an ENQ can be canceled since
   they do not enter 'must complete' status until they have the ENQ.

If user A's request is never posted complete, soon every user on the
BLX-SP will hang when they attempt to read or write to the SDDS.
User C can be canceled, but that will only free things up until
another user wants the SDDS exclusive. User A can't be canceled, since
he is in 'must complete' status. Even if user A is forced, that will
likely not cure the reason for user A's hang. VSAM will have control
blocks pointing to user A, and user A will not be removed from any
queue in the BLX-SP that he is on. Also, the user has obtained various
BLX-SP control blocks from a pool of control blocks that exists in the
BLX-SP address space. Unless the user completes, these control blocks
are not returned to the pool.

Most hangs of this type were fixed by V5 PTFs which are in the V6 base
code, with two exceptions which are noted below. Generally hangs of
this type were related to the wrong ECB being posted or incorrect
processing by the UPAD exit. Canceling or forcing the BLX-SP and then
restarting the BLX-SP is often the only relief for a hang of this
type.

The BLX-SP can hang for several minutes when dumps are being taken. If
a cross memory user ABENDs while in cross memory mode in the BLX-SP
address space, the BLX-SP will be non-dispatchable while the dump is
taken. Often, several users will ABEND back-to-back, so the hang while
dumps are taken can be very long (20 minutes or more). SYSLOG should
always be reviewed to see if this is the case. There is nothing you
can do except let the dumps complete, and correct the condition that
caused the abend.

Another hang condition that has been seen is related to a data set
being recalled.

1. User A attempts to access a data set that has been migrated (for
   example, a test SDDS or SDIDS).
2. BLXCVTSK allocates the data set with an SVC99.
3. BLXCVTSK issues the OPEN which causes DFHSM to recall the data
   set. BLXCVTSK is placed into a wait until the recall completes.
4. Some other user quits out of Info normally.
   This causes BLXUCONN to place a request on the BLXCVTSK queue to
   free the user from any of the Info VSAM data sets the user had.
   Had the user ABENDed out of Info, recovery routines would place a
   disconnect request on BLXUCONN for the user.
5. BLXUCONN waits for BLXCVTSK to complete. Now, any new users
   attempting to connect or disconnect from Info will be queued to
   BLXUCONN and will wait until BLXUCONN completes the disconnect for
   the prior user. This is why users cannot get into or out of Info
   when a hang occurs.

If there are not enough strings (placeholders, STRNO) specified on the
BLDVRP MACRO in the BLXVDEF VSAM resource macro for a resource pool
(share pool, shrpool), a hang may result.

You can determine if your hang is related to the number of strings
specified by issuing the following operator command:

   /F blx_name,QUERY

Review the output from the operator command in the BLX-SP log or the
system log. You will see the following group of messages for each data
set defined in the BLXVDEF:

   BLX03209I File: BLXL0007  Data set: BLM.V6R1M0.IBMPNLS
   BLX03221I    USERS   ACTIVE  RESPOOL   PHMAXW   PHMAXU
   BLX03222I        0        0        5        0        1

If PHMAXW is not 0 (zero) for any data set, then users have waited for
strings on that data set. Users are hung while they are waiting for
strings. The higher the number of waits (PHMAXW), the more noticeable
the Info hang will be. As strings become available, users who are
waiting will eventually receive a string and therefore no longer be
hung. The only way to really fix the problem is to assign more strings
to the affected resource pools, relink-edit the BLXVDEF, and stop and
start the BLX-SP to pick up the changes to the BLXVDEF. Make sure that
the number of strings specified for a resource pool is at least as
large as the number of data sets in the resource pool plus 2 (two), or
plus 3 (three) when running MSDA.
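
The PHMAXW check and the string-count rule above can be sketched in a
few lines of Python. This is only an illustration, not an IBM-supplied
tool: it assumes the QUERY output has been saved as text, that each
message ID is the first token on its line (real SYSLOG lines may carry
prefixes that would have to be stripped first), and that the columns
appear in the order shown in the BLX03221I header. The function names
are hypothetical.

```python
import re

# Column order taken from the BLX03221I header line shown above.
COLUMNS = ("USERS", "ACTIVE", "RESPOOL", "PHMAXW", "PHMAXU")


def find_string_waits(log_text):
    """Return {data set name: PHMAXW} for data sets whose PHMAXW is not 0."""
    waits = {}
    data_set = None
    for line in log_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "BLX03209I":
            # Assumed layout: "... Data set: <name>"
            match = re.search(r"Data set:\s*(\S+)", line)
            data_set = match.group(1) if match else None
        elif fields[0] == "BLX03222I" and data_set is not None:
            # Pair each numeric value with its column name.
            values = dict(zip(COLUMNS, (int(v) for v in fields[1:6])))
            if values.get("PHMAXW", 0) != 0:
                waits[data_set] = values["PHMAXW"]
            data_set = None
    return waits


def minimum_strings(data_sets_in_pool, msda=False):
    """Smallest STRNO the text recommends: data sets + 2, or + 3 with MSDA."""
    return data_sets_in_pool + (3 if msda else 2)
```

For the three sample messages shown above, find_string_waits returns
an empty dictionary, because PHMAXW is 0 for BLM.V6R1M0.IBMPNLS. For a
resource pool holding four data sets, minimum_strings(4) gives 6, and
minimum_strings(4, msda=True) gives 7.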

All users that end up being placed on any of the queues for one of the
BLX-SP's tasks are placed into a wait. They are in 'must complete'
status and cannot be canceled. If they are forced, this will get rid
of the user's address space, but it will not remove the user from any
BLX-SP queue they are on. If the BLX-SP task never completes, then
everything will remain hung no matter how many user address spaces are
forced. If in due time the BLX-SP task does complete, then the BLX-SP
task will ABEND attempting to post a user address space that is no
longer there. This will result in an SVC dump being taken, which, as
stated above, causes the hang to be even longer.

How to prevent the hangs:

The following PTFs for Info V6 must be installed to prevent known
problems that result in users hanging:

   V6.1   UW16593  UW29712
   V6.2   UW29713

In addition, all HIPER maintenance for both MVS/ESA and DFSMS/MVS must
be installed.

What to do if a hang occurs:

Canceling or forcing the BLX-SP, and restarting the BLX-SP is often
the only relief for a hang. This should be done only after the
appropriate documentation has been gathered for problem determination,
as discussed below.

If a hang occurs, it is necessary to determine why it occurred. If it
is due to a dump or recalling of a data set, then SYSLOG should
indicate this. No action should be taken in these cases. Riding out
the hang is probably best.

If dumping or recall processing is not the cause, locate a user that
is waiting, holding an exclusive ENQ. Attempt to cancel that user. Do
not force that user or any user. If that user is in 'must complete'
status, wait awhile or look for another user with an exclusive ENQ.
Cancel that user if possible. If the hang does not clear after a
period of time, do the following:

1. If you have the recommended PTFs installed, then use the operator
   dump command and dump the server and, as best you can tell, the
   user who is holding the ENQ (not waiting for the ENQ).
2. Try canceling the BLX-SP.
3. If the BLX-SP does not shut down after the SHUTDOWN wait time, use
   the FORCE command on the BLX-SP. Understand before doing so that
   you may have to IPL.

   NOTE: There is a chance that database corruption could occur if
   the BLX-SP is forced. However, this is a very small risk, and
   generally the database can be restored successfully by going to
   the most recent backup and then running BLGUT3 to restore the
   latest updates from the data in the SDLDS.
4. After the BLX-SP address space is down, verify that any user
   address spaces that were hung are no longer hung. Have those users
   log off TSO. If there are still some hung users, then force those
   users off.
5. Start the BLX-SP back up. Users can now get back into Info.

If you have the PTFs installed, contact the Support Center in your
country and open a problem record to have the dumps reviewed. If you
do not have the PTFs on, put them on. If you don't have the dumps,
then contacting the Support Center will only result in a request that
you get the dumps.

Search keywords: INFO RALINFO INFOMAN INFOSYS INFO/MAN
INFORMATION MANAGEMENT V51 V61 569517100 569506500 INFOINFO
GT0405 LVLS/101 LVLS/102 LVLS/201 LVLS/301 V62 V63
V11 V12 V71 564814200 5697SD900
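
The ENQ example under "How hangs happen" (users A, B and C on the
SDDS) can be traced with a small model. The Python sketch below is a
toy, not a description of how MVS implements ENQ: it assumes a single
resource and strict arrival-order granting, which is the behavior the
example describes in step 5 ("No new shares will be allowed until C's
exclusive is processed"). The class name EnqModel and the mode labels
"SHR" and "EXC" are made up for illustration.

```python
from collections import deque


class EnqModel:
    """Toy arrival-order model of shared/exclusive ENQ on one resource."""

    def __init__(self):
        self.holders = {}       # user -> "SHR" or "EXC"
        self.waiters = deque()  # (user, mode) in arrival order

    def enq(self, user, mode):
        """Return True if granted at once, False if the user must wait."""
        # A request is granted only if it is compatible with the current
        # holders AND nobody is already waiting ahead of it.
        if self._grantable(mode) and not self.waiters:
            self.holders[user] = mode
            return True
        self.waiters.append((user, mode))
        return False

    def deq(self, user):
        """Release the user's ENQ, then grant to waiters in arrival order."""
        del self.holders[user]
        while self.waiters and self._grantable(self.waiters[0][1]):
            waiter, mode = self.waiters.popleft()
            self.holders[waiter] = mode

    def _grantable(self, mode):
        # Anything is grantable when the resource is free; otherwise only
        # a shared request while every current holder is also shared.
        if not self.holders:
            return True
        return mode == "SHR" and all(m == "SHR" for m in self.holders.values())
```

Replaying the example: A and B are granted "SHR", C's "EXC" request
waits, B releases and then asks for "SHR" again and is queued behind
C. As long as A never releases, every later request joins the queue,
which matches step 6 of the example.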
Local fix
Problem summary
Problem conclusion
Temporary fix
Comments
Caryinfoapar
APAR Information
APAR number
II08029
Reported component name
V2 LIB INFO ITE
Reported component ID
INFOV2LIB
Reported release
001
Status
CLOSED CAN
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
1994-07-01
Closed date
1994-07-07
Last modified date
2003-07-10
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
Document Information
Modified date:
14 December 2020