APAR status
Closed as canceled.
Error description
---============================================---
=  for releases above R610 see APAR II14016     =
---============================================---
(last update 05/23/02 lsk)
**************************************************************
The best way to get documentation is to use parmlib member
IEADMCxx. DUMP parmlib support exists in OS/390 R2.5 and up,
where the DUMP commands can be set up in advance in parmlib
members (IEADMCxx), similar to how the SLIP traps were used in
the original MVS 5.1.0 timeframe. The DUMP parmlib avoids the
use of SLIP IF (PER), which is limited to only one PER trap per
system. In IEADMCxx, "xx" is the suffix you specify on the
PARMLIB= operand of the DUMP command.
**************************************************************
The purpose of this APAR is to document the known
DB2 R710 R610 R510 - 5740XYR00 HANG-WAIT-SUSPEND problems.
For HANG or WAIT problems in the DB2 DISTRIBUTED asid, also see
II08215.

Take note: if your problem is the failure of DB2 to start up,
remember that DB2 does NOT function without IRLM. Verify that
your IRLM can function and start up without DB2. If IRLM is in
an indeterminate state wherein it cannot IDENTIFY the DB2, then
DB2 cannot fully start up. Check your SYSLOG for errors related
to IRLM or look for DXRxxxx type messages.

In addition to a current fix-list, a process has been provided
that indicates what to do when DB2 or an ALLIED asid is hung (or
looping). Please follow this process and have the listed
documentation available for DB2 SUPPORT analysis.
(**NOTE** DB2 R410 supplies a new CANCEL THREAD command)
----------------------------------------------------------------
WHAT TO DO IF DB2, OR DB2 ALLIED ASID IS HUNG
(REFERENCE PAGE 80 OF THE DB2 R310 DIAGNOSIS GUIDE)
ALL OF THE BELOW DOCUMENTATION WILL BE NEEDED BY DB2 LEVEL2
----------------------------------------------------------------
1) Displays:
   -DISPLAY THREAD(*) DETAIL or
   -DISPLAY THREAD(*) SERVICE(WAIT) with UK02845/UK02846
   D A,ALL (or D A,ssnm* or D A,IRL*)
   D GRS,CONTENTION
   D OPDATA
   (If possible, also execute the following 3 DISPLAY cmds)
   -dis thd(*) service(wait)
      This displays all threads that have been suspended for
      2 times the IRLM timeout limit or a minimum of 60 seconds.
      If the thread is suspended due to IRLM resource contention
      or DB2 latch contention, additional information is
      displayed to assist in identifying the problem.
   -DISPLAY DATABASE(*) SPACENAM(*) CLAIMERS LIMIT(*)
   -DISPLAY DATABASE(*) SPACENAM(*) USE LIMIT(*)
   -DISPLAY DATABASE(*) SPACENAM(*) LOCKS LIMIT(*)
   -DISPLAY UTILITY (*)
   *NOTE* Be sure to keep the MVS SYSLOG to enable reference to
          the DISPLAY command's DSNV404I response output.
   *NOTE* A thread STATUS of PT* means that the thread is in DB2
          and using Query CP Parallelism. (DEGREE ANY)
----------------------------------------------------------------
2) Obtain MVS console dumps using the MVS DUMP command:
   Always include the DB2 subsystem name in your DUMP cmd title.
      DUMP COMM=(DB2P thread 505 hung)
   (a) If you suspect that your WAIT or HANG may be due to a LOOP
       in DB2 on an ALLIED asid, make sure that the MVS INTERNAL
       SYSTEM TRACE is set ON, and that this trace is set to a
       good working DIAGNOSTIC value. See II08023.
       Enter MVS commands:
          TRACE ST,128K,BR=OFF
          TRACE MT,264K
       Note: when dealing with problems of this nature, this
       internal system TRACE may be the most important diagnostic
       item captured in the dump. Unless altered, TRACE default
       is 64K.
   (b) IF AN ALLIED ADDRESS SPACE IS HUNG:
       Determine if the relevant ALLIED asid is getting CPU cycles
       or if the asid is swapped out.

       - IF THIS ASID IS SWAPPED OUT -
       Take a CONSOLE dump of the MVS *MASTER* (asid 0001).
       The order is IMPORTANT. The MVS MASTER must be dumped first
       to ensure good dump data. After asid(1) is dumped, dump the
       ALLIED ASID, IRLM, ssnmMSTR, ssnmDBM1 from all members.
       Try to capture all in one. Use a joblist with wildcards in
       the DUMP cmd to get all members:
          JOBNAME=(*MASTER*,BADjob,XCFAS,ssnmIRLM,ssnmMSTR,ssnmDBM1),
       where ssnm is the subsystem name of the DB2 members.
       Use the REMOTE keyword to get dumps of all Datasharing
       members:
          REMOTE=(JOBLIST,SDATA,DSPNAME),DSPNAME=('ssnmIRLM'.*,'XCFAS'.*)

       - IF THIS ASID IS NOT SWAPPED OUT -
       The MVS MASTER (asid 0001) is not required; dump the ALLIED,
       ssnmMSTR, ssnmDBM1, XCFAS, and IRLM asids of all members.

   (c) IF DB2 IS HUNG:
       Take a CONSOLE dump of ssnmMSTR, ssnmDBM1, IRLM, & XCFAS.
          SDATA=(RGN,CSA,SQA,LPA,LSQA,SWA,PSA,ALLNUC,XESDATA,TRT,GRSQ,SUM)
       Use MVS command D D,OPTIONS to check SDUMP defaults.
       (if using or considering SLIP, read II10850 and/or PN80921)

   To facilitate the accurate and timely diagnosis of a reported
   problem, it is imperative that the user produce COMPLETE dumps
   of the associated malady. PARTIAL dumps will only add
   confusion, waste valuable time, and usually will be deemed
   inadequate for full problem diagnosis. Always dump the ssnmMSTR
   asid along with IRLM or other DB2 asids. A COMPLETE dump with
   MSTR is MANDATORY.

   As such, take note: An IBM 3390 mod3 has 3339 cylinders.
   DUMPSRV writes the dump in 4160 byte records, 686400 bytes per
   cylinder. Writing a 500 megabyte dump therefore requires at
   least 764 cylinders on this 3390-3 DASD. Most ssnmDBM1 dumps
   are greater than 500 megabytes.
   Check your MVS/ESA manual:
      Planning: Problem Determination and Recovery
   An SVCDUMP must reside on one single DASD and has DSORG=PS;
   this non-VSAM dump dataset can have up to 16 extents.
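The cylinder arithmetic above can be reproduced with a short sketch. The record size, bytes per cylinder, and 3390 mod3 capacity are taken from the text; the function names are hypothetical, and the use of binary megabytes (1024 * 1024 bytes) is an assumption, chosen because it reproduces the 764-cylinder figure quoted for a 500 megabyte dump.

```python
import math

RECORD_BYTES = 4160            # DUMPSRV record size, from the text
BYTES_PER_CYLINDER = 686_400   # dump bytes per 3390 cylinder, from the text
CYLINDERS_3390_MOD3 = 3339     # capacity of a 3390 mod3, from the text
MEGABYTE = 1024 * 1024         # assumption: binary megabytes


def cylinders_needed(dump_megabytes):
    """Return the whole number of cylinders needed to hold the dump."""
    return math.ceil(dump_megabytes * MEGABYTE / BYTES_PER_CYLINDER)


def fits_on_mod3(dump_megabytes):
    """Return True if the dump fits on a single 3390 mod3 volume."""
    return cylinders_needed(dump_megabytes) <= CYLINDERS_3390_MOD3


if __name__ == "__main__":
    for size_mb in (500, 1000, 2000):
        print(size_mb, "MB ->", cylinders_needed(size_mb), "cylinders,",
              "fits on mod3" if fits_on_mod3(size_mb) else "too large for mod3")
```

Running the estimate for the expected size of the ssnmDBM1 dump before allocating the dump dataset gives a rough primary allocation; it does not account for extents or secondary allocation.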
   So, with MVS/ESA4, it is recommended that the user allocate
   the DUMPxx datasets with a secondary allocation value set
   accordingly. Example:
      SPACE=(CYL,(900,700),RLSE,CONTIG)

   Check for the MSGIEA911E message after the DUMP command is
   issued. The dump may take a minute or so to complete. When
   finished, MVS will issue the IEA911E message noting the
   condition of the dump. The condition will be either COMPLETE
   or PARTIAL. The message can be MSGIEA611I if the dump was
   allocated through DYNALLOC. Another message to be aware of is
   MSGIEA043I MAXSPACE REACHED. This indicates a PARTIAL dump.

   At minimum, set the SDUMP options on a system that uses DB2 to
   a reasonable level. In MVS Commands, see these commands:
      DISPLAY  : D D,OPTIONS
      CHNGDUMP : CD SET,SDUMP,TYPE=XMEME,MAXSPACE=16000M
   Note: See II06471 : DUMPSRV uses AUX storage for dumping; you
         may need to add an extra PAGE dataset when dumping DBM1.
   Note: Allocate a hi-capacity device like a 3390 mod9 for dumps.
   Note: With DFSMS120 and Dynamic Dump Allocation, multi-volume
         EFDS format datasets can be created for your SVCDUMP.
         This DFSMS function is the 'RECOMMENDED' capture method.
----------------------------------------------------------------
3) RECYCLE DB2:
   If the CANCEL of a hung thread is not successful, or if DB2 is
   hung, execute the following commands in the noted order until
   one of the commands accommodates your need:
   (ssnm is the DB2 subsystem name)
   A. -STOP DB2 MODE(QUIESCE)
   B. -STOP DB2 MODE(FORCE)
   C. If ssnmDIST is running, issue the MVS command:
         CANCEL ssnmDIST,A=xx
      or modify IRLMPROC with abend using the command:
         F IRLMPROC,ABEND,NODUMP
      This will tell IRLM to quit, and remove the IDENTIFY
      between DB2 & IRLM.
   D. CANCEL ssnmDBM1,A=yy (issue 2 consecutive CANCEL commands)
      If CANCEL ssnmDBM1 does not work, then -
   E. CANCEL ssnmMSTR,A=zz
      If CANCEL ssnmMSTR does not work, then -
      There is always a FORCE ssnmMSTR,ARM to use as noted
      earlier, but we recommend avoiding its use.
      IRLM can remain in an indefinite state and you may not be
      able to restart DB2 before an IPL is done.
      Use the MVS command FORCE jobname as a LAST resort. This
      FORCE command may need to be issued several times before
      the wanted job finally terminates with MSGIEF404I.
      OEM products like RESOLVE and KILL can be used in lieu of
      this MVS FORCE command.
   Use MVS display commands (D A,ALL) to verify that the DB2/IRLM
   STCjob and ASIDs are no longer active to MVS.
----------------------------------------------------------------
4) Use the MVS command SETDMN to verify DOMAIN parameters. If MAX
   and MIN are set TOO low then the DB2 subsystem will not stop
   and start cleanly, ie:
      SETDMN MAX=200,MIN=255
----------------------------------------------------------------
5) OBTAIN SYS1.LOGREC:
   Use the IFCEREP1 service aid to obtain DETAILed software event
   records for at least 1 hour prior to the error of note. You
   may find it beneficial to first run a HISTORY report.
   Note: If DB2 is FORCEd down, or abends in some way, expect to
   see MVS CROSS-Memory errors like S0D5 S0D6 S0D7 and S058 S0E0.
   There may also be several TASK term S13E errors logged in
   LOGREC. Do NOT interpret any of these secondary recoveries as
   the source of your DB2 subsystem outage concern. DB2 generated
   SOFT CANCEL entries like rc00E50013 may also be issued.
----------------------------------------------------------------
6) If it appears that IRLM is hung, DB2 will most likely be hung
   along with associated DB2 jobs (threads). It may be necessary
   to obtain IRLM doc to diagnose the hang. This is especially
   critical in a DataSharing environment. See II10850 on how to
   obtain doc from all the members of the data sharing group.
   Run with IRLM component traces active. At PN90337 or UN98783
   specify TRACE=yes in the IRLM startup proc.
   Otherwise start the trace with the MVS command:
      TRACE CT,ON,COMP=irlmnm
   (see apar pn01040 and the DB2 Commands manual)
   Issue F irlmproc,STATUS,ALLD or ALLI to see the status of
   members.

   ALL DUMPs can be put on the WEB for Level2 download; read the
   README file at: ftp://testcase.software.IBM.com ( see II11945 )
---------------------------------------------------------------
02/24/06 rjl
   Abend522 can occur in an allied address space during a call to
   DB2 if the processing of the call goes outside the allied task,
   no other activity occurs in the allied asid for the time
   specified in the JWT parameter in the SMFPRMxx parmlib member,
   and the time limit is not bypassed.
   . Collect and review logrec, syslog and any dumps taken for
     the s522.
   . If the documentation for the s522 shows DB2 csect DSNVSR, it
     indicates the activity under the allied task was suspended
     and processing is occurring under another task in an address
     space other than the allied asid. This is normal operation.
   . A slip may be needed to obtain a dump if an error is
     suspected.
        SLIP SET,C=522,ID=s522,A=SVCD,J=(name of job),
          SDATA=(RGN,CSA,SQA,LPA,LSQA,SWA,PSA,ALLNUC,TRT,GRSQ,SUM),END
   . If the csect name DSNX9WCA is present in one of the logrec
     entries for the error sequence, it indicates a DB2 stored
     procedure was called.
----------------------------------------------------------------
8) Check the PSP upgrades for HIPER fixes associated with your
   DB2 release (UPGRADEs : DB2710 DB2610 DB2510 )
   To prevent unexpected DB2 outages caused by a WAIT / HANG /
   LOOP / SUSPEND, the DB2 SUPPORT TEAM highly recommends that the
   following APARS be applied to your ESA or OS390 SYSTEM:

**** End of Documentation Process ****

-------------------- Non DB2/IRLM Fixlist ---------------------
See individual apar for ptf required on your system.
-----------2002 ---------------
OW47911 / UW77371  abend0C4 IGG0CLXE + '6A' or
                   abend878 IGG0CLXA + 09E8
                   DB2 Utilities get rc00E50013 abend04e.
                   Fix for FMID= HDZ11F0
---------- older maintenance -------------------------
OY55972  MVS/ESA/420 and above, performance
OY65553  Problem caused by MVS DUMP Services
OY66146  DDF REQUESTOR hung, SERVER rc00F30072
OY64640  TSO HANG DUE TO TIME LIMIT
OW07856  DB2 R410 with DFSMS120 and above improves DB2 shutdown.
         VSAM MM CONNECT
OW12181  DB2 RECOVER OR REORG HANGS OR GETS MSGDSNB224I 0E40
         UNIT CHECK
OW11968  ACF/VTAM ERR CAUSES PARTIAL DUMPS AND MEMTERM HANGS.
OW13090 OW14381 OW14392  Parallel Detach(ENQ SYSZDSN3.DSNYALLI)
OW14416  S0C1 S0C4 IN IARFP + X'2092' IN R520
OW17624  CAS error. DB2 STIMER loop DSN3SSI2
OW11787  ESTAE recovery routines skipped
OW18235  IEAVESAR errs during RTM process. LCR
OW30124  LowCore Refresh IEAVESAR logrec entry
OW19900  Archive hung msgief238D WAIT NOHOLD
OW23762  DUMPSRV recursive S0C4 pic11 IEAVTSSM
OW25038  DUMPSRV recursive ABEND0C4 pic11
II06310  DUMPSRV info plus fixlist (II06226)
II05402  CANCEL command fails to complete
OW26652  NO DB2 startup see II04773
OW28664  Unpredictable results SLIP PVTMOD
OW28828  MSTR enqueue SYSZVOLS (see ow29984)
OW32069  OPENMVS DB2 SRBs hung asyncio
OW31722  Initiator hung DFSMS130 DSSB BDAM
OW30322  Hang Status Stop SRB (see ow31712)
OW30546  S0C4 IECVEXCP unpredictable results
OW30549  IEAVTSKT err unpredictable results
OW31485 UW45995  Agent suspended DSNB5FOR VMM preformat
OW32616  OS390 WLM Enclave SRB performance
OW32704  Invalid Stack (see ow33986) ABEND073
OW32277  hangs, SRBtime Enclave (see ow33027)
OW33628  V2R4 and below RTM, Detach, Cancel failures
OW34861  DSNB1CLM loop, abends. Non CMOS device
OW38170  hang, s0c1, s0c4... various symptoms
OW39670  OS390 V1R3+ ABEND073 rc08 RSM IARUA
OW39930  OS390 V2R6+ Hangs, DYNALLOC errs rc0210 rc00C200E2
------------------------ CA apar
GO77614  STIMERM issued from ACFF7SNQ +1D4 hangs DB2 service
         task PMIOPC01.
Contact CA support  Wait in csect CASR230D +A6E SRVTSK02
-------------------------------------------------------------
See II10348 for DB2 R410 R510 and IRLM R101 fixlist
----------------------------------------------------------------
PQ24904 UQ28184  EDMpool corrupted hangs, loops, abends
PQ25996 UQ30270  Hang, loop Join with Fetch
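As an aid to the COMPLETE versus PARTIAL check described in step 2, the following is a minimal sketch that scans saved SYSLOG lines for the dump messages named there (IEA911E, IEA611I, IEA043I). The function names and the sample lines are hypothetical. The sketch assumes the word COMPLETE or PARTIAL directly follows the message ID, and that any IEA043I line means a PARTIAL dump, as the text states; real SYSLOG records carry additional prefix fields, which the search tolerates.

```python
import re

# Message IDs come from step 2. Assumption: the condition word
# (COMPLETE or PARTIAL) directly follows the message ID.
DUMP_MSG = re.compile(r"\b(IEA911E|IEA611I)\s+(COMPLETE|PARTIAL)\b")
MAXSPACE_MSG = "IEA043I"  # MAXSPACE reached, which indicates a PARTIAL dump


def dump_conditions(syslog_lines):
    """Return a (message_id, condition) pair for each dump message found."""
    results = []
    for line in syslog_lines:
        match = DUMP_MSG.search(line)
        if match:
            results.append((match.group(1), match.group(2)))
        elif MAXSPACE_MSG in line:
            results.append((MAXSPACE_MSG, "PARTIAL"))
    return results


def all_complete(syslog_lines):
    """Return True only if dump messages exist and every one is COMPLETE."""
    conditions = dump_conditions(syslog_lines)
    return bool(conditions) and all(c == "COMPLETE" for _, c in conditions)


if __name__ == "__main__":
    # Hypothetical sample lines, not captured from a real system.
    sample = [
        "IEA911E COMPLETE DUMP ON SYS1.DUMP01",
        "IEA611I PARTIAL DUMP ON DUMP.D1",
    ]
    print(dump_conditions(sample))
    print(all_complete(sample))
```

A PARTIAL result is a signal to review the MAXSPACE and dump dataset allocation settings before sending the dump to Level2.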
Local fix
Problem summary
Problem conclusion
Temporary fix
Comments
CLOSED FOR DB2INFO RETENTION: See II04309 for DB2 storage info and more DB2 diagnostic setup.
APAR Information
APAR number
II06335
Reported component name
PB LIB INFO ITE
Reported component ID
INFOPBLIB
Reported release
001
Status
CLOSED CAN
PE
NoPE
HIPER
NoHIPER
Special Attention
NoSpecatt / Xsystem
Submitted date
1992-09-11
Closed date
1995-06-21
Last modified date
2014-08-13
APAR is sysrouted FROM one or more of the following:
APAR is sysrouted TO one or more of the following:
Fix information
Applicable component levels
Document Information
Modified date:
13 December 2020