Troubleshooting
Problem
This note defines a procedure to use when a job will not end and the Power Down System (PWRDWNSYS) command is required to end the job and the subsystem.
Resolving The Problem
This procedure can be used when a job will not end and the Power Down System (PWRDWNSYS) command is required to end the job and the subsystem. The system will take a Main Store Stand-alone Dump (MSSD) and save it to DASD if:
| o | A B9003F10 SRC is to be displayed on the Control Panel. |
| o | The system has the ability to write to the Source DASD. |
| o | The MSSD Auto Copy Facility has been enabled using SST. |
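The three conditions above can be read as a checklist. The following is a minimal Python sketch of that checklist, not an IBM-supplied tool. It assumes that all three conditions must hold together (the note lists them without saying so explicitly), and the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MssdPreconditions:
    # Hypothetical field names; each one mirrors a condition listed in this note.
    src_b9003f10_raised: bool    # a B9003F10 SRC is to be displayed on the control panel
    source_dasd_writable: bool   # the system can write to the source DASD
    auto_copy_enabled: bool      # the MSSD Auto Copy Facility was enabled using SST


def missing_preconditions(state: MssdPreconditions) -> list:
    """Return the conditions that are not met (assumption: all three are required)."""
    missing = []
    if not state.src_b9003f10_raised:
        missing.append("B9003F10 SRC")
    if not state.source_dasd_writable:
        missing.append("writable source DASD")
    if not state.auto_copy_enabled:
        missing.append("MSSD Auto Copy Facility enabled")
    return missing
```

An empty list means every listed condition is satisfied. Only the third condition is under the operator's control ahead of time, which is why the procedure below asks you to confirm the auto copy setting before powering down.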
The MSSD Auto Copy Facility can be enabled from System Service Tools (SST) as follows:
| 1. | STRSST - Start System Service Tools |
| 2. | Option 6, Main storage dump manager |
| 3. | Option 3, Work with auto copy options |
| 4. | Set the options as follows: Auto copy and reIPL = 1 (1=Yes, 2=No); Subset if possible = 1 (1=Yes; 2=No, if enough room for full copy; 3=No); ASP number = 1 (valid range 1-32). |
If a job appears to be hung, perform the following steps to end the job rather than immediately ending the subsystem in which it is running.
| Step 1 | Determine whether the job is performing a commit rollback. Run WRKACTJOB, locate the job, and check whether its Function column contains *ROLLBACK. For example, an entry for a job in rollback looks like this: Subsystem/Job QGYSERVER, User QUSER, Type BCI, CPU % .0, Function * -ROLLBACK, Status END. If the job is in a rollback, do not end the job. The job is decommitting a database update and cannot be ended until the rollback completes. If an IPL is performed while the job is in rollback, the IPL will take a long time because the rollback must complete during the IPL. |
| Step 2 | End the job using the immediate option: ENDJOB OPTION(*IMMED) |
| Step 3 | If the immediate option does not work, wait about 10 minutes, then run ENDJOBABN. |
| Step 4 | If ENDJOBABN does not work, the condition can be corrected only by powering down the system (PWRDWNSYS). Before doing so, ensure that the MSSD Auto Copy Facility is enabled; refer to the SST procedure above. Note that the ENDJOBABN command always generates a VLOG process dump if it fails. |
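Steps 1 through 4 form an escalation ladder. The sketch below models it in Python purely as an illustration; it does not run any command on the system. The function names, parameters, and returned strings are hypothetical, and matching the text "ROLLBACK" in the Function column is an assumption based on the example entry in Step 1.

```python
def is_rollback(function_text: str) -> bool:
    # Assumption: the WRKACTJOB Function column contains "ROLLBACK" during a rollback,
    # as in the example entry "* -ROLLBACK".
    return "ROLLBACK" in function_text.upper()


def next_action(function_text: str, endjob_immed_tried: bool,
                minutes_since_endjob: float, endjobabn_tried: bool) -> str:
    """Suggest the next step for a job that appears hung, following Steps 1 to 4."""
    if is_rollback(function_text):
        # Step 1: never end a job that is rolling back.
        return "WAIT: rollback in progress, do not end the job"
    if not endjob_immed_tried:
        # Step 2: try an immediate end first.
        return "ENDJOB OPTION(*IMMED)"
    if not endjobabn_tried:
        # Step 3: allow about 10 minutes before escalating.
        if minutes_since_endjob < 10:
            return "WAIT: allow about 10 minutes after ENDJOB"
        return "ENDJOBABN"
    # Step 4: only a power down remains.
    return "PWRDWNSYS procedure (confirm MSSD Auto Copy is enabled first)"
```

For example, `next_action("* -ROLLBACK", False, 0, False)` suggests waiting, whereas a job with no rollback and no prior attempts is directed to `ENDJOB OPTION(*IMMED)`.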
| 1. | End subsystems QSYSWRK and QSERVER correctly. Jobs in these subsystems should be ended in the following order: a) End IBM AnyNet: CHGNETA ALWANYNET(*NO); b) End all host servers: ENDHOSTSVR SERVER(*ALL); c) End TCP/IP: ENDTCP OPTION(*IMMED). |
| 2. | End all other subsystems with the *IMMED option: ENDSBS SBS(*ALL) OPTION(*IMMED). Note: Because your own job will also end when the *IMMED option is used, you must run this command from the console (see system value QCONSOLE). This may clear whatever condition is causing the job to hang. If it does and the system reaches the restricted state, the PWRDWNSYS command is not required. Instead, start the controlling subsystem (see system value QCTLSBSD). |
| 3. | Perform this step only if the prior step fails. Run WRKSYSVAL QPWRDWNLMT, record the existing value, and change it to 60 seconds. This prevents the system from waiting extra time before the B9003F10 and MSSD logic is triggered. |
| 4. | Run ENDJOBABN once more for the job that will not end. This forces the job and its structures into main storage, along with any seize conflict conditions. If ENDJOBABN does not work, you must power down the system with PWRDWNSYS. |
| 5. | Before you run PWRDWNSYS, make sure the MSSD Auto Copy Facility is enabled (see the SST procedure above). Once enabled, it remains enabled until it is reset. At this point ENDJOBABN has failed, system value QPWRDWNLMT is set to 60 seconds, and the MSSD Auto Copy Facility is enabled, so the system can be powered down. Issue the following command if you want the system to take an MSSD and re-IPL without manual intervention should a B9003F10 SRC occur: PWRDWNSYS OPTION(*IMMED) RESTART(*YES) ENDSBSOPT(*DFT) TIMOUTOPT(*MSD) |
| 6. | After the system has IPLed again, do the following: a) Run DSPLOG MSGID(CPI091D CPF0918). If CPI091D exists with a reason code of 7, an MSSD was taken. If one or more CPF0918 messages exist, they identify the subsystems that did not end within the time limit set in QPWRDWNLMT. b) Reset system value QPWRDWNLMT to the value it had before it was changed. c) If an MSSD was taken, call the IBM Support Line. |
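The post-IPL checks in step 6 amount to a small decision rule over the two message IDs. The sketch below shows that rule in Python, assuming the log entries have already been transcribed by hand into dictionaries. The dictionary keys (`msgid`, `reason_code`, `subsystem`) and the function name are hypothetical and do not reflect the actual DSPLOG output format.

```python
def summarize_ipl_log(entries: list) -> dict:
    """Interpret CPI091D and CPF0918 entries as described in step 6."""
    # CPI091D with reason code 7 indicates that an MSSD was taken.
    mssd_taken = any(
        e.get("msgid") == "CPI091D" and e.get("reason_code") == 7
        for e in entries
    )
    # Each CPF0918 identifies a subsystem that did not end within QPWRDWNLMT.
    late_subsystems = [
        e.get("subsystem") for e in entries if e.get("msgid") == "CPF0918"
    ]
    return {
        "mssd_taken": mssd_taken,
        "late_subsystems": late_subsystems,
        # Step 6c: call the IBM Support Line only if an MSSD was taken.
        "call_ibm_support": mssd_taken,
    }
```

For example, a log containing one CPI091D entry with reason code 7 and one CPF0918 entry for a subsystem named QBATCH (an illustrative name) yields `mssd_taken` set to `True` and `late_subsystems` equal to `["QBATCH"]`.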
Historical Number
14481253
Document Information
Modified date:
04 October 2024
UID
nas8N1018175