IBM i problem analysis

You can use this procedure to find information about a problem with your server hardware when service is managed by the IBM® i operating system.

If you experience a problem with your system or logical partition, gather more information about the problem, either to solve it yourself or to help your next level of support or your hardware service provider solve it more quickly and accurately.

This procedure refers to the IBM i control language (CL) commands that provide a flexible means of entering commands on the IBM i logical partition or system. You can use CL commands to control most of the IBM i functions by entering them from either the character-based interface or System i® Navigator. While the CL commands might be unfamiliar at first, the commands follow a consistent syntax, and IBM i includes many features to help you use them easily. The Programming navigation category in the IBM i Knowledge Center includes a complete CL reference and a CL Finder to look up specific CL commands.

Consider the following questions while troubleshooting problems:
  • Has an external power outage or momentary power loss occurred?
  • Has the hardware configuration changed?
  • Has system software been added?
  • Have any new programs or program updates (including PTFs) been installed recently?
    To make sure that your IBM software has been correctly installed, use the Check Product Option (CHKPRDOPT) command.
  • Have any system values changed?
  • Has any system tuning been done?

Before you use this procedure, ensure that you have performed the steps in Beginning problem analysis.

After reviewing these considerations, follow these steps:

  1. Is the IBM i operating system up and running?
    • Yes: Continue with the next step.
    • No: Go to step 20.
  2. Are you troubleshooting a problem related to the System i integration with BladeCenter and System x within an internet small computer system interface (iSCSI) environment?
  3. Are you experiencing problems with the Operations Console?
  4. Does the console show a Main Storage Dump Manager display?
  5. Is the console that was in use when the problem occurred (or any console) operational?
    Note: The console is operational if a sign-on display or a command line is present. If another console is operational, use it to resolve the problem.
  6. Is a message related to this problem shown on the console?
    • Yes: Continue with the next step.
    • No: Go to step 11.
  7. Is this a system operator message?
    Note: It is a system operator message if the display indicates that the message is in the QSYSOPR message queue. Critical messages can be found in the QSYSMSG message queue. For more information, see the Create message queue QSYSMSG for severe messages topic in the Troubleshooting navigation category of the IBM i Knowledge Center.
    • Yes: Continue with the next step.
    • No: Go to step 9.
  8. Is the system operator message highlighted, or does it have an asterisk (*) next to it?
    • Yes: Go to step 18.
    • No: Go to step 13.
  9. Move the cursor to the message line and press F1 (Help). Does the Additional Message Information display appear?
    • Yes: Continue with the next step.
    • No: Go to step 11.
  10. Record the additional message information on the appropriate problem reporting form. For details, see Problem reporting form.

    Follow the recovery instructions on the Additional Message Information display.

    Did this solve the problem?
    • Yes: This ends the procedure.
    • No: Continue with the next step.
  11. To display system operator messages, type DSPMSG QSYSOPR on any command line and then press Enter.
    Did you find a message that is highlighted or has an asterisk (*) next to it?
    • Yes: Go to step 18.
    • No: Continue with the next step.
    Note: The message monitor in System i Navigator can also inform you when a problem has developed. For details, see the Scenario: Message monitor topic in the Systems Management navigation category of the IBM i Knowledge Center.
  12. Did you find a message with a date or time that is at or near the time the problem occurred?
    Note: Move the cursor to the message line and press F1 (Help) to determine the time a message occurred. If the problem affects only one console, you might be able to use information from the JOB menu to diagnose and solve the problem. To find this menu, type GO JOB on any command line and then press Enter.
    • Yes: Continue with the next step.
    • No: Go to step 15.
  13. Perform the following steps:
    1. Move the cursor to the message line and press F1 (Help) to display additional information about the message.
    2. Record the additional message information on the appropriate problem reporting form. For details, see Problem reporting form.
    3. Follow any recovery instructions that are shown.

    Did this solve the problem?

    • Yes: This ends the procedure.
    • No: Continue with the next step.
  14. Did the message information indicate to look for additional messages in the system operator message queue (QSYSOPR)?
    • Yes: Press F12 (Cancel) to return to the list of messages and look for other related messages. Then return to step 11.
    • No: Continue with the next step.
  15. Do you know which input/output device is causing the problem?
    • Yes: Go to step 17.
    • No: Continue with the next step.
  16. If you do not know which input/output device is causing the problem, describe the problems that you have observed by performing the following steps:
    1. Type GO USERHELP on any command line and then press Enter.
    2. Select option 10 (Save information to help resolve a problem).
    3. Type a brief description of the problem and then press Enter. If you specify the default Y in the Enter notes about problem field, you can enter more text to describe your problem.
    4. Report the problem to your hardware service provider.
  17. Perform the following steps:
    1. Type ANZPRB on the command line and then press Enter. For details, see Using the Analyze Problem (ANZPRB) command in the Troubleshooting navigation category in the IBM i Knowledge Center.
    2. Contact your next level of support. This ends the procedure.
    Note: You can also use the ANZPRB command to describe your problem in greater detail and to run a test that further isolates the problem.
  18. Perform the following steps:
    1. Move the cursor to the message line and press F1 (Help) to display additional information about the message.
    2. Press F14, or use the Work with Problem (WRKPRB) command. For details, see Work with Problem (WRKPRB) in the Troubleshooting navigation category in the IBM i Knowledge Center.
    3. If this does not solve the problem, see the Symptom and recovery actions.
  19. Choose from the following options:
    • If reference codes appear on the control panel or the management console, record them. Then go to the Reference code finder to see whether additional details are available for the code you received.
    • If no reference codes appear on the control panel or the management console, a serviceable event is indicated by a message in the problem log. Use the WRKPRB command. For details, see Work with Problem (WRKPRB) in the Troubleshooting navigation category in the IBM i Knowledge Center.
  20. Details about errors that occur when IBM i is not running or is not accessible can be found on the control panel or in the Advanced System Management Interface (ASMI).

    Do you want to look for error details using the ASMI?

    • Yes: Go to step 22.
    • No: Continue with the next step.
  21. At the control panel, complete the following steps.
    1. Press the increment or decrement button until the number 11 is displayed in the upper-left corner of the display.
    2. Press Enter to display the contents of function 11.
    3. Look for a reference code in the upper-right corner.

    Is there a reference code displayed on the control panel in function 11?

    • Yes: Go to step 23.
    • No: Contact your hardware service provider. This ends the procedure.
  22. On the console connected to the ASMI, complete the following steps.
    Note: If you are unable to locate the reported problem, and there is more than one open problem near the time of the reported failure, use the earliest problem in the log.
    1. Log in with a user ID that has an authority level of general, administrator, or authorized service provider.
    2. In the navigation area, expand System Service Aids and click Error/Event Logs. If log entries exist, a list of error and event log entries is displayed in a summary view.
    3. Scroll through the log under Serviceable Customer Attention Events and verify that there is a problem that corresponds to the failure.

    For more detailed information on the ASMI, see Managing the Advanced System Management Interface.

    Did you find a serviceable event, or an open problem near the time of the failure?

    • Yes: Continue with the next step.
    • No: Contact your hardware service provider. This ends the procedure.
  23. The reference code description might provide information or an action that you can take to correct the failure.
    Use the search function of IBM Knowledge Center to find the reference code details. The search function is located in the upper-left corner of IBM Knowledge Center. Read the reference code description and return here. Do not take any other action at this time.

    For more information about reference codes, see Reference codes.

    Was there a reference code description that enabled you to resolve the problem?

    • Yes: This ends the procedure.
    • No: Continue with the next step.
  24. Service is required to resolve the error. Collect as much error data as possible and record it. You and your service provider will develop a corrective action to resolve the problem based on the following guidelines:
    • If a field-replaceable unit (FRU) location code is provided in the serviceable event view or control panel, use that location to determine which FRU to replace.
    • If an isolation procedure is listed for the reference code in the reference code lookup information, include the isolation procedure as a corrective action even if it is not listed in the serviceable event view or control panel.
    • If any FRUs are marked for block replacement, replace all FRUs in the block replacement group at the same time.

    To find error details on the control panel, complete the following steps:

    1. Press Enter to display the contents of function 14. If data is available in function 14, the reference code has a FRU list.
    2. Record the information in functions 11 through 20 on the control panel.
    3. Contact your service provider and report the reference code and other information.

    To find error details on the ASMI, complete the following steps from the Error/Event Logs view:

    1. Record the reference code.
    2. Select the corresponding check box on the log and click Show details.
    3. Record the error details.
    4. Contact your service provider.

    This ends the procedure.
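
The three corrective-action guidelines in step 24 can be illustrated with a small Python sketch. It is not an IBM tool and does not read data from the control panel or the ASMI. The class names, field names, and sample values are hypothetical and exist only to show how the guidelines combine. The order of the returned actions is arbitrary and does not imply a service priority.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Fru:
    """One field-replaceable unit named by a serviceable event (hypothetical model)."""
    part: str
    location_code: Optional[str] = None  # None when no location code was reported
    block_group: Optional[str] = None    # None when not marked for block replacement


@dataclass
class ServiceableEvent:
    """Data recorded from the serviceable event view or control panel (hypothetical model)."""
    reference_code: str
    frus: List[Fru] = field(default_factory=list)
    isolation_procedures: List[str] = field(default_factory=list)


def describe(fru: Fru) -> str:
    """Format a FRU with its location code, or note that none was reported."""
    return f"{fru.part} at {fru.location_code or 'location not reported'}"


def plan_corrective_actions(event: ServiceableEvent,
                            lookup_procedures: List[str]) -> List[str]:
    """Combine recorded event data with reference code lookup data into actions."""
    actions: List[str] = []

    # Isolation procedures from the reference code lookup are included even
    # when the serviceable event view or control panel does not list them.
    procedures = list(event.isolation_procedures)
    for name in lookup_procedures:
        if name not in procedures:
            procedures.append(name)
    actions.extend(f"Run isolation procedure {name}" for name in procedures)

    # FRUs outside a block replacement group are replaced individually,
    # using the location code when one was provided.
    groups: Dict[str, List[Fru]] = {}
    for fru in event.frus:
        if fru.block_group is None:
            actions.append(f"Replace {describe(fru)}")
        else:
            groups.setdefault(fru.block_group, []).append(fru)

    # FRUs marked for block replacement are replaced together, as one action.
    for group, members in groups.items():
        parts = ", ".join(describe(m) for m in members)
        actions.append(f"Replace together (block group {group}): {parts}")

    return actions


if __name__ == "__main__":
    # Placeholder values only; real reference codes and location codes differ.
    sample = ServiceableEvent(
        reference_code="EXAMPLE01",
        frus=[Fru("FRU-A", "LOC-1"), Fru("FRU-B", "LOC-2", "G1"), Fru("FRU-C", None, "G1")],
        isolation_procedures=["PROC-X"],
    )
    for action in plan_corrective_actions(sample, ["PROC-X", "PROC-Y"]):
        print(action)
```

Keeping a list like this alongside the recorded reference code and the data from functions 11 through 20 can make the conversation with your service provider more precise, but the actual corrective action is still decided with the service provider.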




Last updated: Tue, October 17, 2017