
AIX fast-path problem isolation

Use this information to help you isolate a hardware problem when the server is running the AIX® operating system.

In most cases, AIX diagnostics are performed through automatic error log analysis. In some cases, these procedures direct you to run online diagnostics. Standalone diagnostics should only be used if you are unable to boot AIX or are otherwise specifically directed to do so.
Notes:
  1. If the server or partition has an external SCSI disk drive enclosure attached and you have not been able to find a reference code or other symptom, go to MAP 2010: 7031-D24 or 7031-T24 START.
  2. If you are servicing an SP system, go to the Start of Call MAP 100 in the SP System Service Guide.
  3. If you are servicing a clustered server, go to the Start of Call MAP 100 in the Clustered Installation and Service Guide.
  4. If you are servicing a clustered server that has InfiniBand switch networks, go to referenceinfiniband.htm and open the book entitled Guide to Clustering systems using InfiniBand hardware.
Note: If you already know the reference code, or have a symptom other than a reference code, go directly to the AIX fast path table.

Use the following procedure to display or confirm a previously reported reference code, including an SRN.

  1. Log into the AIX operating system as the root user, or use the CE login. If you need assistance, contact the system operator.
  2. Enter the diag command. The diag command allows you to load the diagnostic controller and display the online diagnostic menus.
  3. Press Enter. This opens the FUNCTION SELECTION menu.
  4. Select Task Selection.
  5. Select Display Previous Diagnostic Results.
  6. Select DISPLAY DIAGNOSTIC LOG SUMMARY. The diagnostic log summary is displayed as a time-ordered table of events from the error log.
  7. Look for the most recent S entry in the T column. The most recent S entry is the one closest to the beginning of the DISPLAY DIAGNOSTIC LOG SUMMARY table.
  8. Move your cursor over the row containing the S entry and press Enter.
  9. Press F7 to Commit.

    A screen containing details from the table is displayed; look for the reference code (SRN or SRC) entry. The SRN or SRC entry is shown near the bottom of the screen.

  10. Record the reference code.

    The following example, which shows the details of an SRN, is similar to what you should see on your terminal when you perform the above procedure.

    DISPLAY DIAGNOSTIC LOG                                                    802004
    [TOP]
    --------------------------------------------------------------------------------
    IDENTIFIER:             DAFE
    
    Date/Time:              Fri Aug 27 17:57:54
    Sequence Number:        952
    Event type:             SRN Callout
    
    Resource Name:          ent1
    Resource Description:   Gigabit Ethernet-SX Adapter (e414a816)
    Location:               U8842.P1Z.23A0781-P1-T7
    
    Diag Session:           21546
    Test Mode:              No Console,Non-Advanced,Normal IPL,ELA,Option Checkout
    
    Error Log Sequence Number:      2189
    Error Log Identifier:           6363CE4F
    
    SRN:                    25C4-601
    
    Description:            Download Firmware Error.
    
    Probable FRUs:
        ent1             FRU: BCM95704A41          U8842.P1Z.23A0781-P1-T7
                         Gigabit Ethernet-SX Adapter (e414a816)
    
    --------------------------------------------------------------------------------
    [BOTTOM]
    Use Enter to continue.
    
    Esc+3=Cancel        Esc+0=Exit          Enter
  11. If any reference codes are displayed, record all information provided from the diagnostic results and go to Reference codes.

    OR

    If a no-trouble-found result is displayed, continue to the next step.

  12. When your results are complete, press F3 to return to the Diagnostic Operating Instructions display.
  13. Press Ctrl+D to log off as the root user or the CE login user.
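
If the detail screen is captured as text (for example, from a terminal session log), the recorded fields can be extracted with a short script. The following is a minimal Python sketch and is not part of the AIX diagnostics. The function name parse_diag_detail and the SAMPLE string are assumptions made for illustration; the sample is abbreviated from the example screen shown in step 10.

```python
import re

# Abbreviated copy of the example detail screen from step 10 (illustrative only).
SAMPLE = """\
IDENTIFIER:             DAFE

Date/Time:              Fri Aug 27 17:57:54
Sequence Number:        952
Event type:             SRN Callout

Resource Name:          ent1
Resource Description:   Gigabit Ethernet-SX Adapter (e414a816)
Location:               U8842.P1Z.23A0781-P1-T7

Error Log Sequence Number:      2189
Error Log Identifier:           6363CE4F

SRN:                    25C4-601

Description:            Download Firmware Error.
"""

# A field line is assumed to look like "Label:   value", where the label
# contains only letters, spaces, and "/".
FIELD_RE = re.compile(r"^\s*([A-Za-z/ ]+?):\s+(\S.*?)\s*$")


def parse_diag_detail(text):
    """Return a dict mapping each field label to its value."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            fields[match.group(1)] = match.group(2)
    return fields


if __name__ == "__main__":
    detail = parse_diag_detail(SAMPLE)
    # Record the reference code together with the resource and location.
    print(detail["SRN"], detail["Resource Name"], detail["Location"])
```

Lines without a value after the colon, such as the Probable FRUs heading, are skipped by this sketch, so the FRU lines still need to be recorded by hand.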

AIX fast path table

Locate the problem in the following table and perform the action indicated.

Symptoms Action
Eight-Digit Error Codes
You have an eight-digit error code. Go to Reference codes, read the notes on the first page, and do the listed action for the eight-digit error code.
Note: If the repair for this code does not involve replacing a FRU (for instance, if you run an AIX command that fixes the problem or if you change a hot-pluggable FRU), then run the Log Repair Action option on resource sysplanar0 from the Task Selection menu under online diagnostics after the problem is resolved to update the AIX error log.
SRNs
You have an SRN. Look up the SRN in the List of service request numbers and do the listed action.
Note: Customer-provided SRNs should be verified. To verify the SRN, use the Display Previous Diagnostic Results service aid and choose Display Diagnostic Log Summary when running it.
An SRN is displayed when running diagnostics.
  1. Record the SRN and location code.
  2. Look up the SRN in the List of service request numbers and do the listed action.
   
888 Sequence in Operator Panel Display
An 888 sequence in the operator panel display. Go to MAP 0070: 888 Sequence in operator panel display.
The System Stops or Hangs With a Value Displayed in the Operator Panel Display
The system stopped with a 4-digit code that begins with a 2 (two) displayed in the operator panel display. Record SRN 101-xxxx (where xxxx is the four digits of the code displayed). The physical location code or device name displays on system units with a multiple-line operator panel display. If a physical location code or an AIX location code is displayed, record it, then look up the SRN in the List of service request numbers and do the listed action.
The system stopped with a 3-digit code displayed in the operator panel display. Record SRN 101-xxx (where xxx is the three digits of the code displayed). Look up the SRN in the List of service request numbers and do the listed action.
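
The two rows above describe a simple rule for forming the SRN from the operator panel value. As a rough illustration only, the rule can be written as the following Python sketch. The function name srn_from_panel_code is hypothetical, and the sketch assumes that the displayed code consists of decimal digits only, as the rows describe; any other value is rejected rather than guessed at.

```python
def srn_from_panel_code(code):
    """Return the SRN to record for a code shown on the operator panel.

    Covers only the two cases in the table: a 4-digit code that begins
    with 2, and a 3-digit code. Anything else raises ValueError.
    """
    code = code.strip()
    # Assumption: the code is made up of decimal digits only.
    if not (code.isascii() and code.isdigit()):
        raise ValueError("expected a numeric operator panel code: %r" % code)
    if len(code) == 4 and code.startswith("2"):
        return "101-" + code
    if len(code) == 3:
        return "101-" + code
    raise ValueError("code %r is not covered by these table rows" % code)


if __name__ == "__main__":
    print(srn_from_panel_code("2503"))  # 4-digit code beginning with 2
    print(srn_from_panel_code("517"))   # 3-digit code
```

The resulting SRN is then looked up in the List of service request numbers in the usual way.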
System Automatically Reboots
System automatically reboots.
  1. Turn off the system unit power.
  2. Turn on the system unit power and boot from a removable media device, disk, or LAN in service mode.
  3. Run the diagnostics in problem determination mode.
  4. Select the All Resources option from the Resource Selection menu to test all resources.
  5. If an SRN displays, look up the SRN in the List of service request numbers and do the listed action.
  6. If an SRN is not displayed, suspect a power supply or power source problem.
System does not Reboot When Reset Button is Pushed
System does not reboot (reset) when the reset button is pushed. Record SRN 111-999. Look up the SRN in the List of service request numbers and do the listed action.
ASYNC Communication Problems
You suspect an async communication problem.
  1. Run the advanced async diagnostics on the ports on which you are having problems. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. If you suspect a problem with the async concentrator, remote async node, and so on, refer to the documentation in RS/6000® eServer™ pSeries® Adapters, Devices, and Cable Information for Multiple Bus Systems on these devices and perform any tests or checks listed.
SCSI Adapter Problems
You suspect a SCSI adapter problem.

SCSI adapter diagnostics can only be run on a SCSI adapter that was not used for booting. The POST tests any SCSI adapter before attempting to use it for booting. If the system was able to boot using a SCSI adapter, then the adapter is most likely good.

SCSI adapter problems are also logged in the error log and are analyzed when the online SCSI diagnostics are run in problem determination mode. Problems are reported if the number of errors is above defined thresholds.

  1. Run the online SCSI adapter diagnostic in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Use MAP 0050: SCSI bus problems.
    Note: If you cannot load diagnostics (standalone or online) go to PFW1540: Problem isolation procedures.
SCSI Bus Problems
You suspect a SCSI bus problem.
  1. Use MAP 0050: SCSI bus problems.
  2. Use the SCSI Bus Service Aid to exercise and test the SCSI Bus.
Tape Drive Problems
You suspect a tape drive problem.
  1. Refer to the tape drive documentation and clean the tape drive.
  2. Refer to the tape drive documentation and do any listed problem determination procedures.
  3. Run the online advanced tape diagnostics in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  4. Use the Backup and restore service aid to exercise and test the drive and media.
  5. Use MAP 0050: SCSI bus problems.
  6. Use the SCSI bus service aid to exercise and test the SCSI bus.
  7. Refer to the device section of RS/6000 eServer pSeries Adapters, Devices, and Cable Information for Multiple Bus Systems for additional information and MAP 0020: Problem determination procedure for problem determination procedures.
Note: Information on tape cleaning and tape-problem determination can be found in Tape unit isolation procedures.
Optical Drive Problems
You suspect an optical drive problem.
  1. Perform the problem determination procedures in the optical drive documentation.
  2. Before servicing an optical drive, ensure that it is not in use and that the power connector is correctly attached to the drive. If the load or unload operation does not function, replace the optical drive.
  3. Run the online advanced optical diagnostics in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  4. If the problem is with a SCSI optical drive, use MAP 0050: SCSI bus problems.
  5. If the problem is with a SCSI optical drive, use the SCSI bus service aid to exercise and test the SCSI bus.
  6. Refer to the device section of RS/6000 eServer pSeries Adapters, Devices, and Cable Information for Multiple Bus Systems for additional information and MAP 0020: Problem determination procedure for problem determination procedures.
SCSI Disk Drive Problems
You suspect a disk drive problem.

Disk problems are logged in the error log and are analyzed when the online disk diagnostics are run in problem determination mode. Problems are reported if the number of errors is above defined thresholds.

If the diagnostics are booted from a disk, then the diagnostics can only be run on those drives that are not part of the root volume group. However, error log analysis is run if these drives are selected. To run the disk diagnostic tests on disks that are part of the root volume group, the standalone diagnostics must be used.

  1. Run the online advanced disk diagnostics in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Run standalone disk diagnostics. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  3. Use the certify disk service aid to verify that the disk can be read.
  4. Use MAP 0050: SCSI bus problems.
  5. Use the SCSI bus service aid to exercise and test the SCSI Bus.
  6. Refer to the device section of RS/6000 eServer pSeries Adapters, Devices, and Cable Information for Multiple Bus Systems for additional information and MAP 0020: Problem determination procedure for problem determination procedures.
Identify LED does not function on the drive plugged into the SES or SAF-TE backplane. Use the "identify a device attached to a SES device" service aid listed under SCSI and SCSI RAID Hot-Plug Manager on the suspect drive LED. If the drive LED does not blink when put into the identify state, use FFC 2D00 and SRN source code "B" and go to MAP 0210: General problem resolution.
Activity LED does not function on the drive plugged into the SES or SAF-TE backplane. Use the certify media service aid (see certify media) on the drive in the slot containing the suspect activity LED. If the activity LED does not intermittently blink when running certify, use FFC 2D00 and SRN source code "B" and go to MAP 0210: General problem resolution.
Diskette Drive Problems
You suspect a diskette drive problem.
  1. Run the diskette drive diagnostics. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Use the diskette media service aid to test the diskette media.
  3. Use the backup/restore media service aid to exercise and test the drive and media.
Token-Ring Problems
You suspect a token-ring adapter or network problem.
  1. Run the online advanced token-ring diagnostics in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Use the ping command to exercise and test the network.
  3. Refer to MAP 0020: Problem determination procedure for additional information and problem determination procedures.
Ethernet Problems
You suspect an Ethernet adapter or network problem.
  1. Run the online advanced Ethernet diagnostics in problem determination mode. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Use the ping command to exercise and test the network.
  3. Refer to MAP 0020: Problem determination procedure for additional information and problem determination procedures.
Display Problems
You suspect a display problem.
  1. If your display is connected to a KVM switch, go to Troubleshooting the keyboard, video, and mouse (KVM) switch for the 1x8 and 2x8 console manager. If you are still having display problems after performing the KVM switch procedures, come back here and continue with step 2.
  2. If you are using the Hardware Management Console, go to the Managing your server using the Hardware Management Console section.
  3. If you are using a graphics display:
    1. Go to the problem determination procedures for the display.
    2. If you do not find a problem:
Keyboard or Mouse
You suspect a keyboard or mouse problem.

If your keyboard is connected to a KVM switch, go to Troubleshooting the keyboard, video, and mouse (KVM) switch for the 1x8 and 2x8 console manager. If you are still having keyboard problems after performing the KVM switch procedures, come back here and continue to the next paragraph.

Run the device diagnostics. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.

If you are unable to run diagnostics because the system does not respond to the keyboard, replace the keyboard or system planar.

Note: A keyboard problem can be caused by the mouse device. To check, unplug the mouse and then recheck the keyboard. If the keyboard works, replace the mouse.
Printer and TTY Problems
You suspect a TTY terminal or printer problem.
  1. Go to problem determination procedures for the printer or terminal.
  2. Check the port that the device is attached to by running diagnostics on the port. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  3. Use the "Testing the Line Printer" procedure in General diagnostic information to test the connection to the printer. If a problem exists, replace the following in the order listed:
    1. Device cable
    2. Port to which the printer or terminal is connected.
Other Adapter Problems
You suspect a problem on another adapter that is not listed above.
  1. Run the online advanced diagnostics in problem determination mode on the adapter you suspect. If an SRN is displayed, look up the SRN in the List of service request numbers and do the listed action.
  2. Refer to MAP 0020: Problem determination procedure for additional information and problem determination procedures.
System Messages
A system message is displayed.
  1. If the message describes the cause of the problem, attempt to correct it.
  2. Look for another symptom to use.
Processor and Memory Problems
You suspect a memory problem.

Memory tests are only done during POST. Only problems that prevent the system from booting are reported during POST. All other problems are logged and analyzed when the sysplanar0 option under the advanced diagnostics selection menu is run.

System crashes are logged in the AIX error log. The sysplanar0 option under the advanced diagnostic selection menu is run in problem determination mode to analyze the error.

  1. Power off the system.
  2. Turn on the system unit power and load the online diagnostics in service mode.
  3. Run either the sysplanar0 or the Memory option under the advanced diagnostics in problem determination mode.
  4. If an SRN is displayed, record the SRN and location code.
  5. Look up the SRN in the List of service request numbers and do the listed action.
Degraded Performance or Installed Memory Mismatch
Degraded performance or installed memory mismatch.
Degraded performance can be caused by memory problems that reduce the size of available memory. To verify that the system detected the full complement of installed memory, do the following:
  1. From the task selection menu select the Display Resource Attribute.
  2. From the resource selection menu select one of the listed memory resources.
  3. Verify the amount of memory listed matches the amount actually installed.
  4. Use the service processor (ASMI) menus to see if the memory has been removed from (garded out of) the system's configuration by the system or an administrator.
Missing Resources
Missing resources

Use the Display Configuration and Resource List or Vital Product Data (VPD) Service Aid to verify that the resource was configured.

If an installed resource does not appear, check that it is installed correctly. If you do not find a problem, go to MAP 0020: Problem determination procedure.

Missing Path on MPIO Resource
Missing path on MPIO resource

If a path is missing on an MPIO resource, shown as the letter P in front of the resource in the resource listing, go to MAP 0020: Problem determination procedure.

System Hangs or Loops When Running the OS or Diagnostics
The system hangs in the same application. Suspect the application. To check the system:
  1. Power off the system.
  2. Turn on the system unit power and load the online diagnostics in service mode.
  3. Select the All Resources option from the resource selection menu to test all resources.
  4. If an SRN is displayed at any time, record the SRN and location code.
  5. Look up the SRN in the List of service request numbers and do the listed action.
The system hangs in various applications.
  1. Power off the system.
  2. Turn on the system unit power and load the online diagnostics in service mode.
  3. Select the All Resources option from the resource selection menu to test all resources.
  4. If an SRN is displayed at any time, record the SRN and location code.
  5. Look up the SRN in the List of service request numbers and do the listed action.
The system hangs when running diagnostics. Replace the resource that is being tested.
You Cannot Find the Symptom in This Table
All other problems. Go to MAP 0020: Problem determination procedure.
Exchanged FRUs Did Not Fix the Problem
A FRU or FRUs you exchanged did not fix the problem. Go to MAP 0020: Problem determination procedure.
RAID Problems
You suspect a problem with a RAID. A potential problem with a RAID adapter exists. Run diagnostics on the RAID adapter. Refer to the RAID Adapters User's Guide and Maintenance Information or the service guide for the RAID.

If the RAID adapter is a PCI-X RAID adapter, refer to the PCI-X SCSI RAID Controller Reference Guide for AIX.

System Date and Time Problems
  • The system does not retain the calendar date after the system has been booted.
  • The system does not retain the time of day after the system has been booted.
  1. Run the sysplanar0 option under the advanced diagnostics in problem determination mode. If an SRN is reported, record the SRN and location code information. Look up the SRN in the List of service request numbers and do the listed action.
  2. Replace the TOD (NVRAM) battery. If this does not fix the problem, replace the service processor; its location is model-dependent.
SSA Problems
You suspect an SSA problem. A potential problem with an SSA adapter exists. Run the SSA service aid. To run a service aid, see AIX service aids and follow the instructions.
Power Indicator Light is Not On
A drawer power indicator is not on. Return to Start of call procedure.
System Power Problem
The system does not power on. Return to Start of call procedure.
The system powers on when it should not. Return to Start of call procedure.


Last updated: Fri, Oct 30, 2009