PFW1548: Memory and processor subsystem problem isolation procedure

Use this problem isolation procedure to aid in solving memory and processor problems that are not found by normal diagnostics.

Notes:
  1. To avoid damage to the system or subsystem components, unplug the power cords before removing or installing any part.
  2. This procedure assumes that either:
    • An optical drive is installed and connected to the integrated EIDE adapter, and a stand-alone diagnostic CD-ROM is available.

      OR

    • Stand-alone diagnostics can be booted from a NIM server.
  3. If a power-on password or privileged-access password is set, you are prompted to enter the password before the stand-alone diagnostic CD-ROM can load.
  4. The term POST indicators refers to the device mnemonics that appear during the power-on self-test (POST).
  5. The service processor might have been set by the user to monitor system operations and to attempt recoveries. You might want to disable these options while you diagnose and service the system. If you disable these settings, make notes of their current settings so that they can be restored before the system is turned back over to the customer. For 9080-HEX systems, the following settings might be of interest.
    Monitoring
    (also called surveillance) From the ASMI menu, expand the System Configuration menu, then click Monitoring. Disable both types of surveillance.
    Auto power restart
    (also called unattended start mode) From the ASMI menu, expand Power/Restart Control, then click Auto Power Restart, and set it to disabled.
    Wake on LAN
    From the ASMI menu, expand Wake on LAN, and set it to disabled.
    Call Out
    From the ASMI menu, expand the Service Aids menu, then click Call-Home/Call-In Setup. Set the call-home system port and the call-in system port to disabled.
  6. For 9080-HEX systems, verify that the system has not been set to boot to the System Management Services (SMS) menus or to the open firmware prompt. From the ASMI menu, expand Power/Restart Control to view the menu, then click Power On/Off System. The AIX/Linux partition mode boot should say "Continue to Operating System".
  7. The service processor might have recorded one or more symptoms in its error/event log. Use the Advanced System Management Interface (ASMI) menus to view the error/event log.
    • Look for a possible new error that occurred during power on of the system. If there is a new error, and its actions call for a FRU replacement, perform those actions. If this does not resolve the problem, go to PFW1548-1.
    • If powering on the system did not yield a new error code, look at the error that occurred just before the original error. Perform the actions associated with that error. If this does not resolve the problem, go to PFW1548-1.
    • If powering on the system results in the same error code, and there are no error codes before the original error code, go to PFW1548-1.
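The three cases in note 7 can be read as a small decision function. The following Python sketch is illustrative only: the function name, the tuple return values, and the placeholder error codes are assumptions made for this example and are not part of the ASMI or any service tool. The source does not say what to do when a new error does not call for a FRU replacement, so the sketch assumes that case also falls through to PFW1548-1.

```python
from typing import Optional, Tuple


def triage_error_log(original_error: str,
                     error_after_power_on: Optional[str],
                     preceding_error: Optional[str],
                     new_error_calls_for_fru: bool = True) -> Tuple[str, str]:
    """Return (action, target) for the error/event log review in note 7.

    All names and return values here are hypothetical illustrations.
    """
    is_new = (error_after_power_on is not None
              and error_after_power_on != original_error)
    if is_new:
        if new_error_calls_for_fru:
            # Case 1: a new error whose actions call for a FRU replacement.
            return ("perform_actions", error_after_power_on)
        # Assumption: the source is silent here, so continue with isolation.
        return ("go_to", "PFW1548-1")
    if preceding_error is not None:
        # Case 2: no new error; act on the error logged before the original.
        return ("perform_actions", preceding_error)
    # Case 3: same error again and nothing logged before it.
    return ("go_to", "PFW1548-1")
```

In every "perform_actions" case, the procedure still continues at PFW1548-1 if those actions do not resolve the problem.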

Perform the following procedure:

  • PFW1548-1
    1. Ensure that the diagnostics and the operating system are shut down.

      Is the system at "service processor standby", indicated by 01 in the control panel?

      No
      Replace the system backplane. Return to the beginning of this step.
      Yes
      Continue with substep 2.
    2. Turn on the power using either the white button or the ASMI menus.

      If a management console is attached, does the system reach hypervisor standby as indicated on the management console? If a management console is not attached, does the system reach an operating system login prompt, or if booting the stand-alone diagnostic CD-ROM, is the "Please define the System Console" screen displayed?

      No
      Go to PFW1548-3.
      Yes
      Go to PFW1548-2.
    3. Insert the stand-alone diagnostic CD-ROM into the optical drive.
      Note: If you cannot insert the diagnostic CD-ROM, go to PFW1548-2.
    4. When the word keyboard is displayed on an ASCII terminal, a directly attached keyboard, or management console, press the number 5 key.
    5. If you are prompted to do so, enter the appropriate password.

      Is the "Please define the System Console" screen displayed?

      No
      Go to PFW1548-2.
      Yes
      Go to PFW1548-14.
  • PFW1548-2

    Insert the stand-alone diagnostic CD-ROM into the optical drive.

    Note: If you cannot insert the stand-alone diagnostic CD-ROM, go to step PFW1548-3.

    Turn on the power using either the white button or the ASMI menus. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux® or AIX® partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.

    If you are prompted to do so, enter the appropriate password.

    Is the "Please define the System Console" screen displayed?

    No
    Go to PFW1548-3.
    Yes
    Go to PFW1548-14.
  • PFW1548-3
    1. Turn off the power.
    2. If you have not already done so, configure the service processor (using the ASMI menus) with the instructions in note 6 at the beginning of this procedure, then return here and continue.
    3. Exit the service processor (ASMI) menus and remove the power cords.
    4. Disconnect all external cables (parallel, system port 1, system port 2, keyboard, mouse, USB devices, SPCN, Ethernet, and so on). Also disconnect all of the external cables attached to the service processor except the Ethernet cable going to the management console, if a management console is attached.
    Go to the next step.
  • PFW1548-4
    Perform the following steps:
    1. Place the drawer into the service position and remove the service access cover.
    2. Record the slot numbers of the PCI adapters and I/O expansion cards if present. Label and record the locations of all cables attached to the adapters. Disconnect all cables attached to the adapters and remove all of the adapters.
    3. Slide the media or disk drive enclosure out approximately three centimeters.
    4. Remove and label the disk drives from the media or disk drive enclosure assembly.
    5. Remove all memory DIMMs except for one pair.
    6. Plug in the power cords and wait for 01 in the upper-left corner of the control panel display.
    7. Turn on the power using either the management console or the white button.
  • PFW1548-5

    Were any memory DIMMs removed from the system backplane?

    No
    Go to PFW1548-8.
    Yes
    Go to the next step.
  • PFW1548-6
    1. Turn off the power, and remove the power cords.
    2. Replug the memory DIMMs that were removed in PFW1548-4 in their original locations.
    3. Plug in the power cords and wait for 01 in the upper-left corner of the control panel display.
    4. Turn on the power using either the management console or the white button.

      If a management console is attached, does the managed system reach power on at hypervisor standby as indicated on the management console? If a management console is not attached, does the system reach an operating system login prompt, or if booting the stand-alone diagnostic CD-ROM, is the Please define the System Console screen displayed?

      No:

      A memory DIMM in the pair you just replaced in the system is defective. Turn off the power, remove the power cords, and exchange the memory DIMM pair with a new or previously removed memory DIMM pair. Repeat this step until the defective memory DIMM pair is identified, or all memory DIMM pairs have been replaced.

      If your symptom did not change and all the memory DIMM pairs have been exchanged, call your service support person for assistance. If the symptom changed, check for loose cards and obvious problems.

      If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

      Yes:
      Go to PFW1548-7.1.
  • PFW1548-7.1
    No failure was detected with this configuration.
    1. Turn off the power and remove the power cords.
    2. Reinstall the next DIMM pair.
    3. Plug in the power cords and wait for 01 in the upper-left corner of the control panel display.
    4. Turn on the power using either the management console or the white button.
      If a management console is attached, does the managed system reach power on at hypervisor standby as indicated on the management console? If a management console is not attached, does the system reach an operating system login prompt, or if booting the stand-alone diagnostic CD-ROM, is the Please define the System Console screen displayed?
      No:
      One of the FRUs remaining in the system is defective. Exchange the FRUs (that have not already been changed) in the following order:
      1. Memory DIMMs (if present). Exchange the DIMM pairs, one at a time, with new or previously removed DIMM pairs.
      2. System backplane
      3. Power supplies
      4. Processor modules

      Repeat the FRU replacement steps until the defective FRU is identified or all the FRUs have been exchanged.

      If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

      If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

      Yes:
      If all of the memory DIMM pairs have been reinstalled, go to step PFW1548-8. Otherwise, repeat this step.
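The strategy used from PFW1548-4 onward is to strip the system to a minimum configuration and then reinstall one part at a time, powering on after each addition. A minimal Python sketch of that add-back loop follows. The function name, the part labels, and the `boots_ok` callback (which stands in for the manual power-on check) are all hypothetical and are not part of any service tool.

```python
from typing import Callable, List, Optional, Sequence


def isolate_by_addback(baseline: Sequence[str],
                       removed: Sequence[str],
                       boots_ok: Callable[[List[str]], bool]) -> Optional[str]:
    """Reinstall removed parts one at a time.

    Returns the first part whose reinstallation makes the boot check fail,
    or None if every part was reinstalled without a failure.
    """
    installed = list(baseline)
    for part in removed:
        installed.append(part)       # power off, reinstall, power on
        if not boots_ok(installed):  # did the system reach the expected state?
            return part              # suspect: the part just reinstalled
    return None


# Usage with a simulated check in which "pair3" is the bad part.
suspect = isolate_by_addback(
    ["pair1"], ["pair2", "pair3", "pair4"],
    boots_ok=lambda cfg: "pair3" not in cfg,
)
```

Because only one part changes between power-on attempts, a failure points at the most recently reinstalled part, which is why the procedure insists on one pair, drive, device, or adapter per pass.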
  • PFW1548-8
    1. Turn off the power.
    2. Reconnect the system console.
      Notes:
      1. If an ASCII terminal has been defined as the firmware console, attach the ASCII terminal cable to the S1 connector on the rear of the system unit.
      2. If a display attached to a display adapter has been defined as the firmware console, install the display adapter and connect the display to the adapter. Plug the keyboard and mouse into the keyboard connector on the rear of the system unit.
    3. Turn on the power using either the management console or the white button. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux or AIX partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.
    4. If the ASCII terminal or graphics display (including display adapter) is connected differently from the way it was previously, the console selection screen appears. Select a firmware console.
    5. Immediately after the word keyboard is displayed, press the number 1 key on the directly attached keyboard, an ASCII terminal or management console. This activates the system management services (SMS).
    6. Enter the appropriate password if you are prompted to do so.

    Is the SMS screen displayed?

    No
    One of the FRUs remaining in the system unit is defective.

    If you are using an ASCII terminal, go to the problem determination procedures for the display. If you do not find a problem, replace the system backplane.

    Yes
    Go to the next step.
  • PFW1548-9
    1. Make sure the stand-alone diagnostic CD-ROM is inserted into the optical drive.
    2. Turn off the power and remove the power cords.
    3. Use the cam levers to reconnect the disk drive enclosure assembly to the I/O backplane.
    4. Reconnect the removable media or disk drive enclosure assembly.
    5. Plug in the power cords and wait for 01 in the upper-left corner of the operator panel display.
    6. Turn on the power using either the management console or the white button. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux or AIX partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.
    7. Immediately after the word keyboard is displayed, press the number 5 key on either the directly attached keyboard or an ASCII terminal keyboard.
    8. Enter the appropriate password if you are prompted to do so.

    Is the "Please define the System Console" screen displayed?

    No:
    One of the FRUs remaining in the system unit is defective.

    Exchange the FRUs in the order listed that have not been exchanged.

    1. Optical drive
    2. Removable media enclosure
    3. System backplane

    Repeat this step until the defective FRU is identified or all the FRUs have been exchanged.

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

    If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

    Yes:
    Go to the next step.
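Several steps in this procedure share the same "exchange the FRUs in the order listed" loop with three possible outcomes: the defective FRU is found, the symptom changes, or every FRU has been exchanged without effect. The Python sketch below models that loop under stated assumptions. The function name, the outcome strings, and the `exchange_and_test` callback (which represents swapping a part and repeating the power-on check) are invented for illustration.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple


def exchange_in_order(frus: Sequence[str],
                      already_exchanged: Iterable[str],
                      exchange_and_test: Callable[[str], str]
                      ) -> Tuple[str, Optional[str]]:
    """Walk an ordered FRU list, skipping parts already exchanged.

    exchange_and_test(fru) must return "fixed", "same", or "changed".
    """
    done = set(already_exchanged)
    for fru in frus:
        if fru in done:
            continue
        outcome = exchange_and_test(fru)
        if outcome == "fixed":
            return ("defective_fru_identified", fru)
        if outcome == "changed":
            # Check for loose cards and cables, then go to Problem Analysis.
            return ("follow_new_symptom", fru)
    # Symptom unchanged and every FRU exchanged.
    return ("call_service_support", None)


# Usage with the FRU order from PFW1548-9 and simulated outcomes.
order = ["Optical drive", "Removable media enclosure", "System backplane"]
simulated = {"Optical drive": "same",
             "Removable media enclosure": "fixed",
             "System backplane": "same"}
result = exchange_in_order(order, [], simulated.__getitem__)
```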
  • PFW1548-10

    The system is working correctly with this configuration. One of the disk drives that you removed from the disk drive backplanes may be defective.

    1. Make sure the stand-alone diagnostic CD-ROM is inserted into the optical drive.
    2. Turn off the power and remove the power cords.
    3. Install a disk drive in the media or disk drive enclosure assembly.
    4. Plug in the power cords and wait for 01 in the upper-left corner of the operator panel display.
    5. Turn on the power.
    6. Immediately after the word keyboard is displayed, press the number 5 key on either the directly attached keyboard or an ASCII terminal keyboard.
    7. Enter the appropriate password if you are prompted to do so.

    Is the "Please define the System Console" screen displayed?

    No
    Exchange the FRUs in the order listed that have not been exchanged.
    1. Last disk drive installed
    2. Disk drive backplane

    Repeat this step until the defective FRU is identified or all the FRUs have been exchanged.

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

    If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

    Yes
    Repeat this step with all disk drives that were installed in the disk drive backplane.

    After all of the disk drives have been reinstalled, go to the next step.

  • PFW1548-11

    The system is working correctly with this configuration. One of the devices that was disconnected from the system backplane may be defective.

    1. Turn off the power and remove the power cords.
    2. Attach a system backplane device (for example: system port 1, system port 2, USB, keyboard, mouse, Ethernet) that had been removed.

      After all of the I/O backplane device cables have been reattached, reattach the cables to the service processor one at a time.

    3. Plug in the power cords and wait for 01 in the upper-left corner on the operator panel display.
    4. Turn on the power using either the management console or the white button. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux or AIX partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.
    5. If the Console Selection screen is displayed, choose the system console.
    6. Immediately after the word keyboard is displayed, press the number 5 key on either the directly attached keyboard or on an ASCII terminal keyboard.
    7. Enter the appropriate password if you are prompted to do so.

    Is the "Please define the System Console" screen displayed?

    No
    The last device or cable that you attached is defective.
    To test each FRU, exchange the FRUs in the order listed.
    1. Device and cable (last one attached).
    2. System backplane

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

    If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

    Yes
    Repeat this step until all of the devices are attached. Go to the next step.
  • PFW1548-12

    The system is working correctly with this configuration. One of the FRUs (adapters) that you removed may be defective.

    1. Turn off the power and remove the power cords.
    2. Install a FRU (adapter) and connect any cables and devices that were attached to the FRU.
    3. Plug in the power cords and wait for 01 in the upper-left corner of the operator panel display.
    4. Turn on the power using either the management console or the white button. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux or AIX partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.
    5. If the Console Selection screen is displayed, choose the system console.
    6. Immediately after the word keyboard is displayed, press the number 5 key on either the directly attached keyboard or on an ASCII terminal keyboard.
    7. Enter the appropriate password if you are prompted to do so.

    Is the "Please define the System Console" screen displayed?

    No
    Go to the next step.
    Yes
    Repeat this step until all of the FRUs (adapters) are installed. Go to Verifying a repair.
  • PFW1548-13

    The last FRU installed or one of its attached devices is probably defective.

    1. Make sure the stand-alone diagnostic CD-ROM is inserted into the optical drive.
    2. Turn off the power and remove the power cords.
    3. Starting with the last installed adapter, disconnect one attached device and cable.
    4. Plug in the power cords and wait for 01 in the upper-left corner of the operator panel display.
    5. Turn on the power using either the management console or the white button. (If the stand-alone diagnostic CD-ROM is not in the optical drive, insert it now.) If a management console is attached, after the system has reached hypervisor standby, activate a Linux or AIX partition by clicking the Advanced button on the activation screen. On the Advanced activation screen, select Boot in service mode using the default boot list to boot the stand-alone diagnostic CD-ROM.
    6. If the Console Selection screen is displayed, choose the system console.
    7. Immediately after the word keyboard is displayed, press the number 5 key on either the directly attached keyboard or on an ASCII terminal keyboard.
    8. Enter the appropriate password if you are prompted to do so.

    Is the "Please define the System Console" screen displayed?

    No
    Repeat this step until the defective device or cable is identified or all devices and cables have been disconnected.

    If all the devices and cables have been removed, then one of the FRUs remaining in the system unit is defective.

    To test each FRU, exchange the FRUs in the order listed.
    1. Adapter (last one installed)
    2. System backplane

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

    If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

    Yes
    The last device or cable that you disconnected is defective. Exchange the defective device or cable then go to the next step.
  • PFW1548-14
    1. Follow the instructions on the screen to select the system console.
    2. When the DIAGNOSTIC OPERATING INSTRUCTIONS screen is displayed, press Enter.
    3. Select Advanced Diagnostics Routines.
    4. If the terminal type has not been defined, you must use the option Initialize Terminal on the FUNCTION SELECTION menu to initialize the diagnostic environment before you can continue with the diagnostics. This is a separate operation from selecting the console display.
    5. If the NEW RESOURCE screen is displayed, select an option from the bottom of the screen.
      Note: Adapters and devices that require supplemental media are not shown in the new resource list. If the system has adapters or devices that require supplemental media, select option 1.
    6. When the DIAGNOSTIC MODE SELECTION screen is displayed, press Enter.
    7. Select All Resources. (If you were sent here from step PFW1548-18, select the adapter or device that was loaded from the supplemental media).

    Did you get an SRN?

    No
    Go to step PFW1548-16.
    Yes
    Go to the next step.
  • PFW1548-15

    Look at the FRU part numbers associated with the SRN.

    Have you exchanged all the FRUs that correspond to the failing function codes (FFCs)?

    No
    Exchange the FRU with the highest failure percentage that has not been changed.

    Repeat this step until all the FRUs associated with the SRN have been exchanged or diagnostics run with no trouble found. Run diagnostics after each FRU is exchanged. Go to Verifying a repair.

    Yes
    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.
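The selection rule in PFW1548-15 (exchange the FRU with the highest failure percentage that has not yet been changed) can be sketched as follows. This Python example is illustrative only: the function name, the FRU labels, and the percentages are made up, and real values would come from the FRU list reported with the SRN.

```python
from typing import Dict, Optional, Set


def next_fru_to_exchange(failure_pct: Dict[str, int],
                         exchanged: Set[str]) -> Optional[str]:
    """Pick the not-yet-exchanged FRU with the highest failure percentage.

    Returns None when every FRU associated with the SRN has been exchanged.
    """
    remaining = {fru: pct for fru, pct in failure_pct.items()
                 if fru not in exchanged}
    if not remaining:
        return None
    return max(remaining, key=remaining.get)


# Hypothetical FRU list for one SRN; the percentages are placeholders.
srn_frus = {"FRU-A": 60, "FRU-B": 30, "FRU-C": 10}
first = next_fru_to_exchange(srn_frus, set())
second = next_fru_to_exchange(srn_frus, {"FRU-A"})
```

Diagnostics are rerun after each exchange, so the function would be called again with the updated set of exchanged parts until it returns None or diagnostics report no trouble found.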
  • PFW1548-16

    Does the system have adapters or devices that require supplemental media?

    No
    Go to the next step.
    Yes
    Go to step PFW1548-18.
  • PFW1548-17

    Consult the PCI adapter configuration documentation for your operating system to verify that all adapters are configured correctly.

    Go to Verifying a repair.

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

  • PFW1548-18
    1. Select Task Selection.
    2. Select Process Supplemental Media and follow the on-screen instructions to process the media. Supplemental media must be loaded and processed one at a time.

    Did the system return to the TASK SELECTION screen after the supplemental media was processed?

    No
    Go to the next step.
    Yes
    Press F3 to return to the FUNCTION SELECTION screen. Go to step PFW1548-14, substep 4.
  • PFW1548-19

    The adapter or device is probably defective.

    If the supplemental media is for an adapter, replace the FRUs in the following order:

    1. Adapter
    2. System backplane
    If the supplemental media is for a device, replace the FRUs in the following order:
    1. Device and any associated cables
    2. The adapter to which the device is attached

    Repeat this step until the defective FRU is identified or all the FRUs have been exchanged.

    If the symptom did not change and all the FRUs have been exchanged, call service support for assistance.

    If the symptom has changed, check for loose cards, cables, and obvious problems. If you do not find a problem, go to Problem Analysis and follow the instructions for the new symptom.

    Go to Verifying a repair.

    This ends the procedure.