HMC problem analysis

Use this information to diagnose and repair problems that are related to the Hardware Management Console (HMC).

About this task

DANGER
When working on or around the system, observe the following precautions:

Electrical voltage and current from power, telephone, and communication cables are hazardous. To avoid a shock hazard:

  • If IBM supplied the power cord(s), connect power to this unit only with the IBM provided power cord. Do not use the IBM provided power cord for any other product.
  • Do not open or service any power supply assembly.
  • Do not connect or disconnect any cables or perform installation, maintenance, or reconfiguration of this product during an electrical storm.

  • The product might be equipped with multiple power cords. To remove all hazardous voltages, disconnect all power cords. For AC power, disconnect all power cords from their AC power source. For racks with a DC power distribution panel (PDP), disconnect the customer’s DC power source to the PDP.
  • When connecting power to the product ensure all power cables are properly connected. For racks with AC power, connect all power cords to a properly wired and grounded electrical outlet. Ensure that the outlet supplies proper voltage and phase rotation according to the system rating plate. For racks with a DC power distribution panel (PDP), connect the customer’s DC power source to the PDP. Ensure that the proper polarity is used when attaching the DC power and DC power return wiring.
  • Connect any equipment that will be attached to this product to properly wired outlets.
  • When possible, use one hand only to connect or disconnect signal cables.
  • Never turn on any equipment when there is evidence of fire, water, or structural damage.
  • Do not attempt to switch on power to the machine until all possible unsafe conditions are corrected.
  • When performing a machine inspection: Assume that an electrical safety hazard is present. Perform all continuity, grounding, and power checks specified during the subsystem installation procedures to ensure that the machine meets safety requirements. Do not attempt to switch power to the machine until all possible unsafe conditions are corrected. Before you open the device covers, unless instructed otherwise in the installation and configuration procedures: Disconnect the attached AC power cords, turn off the applicable circuit breakers located in the rack power distribution panel (PDP), and disconnect any telecommunications systems, networks, and modems.
  • Connect and disconnect cables as described in the following procedures when installing, moving, or opening covers on this product or attached devices.

    To Disconnect: 1) Turn off everything (unless instructed otherwise). 2) For AC power, remove the power cords from the outlets. 3) For racks with a DC power distribution panel (PDP), turn off the circuit breakers located in the PDP and remove the power from the Customer's DC power source. 4) Remove the signal cables from the connectors. 5) Remove all cables from the devices.

    To Connect: 1) Turn off everything (unless instructed otherwise). 2) Attach all cables to the devices. 3) Attach the signal cables to the connectors. 4) For AC power, attach the power cords to the outlets. 5) For racks with a DC power distribution panel (PDP), restore the power from the Customer's DC power source and turn on the circuit breakers located in the PDP. 6) Turn on the devices.

  • Sharp edges, corners and joints may be present in and around the system. Use care when handling equipment to avoid cuts, scrapes and pinching. (D005)

If you were directed here from the Beginning problem analysis procedure because your HMC is not functioning correctly, continue with the Entry point for HMC problem determination.

To perform other maintenance tasks on your HMC, see the following procedures:

Entry point for HMC problem determination

About this task

Find the symptom you are having in the Symptom column of the following table. Then perform the action described in the Action column.
Symptom | Action
Operator reported that the HMC did not start, but no other problems were reported. | Go to Beginning HMC problem determination.
Operator reported Communication not active on the HMC. | Go to Testing the HMC Ethernet adapter.
Operator reported communication problems with a remotely connected HMC or a managed system. | Go to Testing the modem connection to the managed system.
Power problems | Go to Testing for a power problem.
HMC boot problems | Go to Beginning HMC problem determination.
Display problem | Go to Testing the HMC display.
DVD-RAM drive problem | Go to Testing the HMC DVD-RAM drive.
Diskette drive problem | Go to Testing the HMC diskette drive.
Ethernet LAN problem | Go to Testing the HMC Ethernet adapter.
A problem with any of the following: display, diskette drive, DVD-RAM drive, disk drive, Ethernet LAN | Perform Dynamic System Analysis. For more information about Dynamic System Analysis, see Diagnostic tools.
Eight-character error code beginning with HMC was received when using the HMC graphical user interface. | Go to the HMC system reference code.
HMC does not communicate through the modem. | Go to Testing the HMC modem connection.
Problems understanding the usage of the HMC. | Go to Managing the Hardware Management Console.
All other problems (for example: HMC GUI is unresponsive, parity errors, power, POST codes, blank display, mouse, or keyboard). | Go to Beginning HMC problem determination.
Symptoms not in this list. | Go to Beginning HMC problem determination.
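
The symptom list above works as a lookup from a reported symptom to the name of a procedure, with Beginning HMC problem determination as the fallback. A minimal sketch of that routing follows. The symptom keys and the function name `next_procedure` are hypothetical, chosen only for illustration; they are not part of any HMC tool.

```python
# Hypothetical symptom keys mapped to the procedure names used in this document.
SYMPTOM_TO_PROCEDURE = {
    "hmc_did_not_start": "Beginning HMC problem determination",
    "communication_not_active": "Testing the HMC Ethernet adapter",
    "remote_communication": "Testing the modem connection to the managed system",
    "power": "Testing for a power problem",
    "boot": "Beginning HMC problem determination",
    "display": "Testing the HMC display",
    "dvd_ram": "Testing the HMC DVD-RAM drive",
    "diskette": "Testing the HMC diskette drive",
    "ethernet_lan": "Testing the HMC Ethernet adapter",
    "modem": "Testing the HMC modem connection",
    "usage": "Managing the Hardware Management Console",
}

# Symptoms that are not listed fall through to the general entry procedure.
DEFAULT_PROCEDURE = "Beginning HMC problem determination"


def next_procedure(symptom: str) -> str:
    """Return the procedure that the symptom list points to for a symptom key."""
    return SYMPTOM_TO_PROCEDURE.get(symptom, DEFAULT_PROCEDURE)
```

For example, `next_procedure("display")` returns the display test procedure, and any unrecognized key returns the default procedure.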

Beginning HMC problem determination

About this task

Use this procedure to determine if there is a problem with the HMC hardware. This procedure might direct you to procedures in various sections of this information or to server maintenance information.

Step 1. HMC problem determination

Procedure

  1. If the HMC is running, shut down the console by exiting the graphical user interface. The server power turns off automatically. If the server does not turn off automatically, turn off the power switch.
  2. Turn on the HMC power.
  3. Watch the console and allow enough time for the system to complete the POST and load the HMC machine code.
  4. Watch and listen for the following failing symptoms during power-on:
    • POST error condition.
    • A series of beeps that indicate an error condition.
    • The HMC login screen and user interface fail to start.
    • A reference code or any other error information is displayed.
  5. Do you have any of the failing symptoms during power on?

Step 2. HMC problem determination

Procedure

  1. Perform Dynamic System Analysis. For more information about Dynamic System Analysis, see Diagnostic tools.
  2. Did the system unit tests detect errors?
  3. Use the Dynamic System Analysis and the maintenance procedures for the type of server that you are working on to isolate the failure and exchange customer replaceable units (CRUs).
  4. When the problem is repaired, or if the problem cannot be isolated, continue with Step 4. HMC problem determination.

Step 3. HMC problem determination

About this task

Attention: This step requires HMC support assistance. Contact HMC support before continuing.

Procedure

  1. If you are directed to reload the HMC from the recovery DVD and then reload the backup profile and configuration data, see Reinstalling the HMC machine code.
  2. After reloading the machine code from the recovery DVD, does the HMC start correctly?
    • No: Contact your next level of support.
    • Yes: This ends the procedure.

Step 4. HMC problem determination

About this task

Note: If you reach this step and you have not been able to isolate a failure, contact your next level of support for assistance.

Procedure

  1. Reinstall all CRUs that did not fix the problem.
  2. You must have performed a repair action to continue. If you have not already done so, verify the repair. Perform Dynamic System Analysis. For more information about Dynamic System Analysis, see Diagnostic tools.
  3. Did the system unit tests run without errors?
    • No: Use the Dynamic System Analysis and the maintenance procedures for the type of server that you are working on to isolate the failure and exchange customer replaceable units (CRUs). Refer to the publications listed in the Equivalent maintenance information for the HMC server hardware. Then continue with step 4.
    • Yes: Continue with step 4.
  4. Does the HMC communicate with all connected managed systems?

Testing the HMC

About this task

Use these procedures when you are directed to them from the HMC problem analysis procedure to test the HMC. If a failure is detected, you will be instructed to fix the failing part and then verify the repair.

Testing for a power problem

About this task

To troubleshoot a power problem on the server, see the service documentation for the server on which your HMC is based. For server hardware maintenance manuals that help isolate the problem to a failing part, see the publications listed in Equivalent maintenance information for the HMC server hardware.

Performing diagnostic procedures

About this task

You should have been directed here to test a specific part of the HMC. For problems in the following areas, perform Dynamic System Analysis:
  • Display
  • Keyboard
  • Mouse
  • Floppy Drive
  • DVD-RAM
  • DASD (disk drive)
  • Memory
  • Power
  • Run All Selected
  • SCSI
  • System Port/Modem
  • 16/4 Port Serial
  • Ethernet

To access the HMC diagnostic information, follow the procedures in HMC diagnostics.

Testing the modem connection to the managed system

About this task

Use this procedure to test the modem connection to the server for the HMC.

Procedure

  1. Can the HMC be used to communicate through the modem?
    • No: Go to step 2.
    • Yes: This ends the procedure.
  2. Is a device other than a modem attached to system port 2 on the HMC?
    Note: If the HMC is a rack-mounted model, answer no to this question.
  3. System port 2 of the HMC is reserved for external modem use only. Move the serial cable from system port 2 of the HMC to another HMC system port. Connect the modem to system port 2 and go to step 1.

Testing the HMC modem connection

About this task

Use this procedure to test the modem connection to the server for the HMC.

Procedure

  1. Verify that the modem and the telephone line are functioning correctly by performing the following steps:
    1. On the HMC console, open the Service Agent application.
    2. Select Test Tools.
    3. Initiate a Test PMR.
    4. Monitor the call log to verify that the call is completed successfully. If the call is completed successfully, the modem is functioning correctly.
  2. Is the installed modem currently functioning on the HMC?
    • No: Go to step 3.
    • Yes: The problem is not in the modem. This ends the procedure.
  3. Are the HMC configuration settings that relate to modem operation correct?
    • No: Correct the HMC configuration settings. Return to step 1.
    • Yes: Continue with step 4.
  4. Is the modem powered on? (Are any indicators lit?)
    • No: Ensure the modem is powered on. For details, see 1. After power on verification is completed, continue with step 5.
    • Yes: Go to step 6.
  5. Is the serial cable between the serial (COM) port connector of the HMC and the modem attached?
    • No: Attach the serial cable between the serial (COM) port connector of the HMC and the modem.
    • Yes: Go to step 6.
  6. Is the modem connected correctly to a working telephone line or an equivalent?
    Note: This can be checked by connecting a known good telephone to the line in place of the modem and making a telephone call.
    • No: Correctly connect the telephone line (or equivalent) to the modem. Go to step 1. After completing the telephone line and modem verification test, continue with step 7.
    • Yes: Go to step 7.
  7. Verify the COM port by performing the following steps:
    1. Disconnect the modem cable from the COM port of the HMC.
    2. Select Diagnostics from the top menu.
    3. Select System Ports from the pull-down menu. The SERIAL PORT TEST CATEGORY screen displays.
    4. Ensure the following:
      • On desktop HMC models - IRQ numbers 4 and 3 are assigned to COM 1 and COM 2, and the planar to COM 2 connector cable is present and correctly installed.
      • On rack-mounted models - IRQ number 4 is assigned to COM 1.
      Note: If the preceding information is not correct, the COM port might be disabled, or might be incorrectly configured. This can be resolved by accessing the setup utility (by pressing the F1 key) during power on.
    5. Ensure all diagnostics except External Loopback are selected.
    6. Select Run Screen from the bottom menu.
    7. Ensure all selected diagnostics show Passed.
      Note: If any diagnostics fail, replace the planar.
  8. Verify the external modem by performing the following steps:
    1. Reconnect the modem cable to the correct COM port.
    2. Ensure that the modem is powered on, connected to a working telephone line, and is securely cabled to the communications cable.
    3. Shut down and restart the HMC.
    4. Select Hardware Info from the top menu.
    5. Select COM and LPT Ports from the menu. The hardware query displays COM and LPT port information.
    6. Verify the following:
      • A modem was detected on the correct COM port
      • The modem test returned Passed
      • Dial tone: Detected
      • ATI1: Displays the modem's model information
    Note:
    1. If you were not able to get the desired results in step 8, substep 6, see the MultiTech MultiModemII user's guide, installation guide, or reference guide for your modem. To access the MultiTech MultiModemII documentation, go to the MultiTech Product Support website.
    2. If necessary, after you have completed the procedures in the MultiTech MultiModemII documentation, return here to complete the final step of this procedure.
  9. Choose from the following options:
    • If the modem was not detected, replace and verify the following customer replaceable parts in the order listed:
      1. Communications cable
      2. Modem
    • If the modem test did not return with Passed, replace the modem.
    • If the dial tone did not return with Detected, verify the telephone line operation then retest. If the failure recurs, replace the modem.

Testing the HMC Ethernet adapter

About this task

Use this procedure to test the Ethernet adapter in the HMC.

Procedure

  1. Is the Ethernet port currently functioning through normal operation of the HMC?
    • No: Go to step 2.
    • Yes: This ends the procedure.
  2. Are the Ethernet configuration values set correctly? (IP address, subnet mask, and so on.)
    • No: Set the Ethernet configuration values to their correct settings. Then go back to step 1.
    • Yes: Go to step 3.
  3. Can the IP address of the HMC be 'pinged' by another system that should be able to 'see' the HMC on the network?
    • No: Go to step 4.
    • Yes: Go to step 9.
  4. Is the Ethernet cable attached correctly to the HMC and the network?
    • No: Attach the HMC to the network using an Ethernet cable with the correct pinout. Then go to step 1.
    • Yes: Go to step 5.
  5. Does the Ethernet cable have the correct pinout? (There are two types of Ethernet cables in use, which are distinguished by different pinouts. The network determines which type of cable to use.)
    • No: Replace the Ethernet cable with the correct version. Then go to step 1.
    • Yes: Go to step 6.
  6. Refer to the hardware maintenance manual for the Ethernet hardware to determine if there are any internal settings or jumpers that might disable the Ethernet port.
    Are there any internal settings or jumpers?
    • No: Go to step 7.
    • Yes: Go to step 8.
  7. Replace the Ethernet hardware in the HMC. (This might be a PCI card or system board replacement, depending on the HMC hardware.) Go to step 1.
  8. Set the internal settings or jumpers to enable the Ethernet port on the HMC. Go to step 1.
  9. The failure does not seem to be in the HMC. This ends the procedure.
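
The Ethernet procedure is a short decision sequence: configuration, then reachability, then cabling, then the adapter itself. The following sketch restates that order. It assumes that a successful ping leads to the closing statement that the failure is not in the HMC; the function name `ethernet_next_action` and its parameters are hypothetical, and the answers are supplied by the person who performs the checks.

```python
def ethernet_next_action(config_ok: bool, pingable: bool, cable_attached: bool,
                         pinout_ok: bool, has_jumpers: bool) -> str:
    """Return the next action for the first check that does not hold."""
    if not config_ok:
        return "Set the Ethernet configuration values to their correct settings, then retest"
    if pingable:
        # Assumption: a reachable HMC means the fault lies elsewhere on the network.
        return "The failure does not seem to be in the HMC"
    if not cable_attached:
        return "Attach the HMC to the network with a cable of the correct pinout, then retest"
    if not pinout_ok:
        return "Replace the Ethernet cable with the correct version, then retest"
    if has_jumpers:
        return "Set the internal settings or jumpers to enable the Ethernet port, then retest"
    return "Replace the Ethernet hardware in the HMC, then retest"
```

Each "then retest" result corresponds to returning to step 1 of the procedure after the corrective action.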

Testing the HMC disk drive

Learn how to test and run diagnostics on a failing HMC disk drive.

About this task

To test for HMC disk drive problems, complete the following steps:

Procedure

  1. Did the disk drive test fail?
    • No: Go to step 5.
    • Yes: Continue with the next step.
  2. Perform the following steps:
    1. Exchange the CRUs called out by the diagnostics one at a time. For CRU removal and replacement instructions, see the server hardware maintenance manual for the system on which you are working. See Equivalent maintenance information for the HMC server hardware to access the hardware maintenance manual for your HMC server model.
    2. After each CRU is exchanged, test the repair. Perform Dynamic System Analysis on the disk drive. For more information about Dynamic System Analysis, see Diagnostic tools.
      Did the disk drive test fail?
      • No: Continue with the next step.
      • Yes: Contact your next level of support.
  3. Ensure the following and then continue with the next step:
    • If you exchanged the disk drive and there are jumpers or tab settings on the new disk drive, ensure that the settings are the same as the old drive.
    • If there is a SCSI cable-terminating resistor device, ensure it is secured to the cable and (if necessary) reattached to its original location on the server.

      Go to the information about hard disk jumper settings in the server hardware maintenance manual. See Equivalent maintenance information for the HMC server hardware to access the hardware maintenance manual for your HMC server model.

  4. If you exchanged the disk drive, restore the HMC image to the new disk drive.
  5. Perform Dynamic System Analysis to test the server. For more information about Dynamic System Analysis, see Diagnostic tools.
    • If the tests fail, isolate the problem using the Beginning HMC problem determination procedure.
    • If the tests run without errors, turn off the server power and then turn on the power. Ensure that the system boots and the HMC screen is displayed. This ends the procedure.

Testing the HMC DVD-RAM drive

Learn how to test and run diagnostics on a failing DVD-RAM.

About this task

To test for HMC DVD-RAM drive problems, complete the following steps:

Procedure

  1. Determine the media in the DVD-RAM drive:
    • Compact Disk Recordable (CD-R) similar to a CD
    • DVD-RAM media cartridge
    Is the media a CD-R?
    • No: Go to step 4.
    • Yes: Continue with the next step.
  2. Perform the following steps:
    1. Clean the compact disk as follows:
      • Hold the disk by its edges. Do not touch the surface.
      • Remove dust and fingerprints from the surface by wiping from the center to the outside using a dry, soft cloth.
    2. Reinstall the CD, with the label-side facing up.
    3. Continue with the next step.
  3. Try the failing task again by using the original media.
    Does the failure occur again?
    • No: This ends the procedure.
    • Yes: Continue with the next step.

  4. Ensure the write protect tab is in the "disabled" (down) position.
    Was the write protect tab in the "disabled" (down) position?
    • No: Go to step 3.
    • Yes: Continue with the next step.
  5. Perform the following steps:
    1. With the original media in the drive, note the following:
    2. Turn on the server power and perform Dynamic System Analysis to test the DVD-RAM drive. For more information about Dynamic System Analysis, see Diagnostic tools.
    3. When the test is complete, continue with the next step.
  6. Did the DVD-RAM test fail while testing with the original media?
    • No: Go to step 15.
    • Yes: Continue with the next step.
  7. Exchange the original media with a new one.
    Note: If you are replacing DVD-RAM media, the new cartridge must be formatted. If possible, use another HMC to format the new cartridge.
  8. Turn off the server power.
  9. Turn on the server power, and perform Dynamic System Analysis to test the DVD-RAM drive with the new media. For more information about Dynamic System Analysis, see Diagnostic tools.
  10. Did the DVD-RAM test fail while testing with the new media?
    • No: The original media was defective. This ends the procedure.
    • Yes: Continue with the next step.
  11. Verify the following:
    • All DVD-RAM drive data and power cables are secure.
    • The DVD-RAM drive is jumpered as "master" and is cabled to the secondary IDE bus.
  12. If the diagnostics continue to fail, exchange the DVD-RAM drive.
    When complete, run the DVD-RAM test again.
    Note: If there are any jumpers or tab settings on the new drive, ensure that the settings match the old drive.
  13. Did the DVD-RAM Drive test continue to fail?
    • No: The original DVD-RAM drive was defective. This ends the procedure.
    • Yes: Continue with the next step.
  14. Continue exchanging CRUs from the CRU list and running the DVD-RAM drive tests.
    • If the CRUs fix the problem, this ends the procedure.
    • If you cannot isolate the problem, call your next level of support for assistance.
  15. The server resources (for example: interrupt, I/O address) might be configured incorrectly. Verify the server resources are correctly configured.
    1. Select System Unit for the configuration area, and verify configuration for the system unit and all adapters.
    2. When you complete the verification, try the failing procedure again and continue with the next step.
  16. Does the failing procedure continue to fail?
    • No: The resource settings were incorrect. This ends the procedure.
    • Yes: If you cannot isolate the problem, contact your next level of support for assistance. This ends the procedure.
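
Steps 6 through 14 isolate the fault by swapping one element at a time: first the media, then the drive, then the remaining CRUs. The sketch below shows that elimination order. The function name `dvd_ram_conclusion` and its parameters are hypothetical; `None` means that the corresponding test has not been run yet.

```python
from typing import Optional


def dvd_ram_conclusion(fails_original_media: bool,
                       fails_new_media: Optional[bool] = None,
                       fails_new_drive: Optional[bool] = None) -> str:
    """Return the conclusion or next action from the test results gathered so far."""
    if not fails_original_media:
        # The drive test passes, so the procedure turns to resource configuration.
        return "Verify that the server resources (interrupt, I/O address) are configured correctly"
    if fails_new_media is None:
        return "Exchange the media with a new one and run the test again"
    if not fails_new_media:
        return "The original media was defective"
    if fails_new_drive is None:
        return "Check cables and jumpers, exchange the DVD-RAM drive, and run the test again"
    if not fails_new_drive:
        return "The original DVD-RAM drive was defective"
    return "Continue exchanging CRUs or contact your next level of support"
```

Calling the function again after each test, with one more result filled in, walks through the same sequence as the numbered steps.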

Testing the HMC diskette drive

Learn to run a diagnostic test to determine diskette drive problems.

About this task

To test for HMC diskette drive problems, complete the following steps:

Procedure

  1. Perform the following steps:
    1. Turn on the server power and perform the Dynamic System Analysis to test the diskette drive. For more information about Dynamic System Analysis, see Diagnostic tools.
      Note: Do not test with the diskette on which the errors occurred. Use a new diskette.
    2. When the test is complete, continue with the next step.
  2. Did the diskette test fail while testing with a new diskette?
    • No: Go to step 6.
    • Yes: Continue with the next step.
  3. Exchange the diskette drive and run the diskette test again.
  4. Did the diskette test fail again?
    • No: The original diskette drive was failing. This ends the procedure.
    • Yes: Continue with the next step.
  5. Continue exchanging CRUs from the CRU list and running tests. If one of the replaced CRUs fixes the problem, this ends the procedure. If you cannot resolve the problem, contact your next level of support for assistance.
  6. Did the original failure occur while writing to a diskette?
    • No: Go to step 8.
    • Yes: Continue with the next step.
  7. Try the original task again by using a new diskette.
    • If the failure occurs again, go to step 10.
    • If no failures occur, the original diskette was failing. This ends the procedure.
  8. Re-create the information on the diskette, or obtain a new diskette with the information.
  9. Try the original task again.
    • If the failure occurs again, continue with the next step.
    • If no failures occur, the original diskette was failing. This ends the procedure.
  10. Perform Dynamic System Analysis to test the diskette drive. For more information about Dynamic System Analysis, see Diagnostic tools.
    • If the tests fail, isolate the problem using the procedures found in the server hardware maintenance manual. For additional server maintenance information, see Equivalent maintenance information for the HMC server hardware to access the hardware maintenance manual for your HMC server model.
    • If the tests do not isolate the problem, contact your next level of support for assistance.

    This ends the procedure.

Testing the HMC display

Learn to test and diagnose HMC display problems.

About this task

To test for HMC display problems, complete the following steps:

Procedure

  1. Is the display type a 95xx (17P, 17X, 21P)?
    • No: Continue with the next step.
    • Yes: 95xx-xxx repairs might require replacing internal display CRUs.

      Repair and test the display using the procedures in Monitor Hardware Maintenance Manual Vol 2, S41G-3317.

  2. Is the display type a 65xx (P70, P200)?
    • No: Continue with the next step.
    • Yes: 65xx-xxx repairs might require replacing the entire display. There are no internal display CRUs. Repair and test the display using the procedures in Monitor Hardware Maintenance Manual Vol 3, P and G series, S52H-3679.

      When the test and repair are complete, continue with step 5.

  3. Is the display type a 65xx (P72, P202)?
    • No: Continue with the next step.
    • Yes: 65xx-xxx repairs might require replacing the entire display. There are no internal display CRUs. Repair and test the display using the procedures in Color Monitor Operating Instructions.

      When the test and repair are complete, continue with step 5.

  4. Repair and test the display using the documentation included with the display.
    When the test and repair are complete, continue with step 6.
  5. Verify the repair. Perform Dynamic System Analysis to test the display. For more information about Dynamic System Analysis, see Diagnostic tools.

    When the test and repair are complete, continue with step 6.

  6. Return the system to normal operations.
    This ends the procedure.
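
Steps 1 through 4 select a repair manual by display model. A minimal sketch of that selection as a lookup table follows. The names `DISPLAY_MANUALS` and `display_manual` are hypothetical; the model identifiers and manual titles are the ones given in the steps above.

```python
VOL2 = "Monitor Hardware Maintenance Manual Vol 2, S41G-3317"
VOL3 = "Monitor Hardware Maintenance Manual Vol 3, P and G series, S52H-3679"
COLOR = "Color Monitor Operating Instructions"
FALLBACK = "Documentation included with the display"

# Display models named in steps 1 to 3, mapped to the manual each step cites.
DISPLAY_MANUALS = {
    "17P": VOL2, "17X": VOL2, "21P": VOL2,  # 95xx displays
    "P70": VOL3, "P200": VOL3,              # 65xx displays
    "P72": COLOR, "P202": COLOR,            # 65xx displays
}


def display_manual(model: str) -> str:
    """Return the manual to use for a display model, or the step 4 fallback."""
    return DISPLAY_MANUALS.get(model.strip().upper(), FALLBACK)
```

Any model that is not listed falls through to the documentation included with the display, as in step 4.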