IBM Support

Troubleshooting Management Module issues

Troubleshooting


Problem

This document describes how to troubleshoot management module problems on the IBM eServer BladeCenter.

Resolving The Problem

  1. To begin troubleshooting, check the following top issues. If your issue is listed, select the link; otherwise, proceed to step 2.

    Troubleshooting Management Module connectivity issues
    Troubleshooting "SP COMM" and "KERNEL MODE" errors
    Redundant management module is not failing over (8677)
    Blowers ramp up to full speed after upgrade of MM to dual or Redundant MM (8677)
    Power fault not detected in MM error log (BladeCenter T)
    Management module password cannot be reset
    Management module does not complete changeover to redundant module on hardware failure
    Management module error messages
    Service processor in MM reports a general monitor failure

    The management module (MM) stores all event and error information for the BladeCenter. Whether you are configuring a new BladeCenter, modifying the settings of an existing one, or trying to find the cause of a problem, the MM is always the starting point. In a new BladeCenter, the MM is the only accessible "switch" device in the chassis. Its TCP/IP address is known in advance, and it can be accessed with a Web browser.

    Use the management module in the BladeCenter to manage the BladeCenter and obtain vital system information about your installed blade servers. The management module communicates with the blade servers within the BladeCenter via an RS-485 intermanagement network. You can receive the status of, and control, all blade servers within the BladeCenter. You can shut down and restart any blade server from anywhere on the network, which helps save the time and cost of traveling to the actual installation. These manageability functions are provided through a self-contained Web page, creating an easy and familiar way for administrators to monitor, control, and maintain highly available BladeCenter installations.

    The RS-485 network relays vital information about individual blade servers, such as:

    • Temperature
    • Voltages
    • Power supply status
    • Memory status
    • Fan status
    • HDD status
    • Error and status log


    Enterprise/Entry Chassis MM

    Telco Chassis MM

    The MM functions as a system management processor (service processor) and, in the enterprise/entry chassis, as a keyboard/video/mouse (KVM) multiplexer for the blade servers. The telco chassis has a separate KVM module. The enterprise/entry MM has an Ethernet connection (located in the LAN module on the telco chassis) that enables the BladeCenter to be configured and managed from a LAN-attached management station. The management module also configures the BladeCenter and its modules, setting information such as the MM and switch IP addresses and Ethernet VLANs. The service processor in the management module communicates with the service processor in each blade server over an internal management bus to support blade server power-on requests; error and event reporting; requests for keyboard, mouse, and video; and requests for the diskette drive, CD-ROM drive, and USB port. The management module also communicates with the switch modules, power modules, blower modules, and blade servers to detect their presence or absence and any error conditions. On the enterprise/entry chassis, the settings of the MM can be reset through a pin recess at the bottom of the MM, below the keyboard connector.

    The rear of BladeCenter (Type 8677) provides bays for power, management modules, switch modules and blowers. An information panel is also present. The rear information panel has the same indicators as the front information panel.

    8677 rear view

    To maintain proper system cooling, all module bays must contain either a module or a filler module; all slots must contain either a blade or a filler module.

  2. Access the Management Module. The only way to access the MM is through its Ethernet ports. Every MM runs a small Web server and defaults to a known set of values on first boot. The default TCP/IP address is set only after a timeout period: by default, a new MM, or an MM reset to its default state, looks for a DHCP server to allocate a dynamic IP address. If no lease is granted, the internally programmed static address is set. This process can take up to four minutes. Assuming the MM is at the default address or the client knows the dynamic address, the client or SSR can log on. If you have changed the default ID and password, you must know the new values. The default values are:

  • External TCP/IP address is: 192.168.70.125
  • Default user ID is: USERID
  • Default password is: PASSW0RD
    Note: PASSW0RD contains the number zero, not the letter O.
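
Because the fallback to the static address can take up to four minutes, it can help to poll the default address rather than retry by hand. The following is a minimal sketch, not an IBM tool. It assumes the MM Web server answers on TCP port 80 (this document does not state the port) and that the workstation has an address on the same subnet as 192.168.70.125. The function and parameter names are illustrative.

```python
import socket
import time

DEFAULT_MM_ADDRESS = "192.168.70.125"  # default static address from this document
ASSUMED_HTTP_PORT = 80                 # assumption: the port is not stated in this document
FALLBACK_TIMEOUT_S = 240               # "up to four minutes" for the DHCP fallback


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_mm(host=DEFAULT_MM_ADDRESS, port=ASSUMED_HTTP_PORT,
                deadline_s=FALLBACK_TIMEOUT_S, interval_s=10,
                probe=tcp_probe, sleep=time.sleep, clock=time.monotonic):
    """Poll until the MM answers or the deadline passes.

    probe, sleep, and clock are injectable so that the loop can be
    exercised without a real chassis.
    """
    start = clock()
    while True:
        if probe(host, port):
            return True
        if clock() - start >= deadline_s:
            return False
        sleep(interval_s)
```

Calling `wait_for_mm()` with no arguments polls the default address every ten seconds for up to four minutes. A `False` result suggests checking the cabling and the workstation's own IP settings before assuming that the MM is faulty.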


This screen shows the logon window that is presented when a management workstation has successfully attached to the MM IP address:

logon screen

The Web interface embedded in the MM presents the user with screens similar to the ones used with the Remote Supervisor Adapter II. If the login is successful, the user is prompted to choose the session timeout value (in minutes) and press the "continue" button. Note that the default connection uses a non-secure authentication method, but the firmware of the Management Module does support SSL, and users can switch to the secure connection at their leisure. To enable the feature the first time, users are required to download a file from the IBM Support Web site. Up to twelve users can be created so that more than one administrator can connect.

Web Interface

Connecting to the Management Module requires:

  1. Internet Explorer 4.0 (SP1) or Netscape 4.72 or later (currently not Netscape 6)
  2. Java support
  3. JavaScript 1.2 support
  4. Minimum screen resolution of 800x600 with 256 colors


Assuming you are able to log on to the MM, the System Status screen appears by default. It presents a "snapshot" of the BladeCenter. Other screens display detailed information about the health of the BladeCenter and its components. A green circle indicates that all components are normal, a cross in a red box indicates a failure condition, and a yellow triangle indicates a non-critical problem. This screen and the three other "status" screens hold a wealth of information that helps to isolate the cause of problems on the BladeCenter. System Status screen:

system status screen

The system event log (SEL) is a critical part of problem handling. The SEL captures events from a number of sources. Each blade server has a local service processor that monitors and controls functions relevant to the local host (the blade server itself). If the local SP needs to report a fault, the fault information is passed through the midplane to the MM and stored centrally. Similarly, the switch modules in the BladeCenter can report status and fault information to the MM. When working with the SEL, it is possible to "filter" the log file to concentrate on just one blade server or one switch or to filter only critical errors. In this way, you can use the SEL to focus in on a potential source for the problem at hand.
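
The filtering idea can also be illustrated offline. The sketch below assumes that log entries have already been exported into simple records. The `LogEntry` fields, the severity names, and the sample entries are hypothetical and do not reflect the MM's actual log format.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LogEntry:
    source: str    # hypothetical label, e.g. "BLADE_03" or "SWITCH_1"
    severity: str  # hypothetical levels: "info", "warning", "critical"
    text: str


def filter_log(entries: Iterable[LogEntry],
               source: Optional[str] = None,
               severity: Optional[str] = None) -> List[LogEntry]:
    """Keep the entries that match every filter that was supplied."""
    return [
        e for e in entries
        if (source is None or e.source == source)
        and (severity is None or e.severity == severity)
    ]


# Made-up sample data, for illustration only.
sample = [
    LogEntry("BLADE_03", "critical", "example fault text"),
    LogEntry("BLADE_03", "info", "example status text"),
    LogEntry("SWITCH_1", "critical", "example fault text"),
]

blade_3_critical = filter_log(sample, source="BLADE_03", severity="critical")
```

Leaving a filter as `None` disables it, so the same function covers "one blade server", "critical errors only", or both at once.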

System Event Log

The hardware vital product data (VPD) shows what the MM has recognized as being installed in the BladeCenter. If a device is inserted into the BladeCenter and is not recognized by the MM, it cannot be used. This list acts as an inventory of what is physically present in the chassis. Discrepancies between this inventory and what is physically present may point to faulty hardware or a bad insertion of the device.
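
Comparing the two inventories is a set difference. The sketch below assumes that both lists have been written down by hand as bay labels. The labels and the function name are hypothetical, and nothing here reads data from the MM.

```python
def inventory_discrepancies(physically_installed, reported_by_mm):
    """Compare two collections of bay labels and report the differences."""
    installed = set(physically_installed)
    reported = set(reported_by_mm)
    return {
        # Present in the chassis but missing from the hardware VPD list.
        "not_recognized": sorted(installed - reported),
        # Listed in the hardware VPD but not found in the chassis.
        "reported_but_absent": sorted(reported - installed),
    }


# Made-up example: the blade in bay 2 is installed but not listed.
installed = ["blade bay 1", "blade bay 2", "switch bay 1"]
reported = ["blade bay 1", "switch bay 1"]
result = inventory_discrepancies(installed, reported)
```

Any label under `not_recognized` is a candidate for reseating, since the text above notes that a device the MM does not recognize cannot be used.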

Hardware VPD screen

The firmware VPD displays the BIOS and firmware levels of the chassis, switches, blades and management modules within the BladeCenter. There are always recommended minimum levels of code for all flashable devices. Always check the IBM support Web site for the recommended levels of code. It may not be possible to work with a client’s problem if firmware is down-level (or the problem may be caused by down-level firmware).
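
A down-level check can be sketched as a comparison against a list of recommended minimums. The example assumes that levels can be written as dotted numeric strings, which may not match the build identifiers that IBM actually uses. The component names and level values are made up.

```python
def parse_level(level):
    """Turn a dotted numeric string such as "1.20" into a tuple of ints."""
    return tuple(int(part) for part in level.split("."))


def down_level(installed, recommended_minimum):
    """Return the names of components whose level is below the minimum."""
    return sorted(
        name for name, level in installed.items()
        if name in recommended_minimum
        and parse_level(level) < parse_level(recommended_minimum[name])
    )


# Made-up levels, for illustration only.
installed_levels = {"management module": "1.9", "blade 1 BIOS": "1.10"}
minimum_levels = {"management module": "1.10", "blade 1 BIOS": "1.10"}
needs_update = down_level(installed_levels, minimum_levels)
```

The levels are compared as tuples of integers rather than as strings, because a plain string comparison would rank "1.9" above "1.10".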

Firmware VPD screen

  3. If problems remain, try resetting the Management Module to factory defaults. Many Management Module failures are resolved by simply resetting the module to factory defaults. Perform this task before replacing an MM. See the section titled "Management module IP reset button".

  4. Check lightpath diagnostics for your BladeCenter:

    BladeCenter (Type 8677)
    BladeCenter JS20
    BladeCenter HS20
    BladeCenter HS40
    BladeCenter T (Type 8720, 8730)


  5. If a new option has been added and your system is not working, complete the following procedure before continuing:
    1. Remove the option that you just added.
    2. Run the diagnostic tests to determine if your system is running correctly.
    3. Reinstall the new device.
    4. See Management module removal and installation movie.

  6. Make sure that the BladeCenter unit has the latest level of firmware code installed.

  7. If these steps have not solved your problem, refer to "Need more help?"

The management module password cannot be reset (all models)

If you forget the management-module password, you will not be able to access the BladeCenter management module. The management-module password cannot be overridden, and the management module will need to be replaced.


Management module does not complete changeover to redundant module on hardware failure (all models)

Replace the management module.


Management module error messages (all models)

See the system service part numbers if part replacement is needed.

Message: Application posted alert to ASM
Action: The alert button on the Web interface was tested. Information only. Take action as required.

Message: System log 75% full
Action: Information only. Take action as required.

Message: System log full
Action: Information only. Take action as required.

Message: Management module network initialization complete
Action: Information only. Take action as required.

Message: Remote login successful. Login ID
Action: Information only. Take action as required.

Message: ASM reset was caused by restoring default values
Action: The management-module assembly was reset after restoring the default settings. Information only. Take action as required.

Message: ASM reset was initiated by the user
Action: Information only. Take action as required.

Message: Pushbutton reset activated: Ethernet configuration reset to default values and MM ASM reset due to watchdog timeout
Action:
  1. Reseat the management module.
  2. Reflash the management-module firmware.
  3. Replace the management module.

Message: ASM reset due to XXXXX, instruction fault: XXXXXXXX YYYYYYYY ZZZZZZ
Action:
  1. Reseat the management module.
  2. Reflash the management-module firmware.
  3. Replace the management module.

Message: ASM reset reason unknown
Action: Information only.

Message: Possible ASM reset occurred reason unknown
Action: Information only.

Message: Remote access attempt failed. Invalid userid or password received. User is XXX from CMD mode client at IP@=XXX.XXX.XXX.XXX
Action: Failed attempt to log into the management module.

Message: Remote access attempt failed. Invalid userid or password received. User is XXX from WEB browser IP@=XXX.XXX.XXX.XXX
Action: Failed attempt to log into the management module.

Message: DHCP [X] failure, no IP @ assigned (retry X), rc=X
Action: Failed to get IP address by DHCP server. Check the DHCP server connection and settings.

Message: LAN: Command mode tamper triggered. Possible break in attempt.
Action: Unsuccessful attempt to access the management module in command mode. Information only. Take action as required.

Message: LAN: WEB server tamper delay triggered. Possible break in attempt.
Action: Unsuccessful attempt to access the management module in command mode. Information only. Take action as required.

Message: System log cleared.
Action: Information only. Take action as required.


Service processor in the management module reports a general monitor failure (all models)

Disconnect the BladeCenter unit from all electrical sources, wait for 30 seconds, reconnect the BladeCenter unit to the electrical sources, and restart the server. If a problem remains, replace the management module.


Need more help?
Please select one of the following options for further assistance:

Support forums
Submit a technical question
Before you call IBM Service

 

Document Location

Worldwide

Operating System

BladeCenter:Operating system independent / None

System x Hardware Options:Operating system independent / None

Document Information

Modified date:
29 January 2019

UID

ibm1MIGR-58898