Use this procedure to perform Linux® problem analysis.
If you experience a problem with your Linux system or logical partition,
gather more information about the problem, either to solve it yourself or to
help your next level of support or your hardware service provider solve it
more quickly and accurately.
Keep the following questions in mind while troubleshooting Linux problems:
- Has there been an external power outage or momentary power loss?
- Has the hardware configuration changed?
- Has system software been added?
- Have any new programs or program updates been installed recently?
Check the following connections:
- Verify that the power cord is plugged in.
- Verify that your cables are attached securely.
Has your server ever been configured with one or more logical partitions?
Follow the procedure that matches your configuration.
The server has never been partitioned and there is no Hardware Management
Console (HMC) or Integrated Virtualization Manager
- Is the server turned on, or can you turn on your server?
- No: Go to step 2.
- Yes: Ensure that the server is turned on and then
go to step 4.
- Perform the following steps to verify that the server
is receiving power:
- If your server is protected by an emergency power off (EPO) circuit, check
that the EPO switch is not activated.
- If you have an uninterruptible power supply, verify that the cables are
correctly connected to the server, and that it is functioning correctly.
- When a good power source is connected to the server, one of the following
occurs:
- If you have a control panel, the Function/Data display on the control
(operator) panel is illuminated.
- If you do not have a control panel, the Bulk Power Controller system lights
are illuminated.
- Is the control (operator) panel illuminated?
- Yes: Start the server by pressing the power button
on the control (operator) panel, and then go to step 4.
Note: If the server stops with a reference code appearing in the Function/Data
display on the control (operator) panel, record the reference code and any
related information, and go to Reference codes list for customers. This ends
the procedure.
- No: There is a power problem. Verify that the power
source to the server is functioning correctly (for example, the wall outlet
is functioning correctly and the power cord is not damaged). If you cannot
find a problem with the power source, contact your next level of support or
your hardware service provider. This ends the procedure.
- Is the control (operator) panel displaying a reference
code?
- Yes: Continue with the next step.
- No: Go to step 9.
- Is the Attention light on the control (operator) panel illuminated?
- Yes: Go to step 9.
- No: Continue with the next step.
- Are any additional messages related to this problem (for example, that a
device is not available or is reporting errors) displayed on the system
console or sent to you in e-mail from the operating system?
- Yes: Continue with the next step.
- No: Contact your next level of support or your
hardware service provider.
- Record any additional message information that is available from
the control (operator) panel, attached displays, or e-mail from the operating
system.
- If the additional message information contains recovery instructions,
follow them.
Did this solve the problem?
- Yes: This ends the procedure.
- No: Continue with the next step.
- Is the operating system functioning?
- Yes: Continue with the next step.
- No: Perform the following steps:
- Use the Error/Event Logs in the Advanced System Management Interface (ASMI)
to obtain a list of error and event log entries. For details, see the
Displaying error and event logs topic.
- Continue with step 11.
- Run the eServer™ stand-alone diagnostics in Problem Determination mode. For
details, see Running the eServer stand-alone diagnostics from CD-ROM.
Record any service request number (SRN) information that is displayed or
available through e-mail. In Problem Determination mode, you are given the
option to test the resources that the diagnostic programs find in your server.
Check the list of available resources to make sure that every resource that
you know is installed is also available to be tested. If an installed resource
is not available to be tested, record any information that is available about
the missing resource, and check that it is installed correctly. If you cannot
correct the problem with the missing resource, replace it (contact your
service provider if necessary).
- Record any other reference codes that are displayed on the control
(operator) panel. See Collecting reference codes and system information for
details.
- Go to the Reference codes list for customers.
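The decision flow in steps 1 through 12 above can be sketched in code, which
may help when turning the procedure into a scripted checklist. The following
Python sketch is illustrative only: the Observations class, its field names,
and the action strings are hypothetical and are not part of any IBM tool. It
also simplifies the procedure by leaving out the case where the server stops
with a reference code during startup.

```python
from dataclasses import dataclass

# Hypothetical action labels; the wording is illustrative, not from an IBM tool.
POWER_PROBLEM = "Power problem: check the power source, then contact support"
CONTACT_SUPPORT = "Contact your next level of support or hardware service provider"
SOLVED = "Problem solved: the procedure ends"
CHECK_ASMI_LOGS = "Check the ASMI error/event logs, then record reference codes"
RUN_DIAGNOSTICS = "Run stand-alone diagnostics, then record reference codes"


@dataclass
class Observations:
    """Answers to the yes/no questions in the procedure (assumed field names)."""
    server_on: bool
    panel_illuminated: bool = True
    reference_code_displayed: bool = False
    attention_light_on: bool = False
    additional_messages: bool = False
    recovery_solved_problem: bool = False
    os_functioning: bool = True


def next_action(obs: Observations) -> str:
    """Return the action that the non-partitioned procedure leads to."""
    # Steps 1-3: a server that is off and shows no panel light has a power problem.
    if not obs.server_on and not obs.panel_illuminated:
        return POWER_PROBLEM

    # Step 4 (No -> step 9) and step 5 (Yes -> step 9).
    skip_to_step_9 = (not obs.reference_code_displayed) or obs.attention_light_on

    if not skip_to_step_9:
        # Step 6: without additional messages, escalate to support.
        if not obs.additional_messages:
            return CONTACT_SUPPORT
        # Steps 7-8: record the messages and follow any recovery instructions.
        if obs.recovery_solved_problem:
            return SOLVED

    # Step 9: a non-functioning operating system leads to the ASMI logs,
    # then to step 11.
    if not obs.os_functioning:
        return CHECK_ASMI_LOGS

    # Steps 10-12: run diagnostics, record codes, go to the reference codes list.
    return RUN_DIAGNOSTICS
```

For example, next_action(Observations(server_on=True, os_functioning=False))
returns the CHECK_ASMI_LOGS action, which matches the "No" branch of step 9.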
The server has been partitioned and there is an HMC or Integrated
Virtualization Manager
If you have an HMC, it must be attached and functioning correctly.
- Choose from the following options:
- If you have an HMC, ensure that you have performed the steps in Beginning
problem analysis. Return here if you are directed to do so.
- If you are using an Integrated Virtualization Manager,
continue with the next step.
- Can you start the server and at least one logical partition on your server?
- No: Go to step 3.
- Yes: Go to step 5.
- Perform the following steps to verify that the server
is receiving power:
- If your server is protected by an emergency power off (EPO) circuit, check
that the EPO switch is not activated.
- If you have an uninterruptible power supply, verify that the cables are
correctly connected to the server, and that it is functioning correctly.
- When a good power source is connected to the server, one of the following
occurs:
- If you have a control panel, the Function/Data display on the control
(operator) panel is illuminated.
- If you do not have a control panel, the Bulk Power Controller system lights
are illuminated.
- Is the control (operator) panel or Bulk Power Controller illuminated?
- No: There is a power problem. Verify that the power
source to the server is functioning correctly (for example, the wall outlet
is functioning correctly and the power cord is not damaged). If you cannot
find a problem with the power source, contact your next level of support or
your hardware service provider. This ends the procedure.
- Yes: Start the server.
Note: If the server stops with a reference code appearing in the Function/Data
display on the control (operator) panel, on the HMC, or on the Integrated
Virtualization Manager, record the reference code and any related information,
and go to the Reference codes list for customers for further information.
This ends the procedure.
- Is the server's control (operator) panel, HMC, or Integrated Virtualization
Manager displaying function 11?
Note: If you are using the control panel, use the increment or decrement
buttons to cycle through the functions to determine whether function 11 is
present. You can alternate between the function number and the data by
pressing Enter. For details, see Collecting reference codes and system
information.
- Yes: Go to step 10.
- No: Continue with the next step.
- Is the system attention light on?
- Yes: Go to step 10.
- No: Continue with the next step.
- Did you receive a message related to this problem, either through the mail
function or displayed on the HMC or Integrated Virtualization Manager?
- Yes: Continue with the next step.
- No: Contact your next level of support or your
hardware service provider.
- Record the additional message information on the problem reporting form.
For details, see Using the problem reporting forms. Then follow the
recovery instructions on the Additional Message Information display. Did this
solve the problem?
- Yes: This ends the procedure.
- No: Continue with the next step.
- Record any service request number (SRN) information that is displayed or
available through e-mail. If you do not have any SRN information, run the
eServer stand-alone diagnostics in Problem Determination mode and perform any
repair actions. For details, see Running the eServer stand-alone diagnostics
from CD-ROM.
- Perform the following:
- Record all the reference codes that are displayed on the control (operator)
panel, the HMC, or the Integrated Virtualization Manager. For details, see
Collecting reference codes and system information.
- Go to the Reference codes list for customers.
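Both procedures end by recording reference codes, SRNs, and message text
before you consult the reference codes list or contact support. One way to
keep that information together is sketched below. This is a minimal example
under stated assumptions: the ProblemRecord class, its fields, and the JSON
layout are hypothetical and do not correspond to the problem reporting forms
or to any IBM file format. The code value used in the usage note is a
placeholder, not a real reference code.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class ProblemRecord:
    """Hypothetical container for the information the procedure asks you to record."""
    description: str
    reference_codes: List[Dict[str, str]] = field(default_factory=list)
    srns: List[str] = field(default_factory=list)
    messages: List[str] = field(default_factory=list)

    def add_reference_code(self, code: str, source: str) -> None:
        """Store a code with where it was seen (panel, HMC, and so on)."""
        entry = {"code": code.strip(), "source": source}
        # Skip exact duplicates so repeated readings do not clutter the record.
        if entry not in self.reference_codes:
            self.reference_codes.append(entry)

    def to_json(self) -> str:
        """Serialize the record so it can be sent to support."""
        return json.dumps(asdict(self), indent=2)
```

A record might be filled in with
record = ProblemRecord("Logical partition does not start") followed by
record.add_reference_code("PLACEHOLDER-CODE", "control panel"), and then
passed to support as the output of record.to_json().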