Use this procedure to perform AIX® problem analysis.
If you experience a problem with your AIX server
or logical partition, gather more information about the problem, either to
solve it yourself or to help your next level of support or your
hardware service provider solve it more quickly and accurately.
Keep the following in mind while troubleshooting AIX server problems:
- Has there been an external power outage or momentary power loss?
- Has the hardware configuration changed?
- Has server software been added?
- Have any new programs or program updates been installed recently?
Check the following connections:
- Verify that the power cord is plugged in.
- Verify that all your cables are attached securely.
Has your server ever been configured with one or more logical
partitions?
The server has never been partitioned and there is no HMC or Integrated
Virtualization Manager
- Is the server turned on, or can you turn on your server?
- No: Go to step 2.
- Yes: Ensure that the server is turned on and then
go to step 4.
- Perform the following steps to verify that the
server is receiving power:
- If your server is protected by an emergency power off (EPO) circuit, check
that the EPO switch is not activated.
- If you have an uninterruptible power supply, verify that the cables are
correctly connected to the server, and that it is functioning correctly.
- When a good power source is connected to the server, one of the following
occurs:
- If you have a control panel, the Function/Data display on the control
(operator) panel is illuminated.
- If you do not have a control panel, the Bulk Power Controller system lights
are illuminated.
- Is the control (operator) panel illuminated?
- Yes: Start the server by pressing the power
button on the control (operator) panel, and then go to step 4.
Note: If the server stops with a reference code appearing in the Function/Data
display on the control (operator) panel, record the reference code and any
related information, and go to Reference codes list for customers.
This ends the procedure.
- No: There is a power problem. Verify that the power
source to the server is functioning correctly (for example, the wall outlet
is functioning correctly and the power cord is not damaged). If you cannot
find a problem with the power source, contact your next level of support or
your hardware service provider. This ends the procedure.
- Is the control (operator) panel blank?
- Yes: Go to step 9.
- No: Continue with the next step.
- Is the Attention light on the control (operator) panel illuminated?
- Yes: Go to step 9.
- No: Continue with the next step.
- Are any additional messages related to this problem displayed on
the system console or sent to you in e-mail from the operating system?
- Yes: Continue with the next step.
- No: Contact your next level of support or your
hardware service provider.
- Record any additional message information that is available from
the control (operator) panel, attached displays, or e-mail from the operating
system.
- If the additional message information contains recovery instructions,
follow these instructions.
Did this solve the problem?
- Yes: This ends the procedure.
- No: Continue with the next step.
- Is the operating system functioning?
- Yes: Continue with the next step.
- No: Perform the following steps:
- Obtain a list of error and event log entries from the ASMI Error/Event
Logs. For details, see the Displaying error and event logs topic.
- Continue with step 11.
- Record any SRN information that is displayed or available through
e-mail.
Note: If you have not found an SRN, you can display one from the
operating system. Perform the following steps to display previous
diagnostic results from online diagnostics in concurrent mode:
- Log in to the AIX operating system as root user, or use CE login. If
you need help, contact the system administrator.
- Enter the diag command to load the diagnostic controller,
and display the online diagnostic menus.
- At the Function selection menu, select Task
selection.
- From the Task selection list menu, select Display previous
diagnostic results.
- From the Previous diagnostic results menu, select Display
diagnostic log summary.
The Display diagnostic log screen appears with a time-ordered table of
events from the error log. Find the most recent entry that has an S in the
T column. Press Enter to select that row in the table and then select Commit.
The details of the entry are displayed; look for the SRN shown
near the end of the entry and record it.
- Record any other reference codes that are displayed on the control
(operator) panel. See Collecting reference codes and system information for details.
- Go to the Reference codes list for customers.
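
As an illustration only, the decision flow above can be sketched in Python. The sketch assumes the steps are numbered in the order they are listed (so step 4 is the blank-panel check, step 9 is the operating-system check, and step 11 is recording the reference codes). The names `Observations`, `Action`, and `next_action` are hypothetical; they are not part of AIX or of any IBM tool.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    """Possible outcomes of the non-partitioned procedure (illustrative names)."""
    POWER_PROBLEM = "verify the power source, then contact support"
    CONTACT_SUPPORT = "contact next level of support or hardware service provider"
    RESOLVED = "recovery instructions solved the problem"
    RECORD_SRN = "record the SRN and reference codes, then look them up"
    COLLECT_ASMI_LOGS = "collect ASMI error/event logs, record reference codes, look them up"


@dataclass
class Observations:
    """Answers gathered while walking the procedure (hypothetical structure)."""
    server_on: bool
    panel_illuminated: bool = True
    panel_blank: bool = False
    attention_light_on: bool = False
    messages_available: bool = False
    recovery_solved: bool = False
    os_functioning: bool = True


def next_action(obs: Observations) -> Action:
    # Steps 1-3: a server that is off and shows no panel light has a power problem.
    if not obs.server_on and not obs.panel_illuminated:
        return Action.POWER_PROBLEM
    # Steps 4-5: a blank panel or a lit Attention light jumps straight to step 9.
    if not (obs.panel_blank or obs.attention_light_on):
        # Step 6: without any messages there is nothing more to analyze locally.
        if not obs.messages_available:
            return Action.CONTACT_SUPPORT
        # Steps 7-8: record the messages and follow any recovery instructions.
        if obs.recovery_solved:
            return Action.RESOLVED
    # Step 9: the path depends on whether the operating system is functioning.
    if obs.os_functioning:
        return Action.RECORD_SRN          # steps 10-12
    return Action.COLLECT_ASMI_LOGS       # ASMI logs, then steps 11-12


if __name__ == "__main__":
    print(next_action(Observations(server_on=True, panel_blank=True, os_functioning=False)))
```

The sketch does not model a server that stops with a reference code while starting; in that case the procedure sends you directly to the Reference codes list for customers.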
The server has been partitioned and there is an HMC or an Integrated
Virtualization Manager.
If you have an HMC, it must be attached and functioning correctly.
- Choose from the following options:
- If you have an HMC, ensure you performed the steps in Beginning problem analysis.
Then return here if you are directed to do so.
- If you are using an Integrated Virtualization Manager,
continue with the next step.
- Can you start the server and at least one logical partition on your server?
- No: Go to step 3.
- Yes: Go to step 5.
- Perform the following steps to verify that the server is receiving
power:
- If your server is protected by an emergency power off (EPO) circuit, check
that the EPO switch is not activated.
- If you have an uninterruptible power supply, verify that the cables are
correctly connected to the server, and that it is functioning correctly.
- When a good power source is connected to the server, one of the following
occurs:
- If you have a control panel, the Function/Data display on the control
(operator) panel is illuminated.
- If you do not have a control panel, the Bulk Power Controller system lights
are illuminated.
- Is the control (operator) panel or Bulk Power Controller illuminated?
- No: There is a power problem. Verify that the power
source to the server is functioning correctly (for example, the wall outlet
is functioning correctly and the power cord is not damaged). If you cannot
find a problem with the power source, contact your next level of support or
your hardware service provider. This ends the procedure.
- Yes: Start the server.
Note: If the server stops with a reference code appearing in the Function/Data
display on the control (operator) panel, or on the HMC, or
on the Integrated Virtualization Manager, record
the reference code and any related information, and go to the Reference codes list for customers for
further information. This ends the procedure.
- Is the server's control (operator) panel, the HMC,
or Integrated Virtualization Manager displaying function 11?
Note: If you are using the control panel, use the increment or decrement
buttons to cycle through the functions to determine if function 11 exists.
You can alternate between the function number and the data by pressing Enter.
For details, see
Collecting reference codes and system information.
- Yes: Go to step 9.
- No: Continue with the next step.
- Is the system attention light on?
- Yes: Go to step 9.
- No: Continue with the next step.
- Did you receive a message related to this problem either through the mail
function or shown on the HMC or Integrated Virtualization Manager?
- Yes: Continue with the next step.
- No: Contact your next level of support or your
hardware service provider.
- Record the additional message information on the problem reporting form.
For details, see Using the problem reporting forms. Then follow the
recovery instructions on the Additional Message Information display. Did this
solve the problem?
- Yes: This ends the procedure.
- No: Continue with the next step.
- Perform the following:
- Record all the reference codes that you are receiving on the control (operator)
panel, the HMC, or the Integrated Virtualization Manager. For details, see Collecting reference codes and system information.
- Go to the Reference codes list for customers.
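
The Integrated Virtualization Manager path above can also be expressed as a small lookup table, shown here as a rough Python sketch. It assumes the steps are numbered in the order listed (step 2 is the "can you start the server" question and step 9 is recording the reference codes) and that a successful start in step 4 continues with step 5. `STEPS`, `walk`, and the answer keys are hypothetical names chosen for this example.

```python
# Each question step maps to (answer key, next step if yes, next step if no).
# A string in place of a step number is a terminal outcome.
STEPS = {
    2: ("can_start", 5, 3),
    4: ("power_indicator_lit", 5, "POWER_PROBLEM"),
    5: ("function_11_displayed", 9, 6),
    6: ("attention_light_on", 9, 7),
    7: ("message_received", 8, "CONTACT_SUPPORT"),
    8: ("recovery_solved", "DONE", 9),
}


def walk(answers, start=2):
    """Return the list of visited steps and the final outcome."""
    step = start
    visited = []
    while True:
        visited.append(step)
        if step == 3:
            # Step 3 is an action (verify power), not a question.
            step = 4
            continue
        if step == 9:
            # Step 9 records the reference codes and ends at the lookup list.
            return visited, "LOOK_UP_REFERENCE_CODES"
        key, on_yes, on_no = STEPS[step]
        nxt = on_yes if answers[key] else on_no
        if isinstance(nxt, str):
            return visited, nxt
        step = nxt


if __name__ == "__main__":
    path, outcome = walk({"can_start": False, "power_indicator_lit": False})
    print(path, outcome)
```

Keeping the branches in a table makes it easy to compare each entry against the "Go to step" references in the procedure text.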