Use this MAP to perform SAS fabric problem isolation.
If you need additional information for failing part numbers, location
codes, or removal and replacement procedures, see Part locations and
location codes. Select your machine type and model number to find
additional location codes, part numbers, or replacement procedures for
your system.
Considerations:
- Remove power from the system before connecting and disconnecting
cables or devices, as appropriate, to prevent hardware damage or erroneous
diagnostic results.
- Some systems have the disk enclosure or removable media enclosure
integrated in the system with no cables. For these configurations,
the SAS connections are integrated onto the system boards. A failed
connection can be the result of a failed system board or integrated
device enclosure.
- Some systems have SAS RAID adapters integrated onto the system
boards and use a Cache RAID - dual IOA enablement card (for example,
FC5662) to enable storage adapter write cache and dual storage I/O
adapter (IOA) mode (HA RAID mode). For these configurations, replacement
of the Cache RAID - dual IOA enablement card is unlikely to solve
a SAS-related problem because the SAS interface logic is on the system
board. Additionally, appropriate service procedures must be followed
when replacing the Cache RAID - dual IOA enablement card because removal
of this card can cause data loss if incorrectly performed and can
also result in a nondual storage IOA (non-HA) mode of operation.
Attention: When SAS fabric problems exist, obtain assistance from your
hardware service provider before performing any of the following
actions:
- Before you replace a RAID adapter: Because the adapter might contain
nonvolatile write cache data and configuration data for the attached
disk arrays, additional problems can be created by replacing an adapter.
- Before you remove functioning disks in a disk array: The disk
array might become degraded or failed and additional problems might
be created if functioning disks are removed from a disk array.
Attention: Do not remove functioning disks
in a disk array without assistance from your hardware service support
organization. A disk array might become degraded or might fail if
functioning disks are removed, and additional problems might be created.
Step 4150-2
The possible causes are:
- More devices are connected to the adapter than the adapter supports.
Change the configuration to the allowable number of devices.
- A SAS device has been improperly moved from one location to another.
Either return the device to its original location or move the device
while the adapter is powered off or unconfigured.
- A SAS device has been improperly replaced by a SATA device. A
SAS device must be used to replace a SAS device.
When the problem is resolved, see the removal and replacement
procedures topic for the system unit on which you are working and
do the "Verifying the repair" procedure.
Step 4150-3
Determine if any of the disk arrays on the adapter are in a Degraded
state as follows:
- Start the IBM® SAS Disk Array Manager:
  - Start Diagnostics and select Task Selection on the Function
    Selection display.
  - Select RAID Array Manager.
  - Select IBM SAS Disk Array Manager.
- Select List SAS Disk Array Configuration.
- Select the IBM SAS RAID Controller identified in the hardware error
  log.
Does any disk array have a state of Degraded?
- No: Go to Step 4150-5.
- Yes: Go to Step 4150-4.
Step 4150-4
Other errors might have occurred related to the disk array being in a
Degraded state. Take action on these errors to replace the failed disk
and restore the disk array to an Optimal state.
When the problem is resolved, see the removal and replacement
procedures topic for the system unit on which you are working and
do the "Verifying the repair" procedure.
Step 4150-5
Have other errors occurred at the same time as this error?
- No: Go to Step 4150-7.
- Yes: Go to Step 4150-6.
Step 4150-6
Take action on the other errors that have occurred at the same time as
this error.
When the problem is resolved, see the removal and replacement
procedures topic for the system unit on which you are working and
do the "Verifying the repair" procedure.
Step 4150-8
Ensure device, device enclosure, and adapter microcode levels are up to
date.
Did you update to newer microcode levels?
- No: Go to Step 4150-10.
- Yes: Go to Step 4150-9.
Step 4150-9
When the problem is resolved, see the removal and replacement
procedures topic for the system unit on which you are working and
do the "Verifying the repair" procedure.
Step 4150-11
Identify the adapter SAS port that is associated with the problem by
examining the hardware error log. The hardware error log can be viewed
as follows:
- Follow the steps in Examining the hardware error log and
return here.
- Select the hardware error log to view. In the hardware error log
under the Disk Information heading, the Resource field identifies
the controller port with which the error is associated.
Note: If you do not see the Disk Information heading in the error log,
obtain the Resource field from the Detail Data / PROBLEM DATA section
as illustrated in the following example:
Detail Data
PROBLEM DATA
0000 0800 0004 FFFF 0000 0000 0000 0000 0000 0000 1910 00F0 0408 0100 0101 0000
          ^
          |
          Resource is 0004FFFF
Go to Step 4150-12.
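When the Resource has to be read out of the PROBLEM DATA line by hand, a small helper can reduce transcription mistakes. The following is a minimal sketch, assuming the Resource occupies the third and fourth 16-bit words of the line, as in the example above. The function name `resource_from_problem_data` is hypothetical and is not part of any IBM tool.

```python
def resource_from_problem_data(problem_data: str) -> str:
    """Return the 8-hex-digit Resource from a PROBLEM DATA line.

    Assumption: the Resource is the third and fourth 16-bit words
    (bytes 4 through 7), matching the example in this step.
    """
    words = problem_data.split()
    if len(words) < 4:
        raise ValueError("PROBLEM DATA is too short to contain a Resource")
    resource = (words[2] + words[3]).upper()
    if len(resource) != 8:
        raise ValueError("expected two 4-digit hexadecimal words")
    int(resource, 16)  # raises ValueError if the text is not hexadecimal
    return resource


line = ("0000 0800 0004 FFFF 0000 0000 0000 0000 "
        "0000 0000 1910 00F0 0408 0100 0101 0000")
print(resource_from_problem_data(line))
```

For the example line, the helper returns the same value that the caret marks in the error log excerpt.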
Step 4150-12
Using the resource found in the previous step, see SAS resource
locations to understand how to identify the port of the controller to
which the device, or device enclosure, is attached.
For example, if the resource were equal to 0004FFFF, port 04 on the
adapter is used to attach the device or device enclosure that is
experiencing the problem.
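The port lookup in the example can be sketched in a few lines. This sketch assumes that the port is the second byte of the 8-digit resource, which is what the 0004FFFF example suggests; the SAS resource locations topic remains the authoritative description of the layout. The function name `port_from_resource` is hypothetical.

```python
def port_from_resource(resource: str) -> str:
    """Return the two-digit adapter port encoded in a resource value.

    Assumption: the port is the second byte of the resource, so that
    0004FFFF maps to port 04 as in the example in this step.
    """
    if len(resource) != 8:
        raise ValueError("resource must be 8 hexadecimal digits")
    int(resource, 16)  # raises ValueError if the text is not hexadecimal
    return resource[2:4].upper()


print(port_from_resource("0004FFFF"))
```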
The resource found in the previous step can also be used to identify
the device. To identify the device, try to match the resource with one
of the resources shown on the display that you reach by performing
the following steps.
- Start the IBM SAS Disk Array Manager:
  - Start the diagnostics program and select Task Selection from
    the Function Selection display.
  - Select RAID Array Manager.
  - Select IBM SAS Disk Array Manager.
- Select Diagnostics and Recovery Options.
- Select Show SAS Controller Physical Resources.
- Select Show Physical Resource Locations.
Step 4150-13
Because the problem persists, some corrective action is needed to
resolve the problem. Using the port or device information found in the
previous step, proceed by doing the following steps.
- Power off the system or logical partition.
- Perform only one of the following corrective actions, which are
listed in the order of preference. If one of the corrective actions
has previously been attempted, proceed to the next one in the list.
Note: Before replacing parts, consider using a complete power down
of the entire system, including any external device enclosures, to
provide a reset of all possible failing components. This action might
correct the problem without replacing parts.
- Power on the system or logical partition.
Note: In some situations, it might be acceptable
to unconfigure and reconfigure the adapter instead of powering off
and powering on the system or logical partition.
Step 4150-14
Does the problem still occur after performing the corrective action?
- No: Go to Step 4150-15.
- Yes: Go to Step 4150-13.
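The loop between Step 4150-13 and Step 4150-14 (power off, try one corrective action, power on, check again) can be sketched as follows. This is an illustration only: the action names are placeholders because the actual list of corrective actions depends on the configuration, and every callable here is a hypothetical stand-in rather than a real service command.

```python
from typing import Callable, Optional, Sequence, Tuple

Action = Tuple[str, Callable[[], None]]


def run_corrective_loop(
    actions: Sequence[Action],
    power_off: Callable[[], None],
    power_on: Callable[[], None],
    problem_persists: Callable[[], bool],
) -> Optional[str]:
    """Try the actions in order of preference, one per power cycle.

    Returns the name of the action after which the problem no longer
    occurred, or None if every action was tried without success.
    """
    for name, perform in actions:
        power_off()
        perform()  # only one corrective action per pass
        power_on()
        if not problem_persists():
            return name
    return None


# Usage with stand-in callables (all hypothetical).
log = []
state = {"fixed": False}
actions = [
    ("placeholder action 1", lambda: log.append("action 1")),
    ("placeholder action 2", lambda: state.update(fixed=True)),
]
result = run_corrective_loop(
    actions,
    power_off=lambda: log.append("off"),
    power_on=lambda: log.append("on"),
    problem_persists=lambda: not state["fixed"],
)
print(result)
```

Keeping the actions in an ordered sequence mirrors the instruction to perform only one action at a time and to move to the next one only if the problem still occurs.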
Step 4150-15
When the problem is resolved, see the removal and replacement
procedures topic for the system unit on which you are working and
do the "Verifying the repair" procedure.
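The yes/no branches in this MAP can be summarized as a lookup table, which might help when tracking progress through the procedure. The sketch below covers only the decision steps that appear in this section; the names `BRANCHES` and `next_step` are hypothetical.

```python
# Decision steps from this section and where each answer leads.
BRANCHES = {
    "4150-3": {"no": "4150-5", "yes": "4150-4"},     # any array Degraded?
    "4150-5": {"no": "4150-7", "yes": "4150-6"},     # other errors at same time?
    "4150-8": {"no": "4150-10", "yes": "4150-9"},    # updated microcode?
    "4150-14": {"no": "4150-15", "yes": "4150-13"},  # problem still occurs?
}


def next_step(step: str, answer: str) -> str:
    """Return the step to go to after answering a decision step."""
    try:
        return BRANCHES[step][answer.strip().lower()]
    except KeyError:
        raise ValueError(f"no branch for step {step!r}, answer {answer!r}")


print(next_step("4150-3", "Yes"))
```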