Use this procedure to perform serial attached SCSI (SAS)
fabric problem isolation.
Considerations:
- Power off the system, partition, or card slot before connecting
and disconnecting cables or devices, as appropriate, to prevent hardware
damage.
- Some systems have SAS, PCI-X, and PCIe bus interface logic integrated
onto the system boards and use a pluggable RAID enablement card (a
non-PCI form factor card) for these SAS, PCI-X, and PCIe buses. For
these configurations, replacement of the RAID enablement card is unlikely
to solve a SAS-related problem because the SAS interface logic is
on the system board.
- Some systems have the disk enclosure or removable media enclosure
integrated in the system with no cables. For these configurations
the SAS connections are integrated onto the system boards and a failed
connection can be the result of a failed system board or integrated
device enclosure.
- Some systems have SAS RAID adapters integrated onto the system
backplane and use a cache RAID and dual IOA enablement card to enable
storage adapter write cache and dual storage I/O adapter (IOA) mode.
For these configurations, replacement of the cache RAID and dual IOA
enablement card is unlikely to solve a SAS-related problem because
the SAS interface logic is on the system backplane.
Attention: When SAS fabric problems exist, obtain
assistance from your hardware service provider:
- When SAS fabric problems exist, do not replace RAID adapters without
assistance from your service provider. Because the adapter might contain
nonvolatile write cache data and configuration data for the attached
disk arrays, additional problems can be created by replacing an adapter.
- Follow appropriate service procedures when replacing the cache
RAID and dual IOA enablement card. Incorrect removal can result in
data loss or a non-dual storage IOA mode of operation.
- Do not remove functioning disk units in a disk array without assistance
from your service provider. A disk array might become unprotected
or might fail if functioning disk units are removed. The removal of
functioning disk units might also result in additional problems in
the disk array.
- Was the SRC xxxx3020 or SRC xxxx8130?
- No:
- Go to step 3.
- Yes:
- Go to step 2.
- Determine which of the following is the
cause of your specific error and take the appropriate actions listed.
The possible causes for SRC
xxxx3020 are:
- More devices are connected to the adapter than the adapter supports.
Change the configuration to reduce the number of devices below what
is supported by the adapter.
- A SAS device has been incorrectly moved from one location to another.
Either return the device to its original location or move the device
while the adapter is powered off.
- A SAS device has been incorrectly replaced by a SATA device. A
SAS device must be used to replace a SAS device.
The possible causes for SRC
xxxx8130 are:
- One or more SAS devices were moved from a PCIe2 or PCIe3 adapter
to a PCI-X or PCIe adapter. If the device was moved from a PCIe2 or
PCIe3 adapter to a PCI-X or PCIe adapter, the Detail Data section
of the hardware error log contains a reason for failure of Payload
CRC Error. For this case, the error can be ignored and the
problem is resolved if the devices are moved back to a PCIe2 or PCIe3
adapter or if the devices are formatted on the PCI-X or PCIe adapter.
- For all other causes, go to step 3.
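The SRC checks in steps 1 and 2 can be sketched in Python as a simple pattern match. This is only an illustration, assuming that each lowercase `x` in a pattern such as xxxx3020 stands for any single character. The function names and the sample SRC values are hypothetical and are not taken from any system tool.

```python
def src_matches(src: str, pattern: str) -> bool:
    """Return True if src fits pattern; a lowercase 'x' in pattern matches any character.

    Assumption: 'xxxx' in the documented SRCs is a four-character wildcard.
    """
    src = src.strip().upper()
    if len(src) != len(pattern):
        return False
    return all(p == "x" or p.upper() == s for p, s in zip(pattern, src))


def is_step_two_src(src: str) -> bool:
    """Step 1: is the SRC xxxx3020 or xxxx8130?"""
    return src_matches(src, "xxxx3020") or src_matches(src, "xxxx8130")


# Hypothetical sample values, for illustration only.
print(is_step_two_src("ABCD3020"))  # matches xxxx3020
print(is_step_two_src("ABCDFFFE"))  # matches neither pattern
```

A match sends you to step 2. Any other SRC continues at step 3.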
- Determine the status of the disk units
in the array by doing the following steps:
- Access the product activity log and display the SRC that sent
you here.
- Press the F9 key for address information. This is the adapter
address.
- Return to the SST or DST main menu.
- Select .
- On the Display disk configuration status screen, look for the
devices attached to the adapter that was identified.
Is there a device that has a status of RAID 5/Unknown, RAID 6/Unknown,
RAID 5/Failed, or RAID 6/Failed?
- No:
- Go to step 5.
- Yes:
- Go to step 4.
- Other errors related to the disk array having degraded
protection should have occurred. Take action on these
errors to replace the failed disk unit and restore the disk array
to a fully protected state. This ends the procedure.
- Have other errors occurred at the same
time as this error?
- No:
- Go to step 7.
- Yes:
- Go to step 6.
- Take action on the other errors that
have occurred at the same time as this error. This ends the procedure.
- Was the SRC xxxxFFFE?
- No:
- Go to step 10.
- Yes:
- Go to step 8.
- Check for the latest program temporary fixes (PTFs) for the device,
device enclosure, and adapter, and apply them. Did you find and apply
a PTF?
- No:
- Go to step 10.
- Yes:
- Go to step 9.
- This ends the procedure.
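The Yes/No branching in steps 1 through 9 can be summarized as a small lookup table. The Python sketch below is only a reading aid built from the step numbers above. The names `NEXT_STEP` and `next_step` are hypothetical, and steps that have no Yes/No question (2, 4, 6, and 9) are deliberately left out.

```python
# (step, answered_yes) -> next step, transcribed from the procedure text.
NEXT_STEP = {
    (1, True): 2, (1, False): 3,    # SRC xxxx3020 or xxxx8130?
    (3, True): 4, (3, False): 5,    # Any device RAID 5/6 Unknown or Failed?
    (5, True): 6, (5, False): 7,    # Other errors at the same time?
    (7, True): 8, (7, False): 10,   # SRC xxxxFFFE?
    (8, True): 9, (8, False): 10,   # Found and applied a PTF?
}


def next_step(step: int, answered_yes: bool) -> int:
    """Return the step to go to after answering the question in the given step."""
    try:
        return NEXT_STEP[(step, answered_yes)]
    except KeyError:
        raise ValueError(f"step {step} has no Yes/No question") from None


print(next_step(7, False))  # prints 10
```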
- Identify the adapter and adapter port
associated with the problem by examining the product activity log.
Perform the following steps:
- Access SST or DST.
- Access the product activity log and display the SRC that sent
you here. Record the adapter address and the adapter port by doing
one of the following:
- If the SRC is xxxxFFFE, press the F9 key for
address information. The adapter address is the bus, board, card
information. The port is shown in the I/O bus field. Convert the
port value from decimal to hexadecimal.
- Press the F9 key for address information. The adapter address
is the bus, board, card information. Then, press F12 to cancel and
return to the previous screen. Then press the F4 key to view the
additional information, if available. The adapter port is characters
1 and 2 of the unit address. For example, if the unit address is
123456FF, the port would be 12.
- Go to Hexadecimal
product activity log data to obtain the address information. The adapter
address is the bus, board, card information. The adapter port is
characters 1 and 2 of the unit address. For example, if the unit
address is 123456FF, the port would be 12.
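The two port lookups described in step 10 can be sketched in Python. This is a minimal illustration: the function names are hypothetical, and the two-digit, zero-padded, uppercase hexadecimal format is an assumption chosen to match the two-character port in the unit address example.

```python
def port_from_unit_address(unit_address: str) -> str:
    """Return the adapter port: characters 1 and 2 of the unit address."""
    address = unit_address.strip().upper()
    if len(address) < 2:
        raise ValueError("unit address must contain at least two characters")
    return address[:2]


def port_decimal_to_hex(port_decimal: int) -> str:
    """Convert the decimal I/O bus field value to hexadecimal (for SRC xxxxFFFE).

    Assumption: a two-digit, zero-padded, uppercase result is wanted.
    """
    if port_decimal < 0:
        raise ValueError("port value must not be negative")
    return format(port_decimal, "02X")


print(port_from_unit_address("123456FF"))  # prints 12, as in the example above
print(port_decimal_to_hex(18))             # prints 12 (decimal 18 is hexadecimal 12)
```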
- Perform the following steps:
- Select .
- Enter the adapter bus address and use the Associated
packaging resource(s) option to display the type, model,
and unit ID.
- Record the type, model, and unit ID of the enclosure in which
the adapter is located.
- Use the type, model, unit ID and adapter address to find the location
of the adapter (see Addresses to
find the location and then go to System FRU locations).
- The logical port number was identified in step 10. Logical port numbers are indicated
on the physical connector labels located on the tailstock of the adapter.
To locate the device or device enclosure that is experiencing the
problem, use the logical port number to determine the physical connector
to which the device or device enclosure is attached.
- Because the problem persists, some corrective
action is needed to resolve the problem. Proceed by doing the following:
Perform only one of the following corrective actions (listed
in the order of preference). If one of the corrective actions has
previously been attempted, proceed to the next one in the list.
- Does the problem still occur after performing the corrective
action?
- No: This ends the procedure.
- Yes: Go to step 12.