Use this procedure to complete serial attached SCSI (SAS) fabric problem
isolation.
Before you begin
Considerations:
- Power off the system, partition, or card slot before you connect and disconnect cables or
devices, as appropriate, to prevent hardware damage.
- Some systems have a disk enclosure or removable media enclosure that is integrated in the system
with no cables. For these configurations, the SAS connections are integrated onto the system boards
and a failed connection can be the result of a failed system board or integrated device
enclosure.
- Some systems have SAS RAID adapters that are integrated onto the system backplane and use a
cache RAID and dual IOA enablement card to enable storage adapter write cache and dual storage I/O
adapter (IOA) mode. For these configurations, replacement of the cache RAID and dual IOA enablement
card is unlikely to solve a SAS-related problem because the SAS interface logic is on the system
backplane.
Attention: When SAS fabric problems exist, obtain
assistance from your hardware service provider:
- When SAS fabric problems exist, do not replace RAID adapters without
assistance from your service provider. Because the adapter might contain
nonvolatile write cache data and configuration data for the attached
disk arrays, additional problems can be created by replacing an adapter.
- Follow appropriate service procedures when you replace the cache RAID and dual IOA enablement
card. Incorrect removal can result in data loss or a non-dual storage IOA mode of operation.
- Do not remove functioning disk units in a disk array without assistance
from your service provider. A disk array might become unprotected
or might fail if functioning disk units are removed. The removal of
functioning disk units might also result in additional problems in
the disk array.
Procedure
- Was the SRC xxxx3020 or SRC xxxx8130?
- No:
- Go to step 3.
- Yes:
- Go to step 2.
- Determine which of the following problems is the cause of your specific error and take the
appropriate actions listed.
The possible causes for SRC xxxx3020 are:
- More devices are connected to the adapter than the adapter supports. Change the configuration to
reduce the number of devices below what is supported by the adapter.
- A SAS device was incorrectly moved from one location to another. Either return the device to its
original location or move the device while the adapter is powered off.
- A SAS device was incorrectly replaced by a SATA device. A SAS device must be used to replace a
SAS device.
The possible causes for SRC xxxx8130 are:
- One or more SAS devices were moved from a PCIe3 adapter to a PCIe adapter. In this case, the
Detail Data section of the hardware error log shows Payload CRC Error as the reason for failure.
The error can be ignored, and the problem is resolved when the devices are moved back to a PCIe3
adapter or are formatted on the PCIe adapter.
- For all other causes, go to step 3.
- Determine the status of the disk units
in the array by doing the following steps:
- Access the product activity log and display the SRC that sent
you here.
- Press the F9 key for address information. This is the adapter
address.
- Return to the SST or DST main menu.
- Select .
- On the Display disk configuration status screen, look for the devices that are attached to the
adapter that was identified.
Is there a device that has a status of RAID 5/Unknown, RAID 6/Unknown,
RAID 5/Failed, or RAID 6/Failed?
- No:
- Go to step 5.
- Yes:
- Go to step 4.
- Other errors might have occurred that are related to the disk array having degraded protection.
Take action on these errors to replace the failed disk unit and restore the disk array to a fully
protected state. This ends the procedure.
- Have other errors occurred at the same
time as this error?
- No:
- Go to step 7.
- Yes:
- Go to step 6.
- Take action on the other errors that occurred at the same time as this error. This
ends the procedure.
- Was the SRC xxxxFFFE?
- No:
- Go to step 10.
- Yes:
- Go to step 8.
- Check for the latest PTFs for the device, device enclosure, and adapter, and apply them. If you
need assistance finding PTFs, contact your next level of support. Did you find and apply a
PTF?
- No:
- Go to step 10.
- Yes:
- Go to step 9.
- This ends the procedure.
- Identify the adapter and adapter port that is associated with the problem by examining the
product activity log. Perform the following steps:
- Access SST or DST.
- Access the product activity log and display the SRC that sent you here. Record the adapter
address and the adapter port by completing one of the following actions:
- If the SRC is xxxxFFFE, press the F9 key for address information. The adapter
address is the bus information. The port is shown in the I/O bus field. Convert the port value from
decimal to hexadecimal.
- Press the F9 key for address information. The adapter address is the bus information. Then,
press F12 to cancel and return to the previous screen. Then, press the F4 key to view the additional
information, if available. This information is the unit address. Go to SAS address and
physical location information and use the
unit address to determine the controller port.
- Go to Hexadecimal product activity log data to obtain the address information. The
adapter address is the bus information. The controller port is contained in the unit address. Go to
SAS address and physical location information and use the
unit address to determine the controller port.
- Perform the following steps:
- Select .
- Enter the adapter bus address and use the Associated
packaging resource(s) option to display the type, model,
and unit ID.
- Record the type, model, and unit ID of the enclosure in which
the adapter is located.
- Use the type, model, unit ID and adapter address to find the location
of the adapter (see Addresses to find the location
and then go to Part locations and location codes).
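Step 10 asks for the port value shown in the I/O bus field to be converted from decimal to hexadecimal when the SRC is xxxxFFFE. As a minimal sketch of that arithmetic only, the following Python snippet uses a hypothetical helper named port_to_hex. It is not part of any service tool, and it assumes that no zero padding is required in the result.

```python
def port_to_hex(port_decimal: int) -> str:
    """Return a decimal port value as uppercase hexadecimal, without a prefix.

    Hypothetical helper for illustration; padding is not applied.
    """
    if port_decimal < 0:
        raise ValueError("port value must be non-negative")
    return format(port_decimal, "X")


# Decimal 10 is hexadecimal A; decimal 16 is hexadecimal 10.
print(port_to_hex(10))  # A
print(port_to_hex(16))  # 10
```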
- The logical port number was identified in step 10. Logical
port numbers are indicated on the physical connector labels that are located on the tailstock of the
adapter. To locate the device or device enclosure that is experiencing the problem, use the logical
port number to determine the physical connector to which the device or device enclosure is
attached.
- Because the problem persists, some corrective action is needed to resolve the problem.
Perform only one of the following corrective actions (listed in the order of preference). If one
of the corrective actions was previously attempted, proceed to the next one in the list.
- Does the problem still occur after you completed the corrective action?
- No: This ends the procedure.
- Yes: Go to step 12.
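The yes/no questions in steps 1 through 8 can be summarized as a small routing function. The following Python sketch is only an illustration of the branching described in this procedure. The function name first_stopping_step, its parameters, the placeholder SRC prefix AAAA, and the status string Active are assumptions made for the example and are not part of any product interface.

```python
# Statuses named in step 3 that indicate degraded array protection.
DEGRADED_STATUSES = {
    "RAID 5/Unknown",
    "RAID 6/Unknown",
    "RAID 5/Failed",
    "RAID 6/Failed",
}


def first_stopping_step(src, device_statuses, other_errors_logged, ptf_applied=False):
    """Return the step number at which the questions in steps 1-8 stop.

    src                 -- the SRC as a string, for example "AAAA3020" (placeholder)
    device_statuses     -- statuses of the devices attached to the adapter
    other_errors_logged -- True if other errors occurred at the same time
    ptf_applied         -- True if a PTF was found and applied in step 8
    """
    suffix = src[-4:].upper()

    # Step 1: SRC xxxx3020 or xxxx8130 is handled by the cause lists in step 2.
    # Step 2 can still send you on to step 3 for causes that it does not list.
    if suffix in ("3020", "8130"):
        return 2

    # Step 3: a degraded device status leads to step 4.
    if any(status in DEGRADED_STATUSES for status in device_statuses):
        return 4

    # Step 5: other errors at the same time lead to step 6.
    if other_errors_logged:
        return 6

    # Steps 7 and 8: SRC xxxxFFFE with an applied PTF ends at step 9.
    if suffix == "FFFE" and ptf_applied:
        return 9

    # Everything else continues with adapter identification in step 10.
    return 10


print(first_stopping_step("AAAA3020", [], False))                  # 2
print(first_stopping_step("AAAA4100", ["RAID 5/Failed"], False))   # 4
print(first_stopping_step("AAAAFFFE", ["Active"], False))          # 10
```

The sketch stops at step 10 because the remaining steps depend on the list of corrective actions and on hands-on checks that cannot be reduced to a lookup.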