MAP 4050

Use this MAP to perform SAS fabric problem isolation.

For more information about failing part numbers, location codes, or removal and replacement procedures, see Part locations and location codes. Select your machine type and model number to see applicable procedures for your system.

Considerations:

  • Remove power from the system before connecting and disconnecting cables or devices, as appropriate, to prevent hardware damage or erroneous diagnostic results.
  • Some systems have SAS and PCI-X or PCIe bus interface logic integrated onto the system boards and use a pluggable RAID enablement card (a non-PCI form factor card) for these integrated-logic buses. See the feature comparison tables for PCIe and PCI-X cards. For these configurations, replacement of the RAID enablement card is unlikely to solve a SAS-related problem because the SAS interface logic is on the system board.
  • Some systems have the disk enclosure or removable media enclosure integrated in the system with no cables. For these configurations, the SAS connections are integrated onto the system boards. A failed connection can be the result of a failed system board or integrated device enclosure.
Attention: When SAS fabric problems exist, obtain assistance from your hardware service provider before performing any of the following actions:
  • Replacing a RAID adapter: The adapter might contain nonvolatile write cache data and configuration data for the attached disk arrays, so replacing it can create additional problems.
  • Removing functioning disks from a disk array: The disk array might become degraded or might fail, and additional problems might be created.
Attention: Do not remove functioning disks in a disk array without assistance from your hardware service support organization. A disk array might become degraded or might fail if functioning disks are removed, and additional problems might be created.

Step 4050-1

Was the SRN nnnn-3020?

No
Go to Step 4050-3.
Yes
Go to Step 4050-2.
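
The SRN checks in this MAP (nnnn-3020 here, nnnn-FFFE in Step 4050-7) depend only on the part of the SRN after the hyphen. The following Python sketch shows one way to make that comparison. The function name srn_has_suffix, the sample SRN values, and the assumption that nnnn stands for any four alphanumeric characters are illustrative only and are not taken from the diagnostics software.

```python
import re

# Assumption: "nnnn" is a placeholder for any four alphanumeric characters.
_SRN_PATTERN = re.compile(r"^([0-9A-Za-z]{4})-([0-9A-Za-z]{4})$")


def srn_has_suffix(srn: str, suffix: str) -> bool:
    """Return True if srn has the form nnnn-<suffix> (case-insensitive)."""
    match = _SRN_PATTERN.match(srn.strip())
    if match is None:
        return False
    return match.group(2).upper() == suffix.upper()


# Hypothetical SRN values, used only to exercise the helper.
print(srn_has_suffix("ABCD-3020", "3020"))  # True: continue at Step 4050-2
print(srn_has_suffix("ABCD-FFFE", "3020"))  # False: continue at Step 4050-3
```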

Step 4050-2

The possible causes are:

  • More devices are connected to the adapter than the adapter supports. Change the configuration to the allowable number of devices.
  • A SAS device has been improperly moved from one location to another. Either return the device to its original location or move the device while the adapter is powered off or unconfigured.
  • A SAS device has been improperly replaced by a SATA device. A SAS device must be used to replace a SAS device.

When the problem is resolved, see the removal and replacement procedures topic for the system unit on which you are working and do the "Verifying the repair" procedure.

Step 4050-3

Determine if any of the disk arrays on the adapter are in a Degraded state as follows:

  1. Start the IBM® SAS Disk Array Manager.
    1. Start Diagnostics and select Task Selection on the Function Selection display.
    2. Select RAID Array Manager.
    3. Select IBM SAS Disk Array Manager.
  2. Select List SAS Disk Array Configuration.
  3. Select the IBM SAS RAID Controller identified in the hardware error log.

Does any disk array have a state of Degraded?

No
Go to Step 4050-5.
Yes
Go to Step 4050-4.

Step 4050-4

Other errors might have occurred related to the disk array being in a Degraded state. Take action on these errors to replace the failed disk and restore the disk array to an Optimal state.

When the problem is resolved, see the removal and replacement procedures topic for the system unit on which you are working and do the "Verifying the repair" procedure.

Step 4050-5

Have other errors occurred at the same time as this error?

No
Go to Step 4050-7.
Yes
Go to Step 4050-6.

Step 4050-6

Take action on the other errors that have occurred at the same time as this error.

When the problem is resolved, see the removal and replacement procedures topic for the system unit on which you are working and do the "Verifying the repair" procedure.

Step 4050-7

Was the SRN nnnn-FFFE?

No
Go to Step 4050-10.
Yes
Go to Step 4050-8.

Step 4050-8

Ensure that the device, device enclosure, and adapter microcode levels are up to date.

Did you update to newer microcode levels?

No
Go to Step 4050-10.
Yes
Go to Step 4050-9.

Step 4050-9

When the problem is resolved, see the removal and replacement procedures topic for the system unit on which you are working and do the "Verifying the repair" procedure.

Step 4050-10

Is the problem in a disk expansion unit?

No
Go to SAS fabric identification.
Yes
Go to Step 4050-11.
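
The yes/no questions in Steps 4050-1 through 4050-10 form a small decision tree. The following Python sketch restates that branching so the order of the checks is easy to see. It is only a reading aid under stated assumptions: the Observations class, its field names, and the next_action function are hypothetical and do not correspond to any diagnostics tool.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Hypothetical record of the answers gathered in Steps 4050-1 to 4050-10."""
    srn: str                  # SRN in the form nnnn-xxxx
    array_degraded: bool      # Step 4050-3: is any disk array Degraded?
    other_errors: bool        # Step 4050-5: did other errors occur at the same time?
    microcode_updated: bool   # Step 4050-8: did you update to newer microcode?
    in_expansion_unit: bool   # Step 4050-10: is the problem in a disk expansion unit?


def next_action(obs: Observations) -> str:
    """Return the step or topic that the answers lead to."""
    suffix = obs.srn.rsplit("-", 1)[-1].upper()
    if suffix == "3020":                           # Step 4050-1
        return "Step 4050-2"
    if obs.array_degraded:                         # Step 4050-3
        return "Step 4050-4"
    if obs.other_errors:                           # Step 4050-5
        return "Step 4050-6"
    if suffix == "FFFE" and obs.microcode_updated: # Steps 4050-7 and 4050-8
        return "Step 4050-9"
    if obs.in_expansion_unit:                      # Step 4050-10
        return "Step 4050-11"
    return "SAS fabric identification"
```

The checks are ordered exactly as the steps are numbered, so an earlier "Yes" answer always takes precedence over a later one.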

Step 4050-11

Identify the adapter SAS port that is associated with the problem by examining the hardware error log. The hardware error log can be viewed as follows:

  1. Follow the steps in Examining the hardware error log and return here.
  2. Select the hardware error log to view. In the hardware error log under the Disk Information heading, the Resource field can be used to identify which controller port the error is associated with.
Note: If you do not see the Disk Information heading in the error log, obtain the Resource field from the Detail Data / PROBLEM DATA section as illustrated in the following example:
Detail Data
PROBLEM DATA
0000 0800 0004 FFFF 0000 0000 0000 0000 0000 0000 1910 00F0 0408 0100 0101 0000
          ^
          |
      Resource is 0004FFFF
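
As a rough illustration of the example above, the following Python sketch pulls the Resource field out of a PROBLEM DATA line. It assumes that the field is always the third and fourth 4-digit groups, which is what the example shows; that layout is not confirmed for other error log formats. The function name resource_from_problem_data is hypothetical.

```python
import string


def resource_from_problem_data(problem_data: str) -> str:
    """Return the Resource field from a PROBLEM DATA hex dump line.

    Assumption: the field is the third and fourth whitespace-separated
    groups, as in the example ("0004" followed by "FFFF").
    """
    groups = problem_data.split()
    if len(groups) < 4:
        raise ValueError("PROBLEM DATA is too short to contain a Resource field")
    resource = (groups[2] + groups[3]).upper()
    if len(resource) != 8 or any(c not in string.hexdigits for c in resource):
        raise ValueError("Resource field is not 8 hexadecimal digits")
    return resource


line = "0000 0800 0004 FFFF 0000 0000 0000 0000 0000 0000 1910 00F0 0408 0100 0101 0000"
print(resource_from_problem_data(line))  # prints 0004FFFF
```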
Go to Step 4050-12.

Step 4050-12

Using the resource found in the previous step, see SAS resource locations to understand how to identify the port of the controller to which the device, or device enclosure, is attached.

For example, if the resource were equal to 0004FFFF, port 04 on the adapter is used to attach the device or device enclosure that is experiencing the problem.
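
The example can be expressed as a short Python sketch. It assumes that the adapter port is the second byte of the 8-digit resource (hexadecimal digits 3 and 4), which matches the 0004FFFF example but is otherwise an assumption; SAS resource locations remains the authoritative description of the format. The function name controller_port is hypothetical.

```python
import string


def controller_port(resource: str) -> int:
    """Return the adapter port encoded in an 8-digit hexadecimal resource.

    Assumption: the port is the second byte, consistent with the example
    in which resource 0004FFFF corresponds to port 04.
    """
    if len(resource) != 8 or any(c not in string.hexdigits for c in resource):
        raise ValueError("expected 8 hexadecimal digits, for example 0004FFFF")
    return int(resource[2:4], 16)


port = controller_port("0004FFFF")
print(f"port {port:02X}")  # prints: port 04
```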

The resource found in the previous step can also be used to identify the device. To do so, try to match the resource with one that is shown on the display that appears after you perform the following steps:
  1. Start the IBM SAS Disk Array Manager:
    1. Start the diagnostics program and select Task Selection from the Function Selection display.
    2. Select RAID Array Manager.
    3. Select IBM SAS Disk Array Manager.
  2. Select Diagnostics and Recovery Options.
  3. Select Show SAS Controller Physical Resources.
  4. Select Show Physical Resource Locations.

Step 4050-13

Because the problem persists, corrective action is needed. Using the port or device information found in the previous step, do the following steps.

  1. Power off the system or logical partition.
  2. Perform only one of the following corrective actions, which are listed in the order of preference. If one of the corrective actions has previously been attempted, proceed to the next one in the list.
    Note: Before replacing parts, consider using a complete power off of the entire system, including any external device enclosures, to provide a reset of all possible failing components. This action might correct the problem without replacing parts.
    • Reseat cables on adapter and device enclosure.
    • Replace cable from adapter to device enclosure.
    • Replace the device.
      Note: If there are multiple devices with a path that is not Operational, the problem is not likely to be with a device.
    • Replace the internal device enclosure or see the service documentation for an external expansion unit.
    • Replace the adapter.
    • Contact your hardware service provider.
  3. Power on the system or logical partition.
    Note: In some situations, it might be acceptable to unconfigure and reconfigure the adapter instead of powering off and powering on the system or logical partition.
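
The rule in item 2 (perform only one action per pass, in order of preference, skipping any that were already attempted) can be sketched in Python as follows. The list simply restates the actions above; the names CORRECTIVE_ACTIONS and next_corrective_action are hypothetical and serve only to illustrate how to track attempts across repeated passes through this step.

```python
# The corrective actions from this step, in their listed order of preference.
CORRECTIVE_ACTIONS = [
    "Reseat cables on adapter and device enclosure",
    "Replace cable from adapter to device enclosure",
    "Replace the device",
    "Replace the internal device enclosure or see the service "
    "documentation for an external expansion unit",
    "Replace the adapter",
    "Contact your hardware service provider",
]


def next_corrective_action(attempted):
    """Return the first listed action not yet attempted, or None if all were."""
    for action in CORRECTIVE_ACTIONS:
        if action not in attempted:
            return action
    return None


# Example: the cables were already reseated on an earlier pass.
done = {"Reseat cables on adapter and device enclosure"}
print(next_corrective_action(done))  # prints: Replace cable from adapter to device enclosure
```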

Step 4050-14

Does the problem still occur after performing the corrective action?

No
Go to Step 4050-15.
Yes
Go to Step 4050-13.

Step 4050-15

When the problem is resolved, see the removal and replacement procedures topic for the system unit on which you are working and do the "Verifying the repair" procedure.




Last updated: Wed, June 19, 2019