POWER7 information

SIP3152

Use this procedure to resolve possible failed connection problems.

If you need additional information for failing part numbers, location codes, or removal and replacement procedures, see Part locations and location codes. Select your machine type and model number to find additional location codes, part numbers, or replacement procedures for your system.

This procedure is used to resolve the following problems:

The possible causes are:
  • A failed connection caused by a failing component in the serial-attached SCSI (SAS) fabric between, and including, the adapter and device enclosure.
  • A failed connection caused by a failing component within the device enclosure, including the device itself.
Note: For SRC xxxx4060, the failed connection was previously working, and might have already recovered.
Considerations:
Attention:
  • When SAS fabric problems exist, do not replace RAID adapters without assistance from your service provider. Because the adapter might contain nonvolatile, write-cache data and configuration data for the attached disk arrays, additional problems can be created by replacing an adapter.
  • Follow appropriate service procedures when replacing the cache RAID and dual IOA enablement card. Incorrect removal can result in data loss or a nondual storage IOA mode of operation.
  • Do not remove functioning disk units in a disk array without assistance from your service provider. A disk array might become unprotected or might fail if functioning disk units are removed. The removal of functioning disk units might also result in additional problems in the disk array.
  1. Determine the resource name of the adapter that reported the problem by performing the following steps:
    1. Access SST or DST.
    2. Access the product activity log and record the resource name that this error is logged against. If the resource name is an adapter resource name, use it and continue with the next step. If the resource name is a disk-unit resource name, use the Hardware Service Manager to determine the resource name of the adapter that is controlling this disk unit. The logical bus number of the disk-unit logical resource might be useful in determining the adapter resource name.
  2. Is the IBM® i operating system at Version 6.1.1 or later?
    • No: Continue with the next step.
    • Yes: Go to step 4.
  3. Determine whether a problem still exists for the adapter that logged this error by examining the SAS connections as follows:
    1. On the System Service Tools (SST) display, select Start a Service Tool and press Enter.
    2. Select Display/Alter/Dump > Display/Alter storage > Licensed Internal Code (LIC) data > Advanced Analysis.
    3. Type FABQUERY on the entry line and then select it with option 1.
    4. On the Specify Advanced Analysis Options display, type -SUB 01 -IOA DCxx -DSP 0 in the Options field, where DCxx is the adapter resource name. Press Enter.
      Note: More information is available by returning to the Specify Advanced Analysis Options display and typing -SUB 01 -IOA DCxx -DSP 2 in the Options field, where DCxx is the adapter resource name. Press Enter.
      Do all expected devices appear in the list and are all paths marked as Operational?
      • No: Go to step 5.
      • Yes: The error condition has been recovered. If the error condition has been recovered more than once, go to step 7. Otherwise, the error condition is not a persistent problem and no further service action is necessary. This ends the procedure.
  4. Determine whether a problem still exists for the DCxx adapter resource that logged this error by examining the SAS connections. See Viewing SAS fabric path information. Do all expected devices appear in the list and are all paths marked as Operational?
    • No: Continue with the next step.
    • Yes: The error condition has been recovered. If the error condition has been recovered more than once, go to step 7. Otherwise, the error condition is not a persistent problem and no further service action is necessary. This ends the procedure.
  5. Perform the following steps to cause the adapter to rediscover the devices and connections:
    1. Use the logical resources IO debug option in Hardware Service Manager to perform another IPL of the virtual I/O processor that is associated with this adapter.
    2. Vary on any other resources that are attached to the virtual I/O processor.
  6. To determine whether the problem still exists for the adapter that logged this error, examine the SAS connections again by repeating step 3 (IBM i earlier than Version 6.1.1) or step 4 (Version 6.1.1 or later). Do all expected devices appear in the list and are all paths marked as Operational?
    • No: Continue with the next step.
    • Yes: The error condition no longer exists. This ends the procedure.
  7. Perform only one of the following corrective actions, which are listed in order of preference. If an action has already been attempted, perform the next one in the list instead.
    • Reseat cables, if present, on the adapter, device enclosure, and any additional device enclosures connected to the device enclosure. Perform the following steps:
      1. Using Hardware Service Manager packaging resources, perform adapter concurrent maintenance to power off the adapter slot, or power off the system or partition.
      2. Reseat the cables.
      3. Using Hardware Service Manager packaging resources, perform adapter concurrent maintenance to power on the adapter slot, or power on the system or partition.
    • Replace the cable, if present, from the adapter to device enclosure, and any cables between the device enclosure and additional device enclosures connected to the device enclosure. Perform the following steps:
      1. Using Hardware Service Manager packaging resources, perform adapter concurrent maintenance to power off the adapter slot, or power off the system or partition.
      2. Replace the cables.
      3. Using Hardware Service Manager packaging resources, perform adapter concurrent maintenance to power on the adapter slot, or power on the system or partition.
    • Replace the device.
      Note: If there are multiple devices with a path that is not Operational, the problem is not likely to be with a device.
    • Replace the internal device enclosure or see the service documentation for an external expansion unit. Perform the following steps:
      1. Power off the system or partition. If the enclosure is external, adapter concurrent maintenance can be used instead to power off the adapter slot.
      2. Replace the device-enclosure failing items. See SASEXP and DEVBPLN for possible failing items to replace.
      3. Power on the system or partition. If the enclosure is external, adapter concurrent maintenance can be used instead to power on the adapter slot.
    • Replace the adapter. For the procedure to replace the adapter, see PCI adapter.
    • Contact your service provider.
  8. To determine whether the problem still exists for the adapter that logged this error, examine the SAS connections again by repeating step 3 (IBM i earlier than Version 6.1.1) or step 4 (Version 6.1.1 or later). Do all expected devices appear in the list and are all paths marked as Operational?
    • No: Go to step 7.
    • Yes: The error condition has been recovered. If the error condition has been recovered more than once, go to step 7. Otherwise, the error condition is not a persistent problem and no further service action is necessary. This ends the procedure.
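
The decision points in this procedure can be summarized in a short sketch. It is illustrative only and is not an IBM tool or API: the names Path, fabquery_options, fabric_healthy, recovered_outcome, and next_corrective_action are hypothetical, as is the example resource name DC01 (standing in for DCxx). The sketch assumes that the path list has already been transcribed by hand from the FABQUERY or SAS fabric path display.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence

# Corrective actions from step 7, in the documented order of preference.
CORRECTIVE_ACTIONS = (
    "Reseat cables",
    "Replace cables",
    "Replace the device",
    "Replace the device-enclosure failing items",
    "Replace the adapter",
    "Contact your service provider",
)


@dataclass
class Path:
    """One SAS path, transcribed by hand from the display (hypothetical type)."""
    device: str
    operational: bool


def fabquery_options(adapter: str, detail: int = 0) -> str:
    """Build the Options string typed in step 3; detail is 0, or 2 for more information."""
    return f"-SUB 01 -IOA {adapter} -DSP {detail}"


def fabric_healthy(expected: Iterable[str], paths: Sequence[Path]) -> bool:
    """The question asked in steps 3, 4, 6, and 8: all devices listed, all paths Operational."""
    seen = {p.device for p in paths}
    return set(expected) <= seen and all(p.operational for p in paths)


def recovered_outcome(times_recovered: int) -> str:
    """The 'Yes' branch of steps 3, 4, and 8."""
    if times_recovered > 1:
        return "Go to step 7"
    return "This ends the procedure"


def next_corrective_action(already_tried: Iterable[str]) -> Optional[str]:
    """Pick the first step 7 action that has not been attempted yet."""
    tried = set(already_tried)
    for action in CORRECTIVE_ACTIONS:
        if action not in tried:
            return action
    return None  # every listed action has already been attempted


# Example with made-up device names D1 and D2.
paths = [Path("D1", True), Path("D2", False)]
print(fabquery_options("DC01"))
print(fabric_healthy(["D1", "D2"], paths))
print(next_corrective_action(["Reseat cables"]))
```

Keeping the corrective actions in an ordered tuple mirrors the rule in step 7 that only one action is performed per pass, always the first one not yet attempted.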



Last updated: Thu, July 23, 2015