Removing and replacing an enclosure services manager in an ESLS storage enclosure

Learn how to remove and replace an enclosure services manager (ESM) in an ESLS storage enclosure (IBM EXP24SX SAS Storage Enclosure).

About this task

Attention: Failure to follow the steps sequentially for this field replaceable unit (FRU) removal or installation might result in damage to the FRU or system.
Use the following precautions whenever you handle electronic components or cables:
  • Attach a wrist strap to an unpainted metal surface of your hardware to prevent electrostatic discharge (ESD) from damaging your hardware.
  • If you do not have a wrist strap, touch an unpainted metal surface of the system for a minimum of 5 seconds before you remove the product from its ESD packaging and before you install or replace hardware.
  • Keep all electronic components in the shipping container or envelope until you are ready to install them.
  • If you remove and reinstall an electronic component, temporarily place the component on an ESD pad or blanket, if available.
  • During replacement with the system power turned off, do not combine the replacement of any ESM with the replacement of the midplane unless you cycled the power of the storage enclosure. You must replace only one new part at a time when the system power is off. If multiple parts are replaced at the same time when the system power is off, the serial number is not preserved.
Note: To prevent the loss of enclosure information, do not replace both ESMs at the same time when the storage enclosure is powered off. To replace the second ESM when the storage enclosure is powered off, replace the first ESM, then restore power to the storage enclosure. Wait three minutes. Then power off the storage enclosure to replace the second ESM.

Procedure

  1. Determine whether the repair operation can continue when the system power is turned on. To continue the repair operation when the system power is turned on, the following conditions must be true:
    • A second ESM is already installed, and the amber LED on the second ESM is not turned on.
    • The amber fault light of the failing ESM is turned on solid, or you are using one of the following operating systems:
      • AIX® release 7.2.2 or later, and the controlling adapters have a code level of 17518200 or higher.
      • IBM® i with a minimum of V7R3 with MF99204 (TR4), or V7R2 with MF99108 (TR8), and the controlling adapters have PTF MF64136 for V7R3 or MF64117 for V7R2, or higher applied.
  2. If either condition in step 1 is false, continue the repair operation only after you power off the system that is hosting the enclosure that contains the part that is being repaired. Choose from the following options:
    Option: Either condition is false.
    Description: You must perform the repair operation when the system power is turned off.
      1. Power off the system or partition that is hosting the storage enclosure.
      2. Label and remove the power cords from both power supplies of the storage enclosure.
      3. Proceed to step 15.
      Note: To prevent the loss of enclosure information, do not replace both ESMs at the same time when the storage enclosure is powered off. To replace the second ESM when the storage enclosure is powered off, replace the first ESM, then restore power to the storage enclosure. Wait three minutes. Then power off the storage enclosure to replace the second ESM.
    Option: Both conditions are true.
    Description: You can perform the repair operation when the system power is turned on. Proceed to step 3.
  3. Choose from the following options:
    • If the amber fault light of the failing ESM is turned on solid, continue with step 15.
    • If the amber fault light of the failing ESM is not turned on solid and if you are using the AIX operating system to replace the part, continue with step 4.
    • If the amber fault light of the failing ESM is not turned on solid and if you are using the IBM i operating system to replace the part, continue with step 8.
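The decision made in steps 1 - 3 can be summarized as a small sketch. It is an illustration only and is not part of any IBM tool: the function name `next_step` and its parameters are hypothetical, and the caller is assumed to have already checked the operating system and adapter code levels that are listed in step 1.

```python
from typing import Optional, Tuple


def next_step(second_esm_ok: bool,
              fault_led_solid: bool,
              supported_os: Optional[str]) -> Tuple[bool, int]:
    """Return (power_on_repair_allowed, step_to_continue_with).

    second_esm_ok: a second ESM is installed and its amber LED is off.
    fault_led_solid: the amber fault light of the failing ESM is on solid.
    supported_os: "AIX" or "IBM i" if the levels in step 1 are met, else None
                  (hypothetical encoding chosen for this sketch).
    """
    if supported_os not in (None, "AIX", "IBM i"):
        raise ValueError("supported_os must be 'AIX', 'IBM i', or None")

    power_on = second_esm_ok and (fault_led_solid or supported_os is not None)
    if not power_on:
        # Step 2: power off, remove the power cords, then go to step 15.
        return (False, 15)
    if fault_led_solid:
        # Step 3, first option: go straight to removal.
        return (True, 15)
    # The fault light is not solid, so a supported operating system is in use.
    return (True, 4 if supported_os == "AIX" else 8)
```

For example, `next_step(True, False, "IBM i")` returns `(True, 8)`, which corresponds to the third option in step 3.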
  4. If you are using the AIX operating system, complete the following steps to prepare the ESM for removal:
    1. Log in to the AIX operating system as the root user or with the CE login.
    2. Enter the diag command to load the diagnostic controller, and to display the online diagnostic menu options.
    3. If requested, enter a password.
    4. When the Diagnostic Operating Instructions screen is shown, press Enter.
    5. From the Function Selection menu, select Task Selection > Display Previous Diagnostic Results > Display Diagnostic Log Summary. A chronological list of events is shown.
    6. In the T column, find the most recent S entry for the ESM that you are servicing (the logical SES device whose resource name is sesX).
    7. Select the row with the most recent S entry in the table and press Enter.
    8. Select Commit. The details of this log entry are shown.
    9. Record the sesX resource name and the location code of the failing ESM.
    10. Press F3 or Esc+3 three times to return to the Tasks Selection list.
      Attention: The ESM to be swapped may be attached to multiple SAS controllers. These SAS controllers may be in the same partition, or split between multiple partitions on the same or separate systems. The configuration and location of all SAS controllers attached to this ESM must be understood to safely remove it. The customer may have to provide this configuration information.
      Attention: The ESM may be partitioned in a mode that this service procedure does not support. The partitioning mode of the ESM can be found on a label on the rear of the storage enclosure and can also be displayed in the sesX device's Vital Product Data.
    11. In the Task Selection list, scroll down to Display Hardware Vital Product Data and press Enter.
    12. In the Resource Selection List, scroll down to the sesX device recorded in step 4.i. Select it using the Enter key, then press F7 to commit. A sample VPD is shown as follows:
      
      Display Vital Product Data (VPD)
        ses2             UESLS.001.G63X003-P1-C2  SAS Enclosure Services Device
            SAS Expander:
              FRU Number..................01DH720
              Serial Number...............YL30nnnnnnnnn
              Customer Card ID Number.....xxxx
              Product Specific.(ZM).......2
              Load ID.....................A17nnnnn
              ROM Level.(alterable).......41C0
              Hardware Location Code......UESLS.001.G63X003-P1-C2
      
      
        PLATFORM SPECIFIC
      
        Name:  disk
          Node:  disk
          Device Type:  block
      
      
      To continue, press Enter.
      
      Esc+3=Cancel        F10=Exit             Enter
      

      The value in the Product Specific.(ZM) field is the ESM partitioning mode. If this value is 4, this procedure must be performed when the storage enclosure is powered off. If this value is 1 or 2, SAS adapters might exist in other partitions or systems; in this case, the customer will have to provide configuration information.

    13. Press F3 or Esc+3 three times to return to the Tasks Selection list.
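If the VPD text from step 4 has been captured to a file or copied from the screen, the partitioning mode can be read from it programmatically. The following is a minimal sketch, assuming the line layout matches the sample VPD shown above; the function names are hypothetical, and the interpretation covers only the mode values that this procedure mentions (1, 2, and 4).

```python
import re
from typing import Optional

# Matches a line such as "Product Specific.(ZM).......2" from the sample VPD.
ZM_PATTERN = re.compile(r"Product Specific\.\(ZM\)\.*(\S+)")


def esm_partitioning_mode(vpd_text: str) -> Optional[str]:
    """Return the ZM value as a string, or None if the field is not found."""
    match = ZM_PATTERN.search(vpd_text)
    return match.group(1) if match else None


def interpret_mode(mode: Optional[str]) -> str:
    """Map a ZM value to the guidance given in this procedure."""
    if mode == "4":
        return "perform this procedure with the storage enclosure powered off"
    if mode in ("1", "2"):
        return "obtain SAS adapter configuration information from the customer"
    # Any other value is outside what this procedure describes.
    return "mode not described in this procedure"


sample = """
    SAS Expander:
      FRU Number..................01DH720
      Product Specific.(ZM).......2
      ROM Level.(alterable).......41C0
"""
mode = esm_partitioning_mode(sample)
print(mode, "->", interpret_mode(mode))
```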
  5. Use the AIX diagnostics to identify the ESM to be removed:
    1. Page down and select Identify and Attention Indicators.
    2. Page down and select the location code that was recorded in step 4.i. Press Enter, then press F7 to turn the indicators on.

      The blue identify LEDs on the drawer and on both ESMs turn on, and the amber fault LED flashes on the selected ESM (at the rear of the enclosure).

    3. View the LEDs to locate the ESM that must be replaced.
      Notes:
      • Ensure that the correct ESM is selected and that the fault LED on the other ESM in the same storage enclosure is not turned on solid.
      • If the correct ESM is selected, but the other ESM in the same storage enclosure has a fault LED turned on solid, the procedure must be completed when the system power is turned off. Return to step 2 to complete the procedure with the system power turned off.
    4. Press F3 or Esc+3 to return to the Tasks Selection list.
  6. Suspend the ESM by using the AIX operating system:
    1. From the Tasks Selection list, select Hot Plug Task > Expander Suspend and Resume Manager.
    2. Select the sesX entry with the location code of the ESM that you are replacing and press Enter.

      A warning message is shown. Select Yes and press Enter. The sesX device is suspended. If the sesX device is in mode 2, a second entry exists with the same location code but a different sesX device number; suspend that device as well.

    3. If you have a dual adapter configuration, repeat step 6.b for the other adapter.
      Attention: The other adapter may be in another partition or another system.
    4. Ensure that all of the sesX devices show a status of Suspended (in both of the adapter resources if this is a dual controller configuration). Note the sesX device names of the suspended devices.
    5. Press F3 or Esc+3 to return to the Tasks Selection list.
    6. Select Hot Plug Task > PCI Hot Plug Manager > Unconfigure a Device.
    7. Enter the sesX device in the device name field. Press the Down Arrow key to move to Unconfigure any Child Devices, then press the Tab key to change the value to yes. Press Enter to perform the operation. A sasdrawerX device may also be unconfigured. Repeat this step for all of the sesX devices noted in step 6.d.
  7. Continue with step 15 to remove the ESM.
  8. Use the IBM i operating system to determine the location of the ESM assembly in the storage enclosure by completing steps 9 - 14.

    Use one of the following tables to record resource and bus information, depending on the SAS controller and ESM disk enclosure configuration.

    Table 1. Resource names of the ESM, IOAs, and system bus numbers for multiple adapter configurations
    Path | 1 | 2
    Resource names of ESM to be replaced (step 10.c) | Dxx | Dxx
    System Bus Number in decimal (recorded in step 10.d) | xxx | xxx
    Resource Names of the IOAs that are attached to the ESM (recorded in step 12.d) | DCxx | DCxx
    IOA Operating Mode (recorded in step 12.g) | Secondary | Primary

    Table 2. Resource names of the ESM, IOAs, and system bus numbers for single adapter configurations
    Path | 1
    Resource names of ESM to be replaced (step 10.c) | Dxx
    System Bus Number in decimal (recorded in step 10.d) | xxx
    Resource Names of the IOAs that are attached to the ESM (recorded in step 12.d) | DCxx
    IOA Operating Mode (recorded in step 12.g) | Standalone
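If you prefer to keep the worksheet in electronic form, the rows of Table 1 and Table 2 map naturally onto one record per path. The sketch below is one possible layout, not an IBM-defined format: the class name, field names, and the sample values `D01`, `DC01`, and so on are hypothetical placeholders for the Dxx and DCxx names that you record.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathRecord:
    path: int            # column heading in Table 1 or Table 2
    esm_resource: str    # recorded in step 10.c, for example "Dxx"
    system_bus: int      # recorded in step 10.d, in decimal
    ioa_resource: str    # recorded in step 12.d, for example "DCxx"
    ioa_mode: str        # recorded in step 12.g


def configuration_kind(records: List[PathRecord]) -> str:
    """Classify the worksheet as a single or dual adapter configuration."""
    modes = sorted(r.ioa_mode for r in records)
    if modes == ["Standalone"]:
        return "single adapter"
    if modes == ["Primary", "Secondary"]:
        return "dual adapter"
    raise ValueError("unexpected combination of IOA operating modes")


# Hypothetical worksheet contents for a dual adapter configuration.
worksheet = [
    PathRecord(1, "D01", 12, "DC01", "Secondary"),
    PathRecord(2, "D02", 13, "DC02", "Primary"),
]
print(configuration_kind(worksheet))
```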
  9. Start the IBM i Hardware service manager:
    1. On the command line of the main menu, type strsst and press Enter.
    2. Type the service tools user ID and password and press Enter.
    3. Select Start a service tool > Hardware service manager.
  10. Determine the resource name and system bus number of the ESM to be replaced by using the IBM i Hardware service manager:
    1. Select Packaging hardware resources (system, frames, cards) > Hardware contained within package for the storage enclosure that contains the part you want to replace.
    2. Select Associated logical resources for the Device Services (ESM) that contains the location of the ESM to be replaced.
    3. For each Device Services Logical Resource shown, record the resource name and status of the Device Services (ESM). This resource represents the ESM to be replaced. The device service can show a status other than Operational. See Table 1 or Table 2 for an example table of what you need to record.
    4. When Detail is displayed on each Device Services Logical Resources, record the System bus number of each device.
    5. Use the Cancel key to return to the Packaging Hardware Resources window for the storage enclosure that you selected.
  11. Check that the redundant ESM is operational by checking the resource names and status of the redundant ESM by using the IBM i operating system:
    1. Select Associated logical resources for the other (redundant) Device Services (ESM).
    2. Check the status of each Device Services (ESM) shown. Each resource represents the redundant ESM. Verify that each of these Device Services has a status of Operational. If any status is not Operational, this procedure cannot be completed when the system power is turned on.
    3. When Detail is displayed on each resource name, record the System Bus number of each device. These bus numbers must match the bus numbers that were recorded in step 10.d. Use the Cancel function to return to the Logical Resources Associated with a Packaging Resource screen.
    4. Press Exit to return to the Hardware Service Manager menu.
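The checks in step 11 amount to two comparisons: every Device Services resource of the redundant ESM must be Operational, and its bus numbers must match the ones from step 10.d. A minimal sketch of that comparison follows. The function name and the tuple layout are hypothetical, and treating a bus number mismatch as a reason to stop and recheck is an assumption of this sketch; the procedure states only that the numbers must match.

```python
from typing import Iterable, List, Tuple


def redundant_esm_problems(recorded_buses: Iterable[int],
                           redundant: Iterable[Tuple[str, str, int]]) -> List[str]:
    """List the reasons the step 11 check fails; an empty list means it passed.

    recorded_buses: system bus numbers written down in step 10.d.
    redundant: (resource_name, status, bus_number) for each Device Services
               resource of the redundant ESM.
    """
    redundant = list(redundant)
    problems = []
    if not redundant:
        problems.append("no Device Services resources were recorded")
    for name, status, _bus in redundant:
        if status != "Operational":
            problems.append(f"{name} has status {status}")
    if {bus for _name, _status, bus in redundant} != set(recorded_buses):
        problems.append("bus numbers do not match those from step 10.d")
    return problems
```

An empty result means that the redundant ESM passed both checks. Any entry in the list that reports a status other than Operational means that the repair cannot be completed with the system power turned on.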
  12. Determine the resource names of the IOAs to which the ESM is attached by using the IBM i operating system:
    1. Select Logical hardware resources (buses, IOPs, controllers) > System bus resources.
    2. In the System buses to work with field, enter the first bus number that you recorded in step 10.d and press Enter. The results that are shown are for the bus number that you entered.
    3. Select Resources associated with IOP for the Virtual IOP listed.
    4. Find the storage IOA with a resource name similar to "DCxx" and select Display detail. Record the Resource name associated with the bus number you entered in step 12.b and determine whether the configuration is a dual adapter configuration (Primary or Secondary Storage IOA) or a single adapter configuration (Standalone Storage IOA). This information is needed in step 12.e. See Table 1 or Table 2.
    5. Choose from the following options:
      • If you have a single adapter configuration, you can suspend the ESM. Continue with step 13.
      • If you have a dual adapter configuration, repeat steps 12.b - 12.d for the other adapter. Then, continue with step 12.f.
    6. On the Auxiliary Storage Hardware Resource Detail screen, select the function for Dual Storage IOA Configuration. Two IOAs should be listed on the display.
    7. For each listed IOA, record the operating mode and verify that the status is Operational.
      Note: If the status of either IOA is not Operational, this procedure must be completed when the system power is turned off. Return to step 2 to complete the procedure with the system power turned off.
    8. Select Exit twice to return to the SST display.
  13. Suspend the ESM that you want to replace by using the IBM i operating system:
    1. Select Start a service tool > Display/Alter/Dump > Display/Alter storage > Licensed Internal Code (LIC) data.
    2. Scroll down and select Advanced analysis. Type 1 in the Option field, type IOASES on the command line, and press Enter.
    3. On the Specify Advanced Analysis screen, type the following command:
      -ioa xxxx -ses yyy -suspend

      where xxxx is the resource name of the Primary IOA recorded in step 12.d, and yyy is the resource name of the Device Services corresponding to the Primary IOA recorded in step 10.c.

    4. Review the results:
      • Several Device Services (ESM) devices might be listed because the IOA might have other connections.
      • Only the Device Services (ESM) specified in step 13.c must have a status of Suspended. That is, no more than one device can be suspended.
      • If success is not indicated, do not continue with removing the ESM when the system power is turned on. Contact your next level of support.

        If you see code 78D13002 in the product activity log, that is normal and does not indicate a need for further support.

    5. Record the Serial Number of the Suspended Device Services Resource and press Enter.
    6. If you have a dual adapter configuration, repeat steps 13.c - 13.e with the Secondary IOA resource name and its corresponding Device Services resource you recorded in Table 1. If you have a single adapter configuration, continue with step 13.h.
    7. Verify that the suspended Device Services (ESMs) have the same serial number that you recorded in step 13.e each time.
    8. Select Exit twice to return to the SST display.
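To avoid typing errors, the option strings for step 13.c can be prepared from the worksheet before you start. The sketch below only builds the text that is typed on the Specify Advanced Analysis screen and compares the serial numbers recorded in step 13.e; it does not talk to the system. The function names and the sample resource names are hypothetical, and ordering a Standalone IOA the same as a Primary IOA is an assumption of this sketch.

```python
from typing import List, Tuple

# Primary is suspended first (step 13.c), then Secondary (step 13.f).
# Standalone is ordered like Primary; this is an assumption of the sketch.
_ORDER = {"Primary": 0, "Standalone": 0, "Secondary": 1}


def suspend_options(paths: List[Tuple[str, str, str]]) -> List[str]:
    """Build the IOASES option strings, one per path.

    paths: (ioa_mode, ioa_resource, ses_resource) as recorded in the worksheet.
    """
    ordered = sorted(paths, key=lambda p: _ORDER[p[0]])
    return [f"-ioa {ioa} -ses {ses} -suspend" for _mode, ioa, ses in ordered]


def serials_match(serials: List[str]) -> bool:
    """True if every serial number recorded in step 13.e is the same."""
    return len(serials) > 0 and len(set(serials)) == 1


# Hypothetical worksheet values for a dual adapter configuration.
for line in suspend_options([("Secondary", "DC02", "D02"),
                             ("Primary", "DC01", "D01")]):
    print(line)
```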
  14. Turn on the LEDs to identify the ESM to remove by using the IBM i operating system:
    1. Select Start a service tool.
    2. Select Hardware service manager > Packaging hardware resources (system, frames, cards).
    3. Select Hardware contained within package for the storage enclosure where the ESM is located.
    4. Select Associated logical resources for the Device Services (ESM) that contains the location of the ESM to be replaced and press Enter. Verify that the Device Services (ESM or ESMs) have a status of Failed.
    5. Select Associated packaging resources for the Device Services (ESM). If you have a dual adapter configuration, select one of the logical resources.
    6. Select Concurrent Maintenance for the ESM Device Services resource.
    7. Select Toggle identify indicator state for the resource.
      Notes:
      • The blue enclosure identify LED at the front and back of the storage enclosure turns on solid to identify the enclosure.
      • A fast-blinking amber LED at the rear of the ESM indicates that the slot is identified. Physically verify that the identified slot is where you want to remove the ESM assembly.
  15. Complete the following steps to remove the ESM:
    1. Noting their locations, label and disconnect the serial-attached SCSI (SAS) cables from the ESM.
      Attention: Incorrect cable placement might result in data loss.
    2. Open the two release levers (A) as shown in Figure 1.
    3. Support both sides of the ESM while you slide it out of the enclosure.
      Figure 1. Removing an ESM from the disk drive enclosure
    4. Choose from the following options:
      • If you removed the ESM when the system power was turned off, continue with step 16.
      • If you used the AIX operating system to remove the ESM when the system power was turned on, continue with step 16.
      • If you used the IBM i operating system to remove the ESM when the system power was turned on, select Exit twice to return to the SST screen. Then, continue with step 16.
  16. To install the ESM, complete the following steps:
    1. Check that the connector pins at the front of the ESM are not bent and that no pins are damaged.
    2. Ensure that the release levers on the new ESM are in the open position.
    3. Slide the ESM gently into the enclosure until the ESM stops.
    4. Push the release levers to the closed position.
    5. Remove any SAS protectors from the old ESM and insert them into their corresponding locations in the replacement ESM.
    6. Reconnect the SAS cables to the ESM by using your labels.
      Attention: Incorrect cable placement might result in data loss.
    7. Choose from the following options:
      • If you installed the ESM when the system power was turned off, continue with step 17.
      • If you installed the ESM when the system power was turned on and you are using the AIX operating system, continue with step 18.
      • If you installed the ESM when the system power was turned on and you are using the IBM i operating system, continue with step 19.
  17. If you replaced the part when the system power was turned off, restore power to the power supplies of the storage enclosure and power on the system or partition.
    This ends the procedure.
  18. If you replaced the part by using the AIX operating system, complete the following steps:
    1. From the Tasks Selection list, select Hot Plug Task.
      Note: Wait for 1 minute after the LEDs on the newly installed ESM come on, to ensure that the configuration data is updated.
    2. Select PCI Hot Plug Manager > Install/Configure Devices Added After IPL.
    3. Press Enter on the Install/Configure Devices Added After IPL screen. Wait for 1 minute to ensure that the configuration data is updated after the operation is complete.
    4. Press F3 or Esc+3 three times to return to the tasks list.
    5. Select Hot Plug Task > Expander Suspend and Resume Manager and complete the following steps:
      1. Select the sesX entry for the ESM you replaced, and press Enter. Repeat this step for all of the sesX devices associated with the location code of the ESM device that were suspended in step 6.d.
      2. If you have a dual adapter configuration, select the sesX entry under the second adapter and press Enter. Repeat this step for all of the sesX devices associated with the location code of the ESM device that were suspended in step 6.d.
      3. Ensure that all sesX entries show a status of Active.
    6. Press F3 or Esc+3 to return to the Tasks Selection list.
    7. Scroll down, select Identify and Attention Indicators, and press Enter.
    8. Scroll down to the location code of the ESM that was replaced.
      • If a capital I is present, press Enter, then press F7 to commit and turn off the identify LEDs. Press F3 or Esc+3 to return to the Tasks Selection List. This ends the procedure.
      • If no capital I is present, press F3 or Esc+3 to return to the Tasks Selection List. This ends the procedure.
  19. If you replaced the part by using the IBM i operating system, activate the IOA:
    1. Wait for at least 1 minute.
    2. From the IBM i SST display, select Start a service tool > Display/Alter/Dump > Display/Alter storage > Licensed Internal Code (LIC) data.
    3. Scroll down and select Advanced analysis. Type 1 in the Option field, type IOASES on the command line, and press Enter.
    4. In the Specify Advanced Analysis Options display, type the following command:
      -ioa xxxx -ses yyy -resume

      where xxxx is the resource name of the Primary IOA recorded in step 12.g, and yyy is the resource name of the Device Services recorded in step 10.c.

    5. Review the results of the Advanced Analysis IOA SES invocation.
      • Several Device Services (ESM) devices might be listed because the IOA might have other connections.
      • All Device Services (ESM) resources must show a status of Active.
      • If success is not indicated, contact your next level of support for assistance.
    6. Press Enter.
    7. If you have a dual adapter configuration, repeat steps 19.b - 19.e with the other path that you recorded in Table 1 or Table 2. If you have a single adapter configuration, continue with step 19.k.
    8. In the Specify Advanced Analysis Options display, type the following command:
      -ioa xxxx -ses yyy -reset

      where xxxx is the resource name of the Primary IOA recorded in step 12.g, and yyy is the resource name of the Device Services recorded in step 10.c.

    9. Review the results of the Advanced Analysis IOA SES invocation.
      • Several Device Services (ESM) devices might be listed because the IOA might have other connections.
      • All Device Services (ESM) resources must show a status of Active.
      • If success is not indicated, contact your next level of support for assistance.
    10. Press Enter.
    11. Select Exit twice to return to the SST display.
  20. To verify that the SES resource is active for each IOA, complete the following steps by using the IBM i operating system:
    1. From the SST display, select Start a service tool.
    2. Select Hardware service manager > Logical hardware resources (buses, IOPs, controllers) > System bus resources.
    3. In the System bus(es) to work with field, enter the first bus number that you recorded in step 10.d and press Enter. The results that are shown are for the bus number that you entered.
    4. Select Resources associated with IOP for the listed Virtual IOP.
  21. Choose from the following options:
    • If you have a single adapter configuration, continue with step 23.
    • If you have a dual adapter configuration, continue with the next step.
  22. If you have a dual adapter configuration, complete the following steps by using the IBM i operating system:
    1. Find the Storage IOA with a resource name such as DCxx and select Display detail for that resource.
    2. On the Detail display, select Dual Storage IOA Configuration. Two IOAs are shown.
    3. For each listed IOA, verify that the status is Operational. If the status of either IOA is not Operational, the procedure might have failed. Contact your next level of support.
    4. Select Resources associated with controlling IOP for the Secondary Storage IOA.
    5. Find the Device Services (SES) entries that you recorded during the remove procedure. Verify that both devices are operational. If either is not operational, the procedure might have failed. Contact your next level of support.
    6. Select Cancel to return to the previous screen.
    7. Select Resources associated with controlling IOP for the Primary Storage IOA.
  23. Find the Device Services (SES) entries that were recorded during the removal procedure. Verify that both ESMs / Device Services Resources are operational. If either is not operational, the procedure might have failed. Contact your next level of support.
  24. Select Exit twice to return to the SST display. This ends the procedure.