Upgrading memory in the nodes of an IBM Storage Fusion HCI System

Use these installation instructions to upgrade the compute-only nodes (9155-C00 and 9155-C04) or compute-storage nodes (9155-C01 and 9155-C05) of an IBM Storage Fusion HCI System by installing the additional memory modules (AHJK or AHJN) or replacement modules (RPQ 8S1881). The 9155-C00 and 9155-C01 can be factory upgraded to 512GB of RAM and field upgraded to either 512GB or 1024GB of RAM. The two memory choices for the 9155-C04 are 16GB/core and 32GB/core (1024GB and 2048GB) per 64-core server. The same instructions are printed and attached to the hardware.
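To make the capacity arithmetic explicit, the following Python sketch computes the totals from the slot counts and DIMM sizes quoted in this section. It is an illustration only; the per-DIMM size for the 2048GB AHJN configuration is an inference, not stated in this document.

# Illustrative capacity arithmetic only; slot counts and DIMM sizes are
# quoted from this section, nothing is read from the hardware.
configs = {
    "Base (16 x 16GB DIMMs)": 16 * 16,          # 256GB as shipped
    "FC AHJK (32 x 16GB DIMMs)": 32 * 16,       # 512GB, all 32 slots filled
    "RPQ 8S1881 (16 x 64GB DIMMs)": 16 * 64,    # 1024GB, replacement upgrade
    "AHJN (assumed 32 x 64GB DIMMs)": 32 * 64,  # 2048GB; per-DIMM size inferred
}
for name, total_gb in configs.items():
    print(f"{name}: {total_gb}GB")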

Before you begin

  • Installing the memory upgrade requires temporarily shutting down each of the compute nodes that are being upgraded. Ensure that all the nodes in the IBM Storage Fusion HCI System are functioning properly so that the system can tolerate the shutdown of compute nodes without impact to running applications or the storage cluster.
  • The system must have sufficient capacity to run all applications with one compute node shut down. If it does not, stop some applications until the upgrade is complete.
  • Before you begin the task of installing the memory upgrade, identify the compute nodes in the rack enclosure to be upgraded and ensure that you have enough DIMMs for each of them. Only the 9155-C00, 9155-C01, 9155-C04, 9155-C05, 9155-C10, and 9155-C14 compute nodes are supported by the memory upgrade; the 9155-G01 GPU nodes and the 9155-F01 AFM nodes do not support it. It is likely that more than one compute node in the IBM Storage Fusion HCI System is being upgraded. Each compute node that is upgraded must be powered off before the update, and it is important that only one compute node is upgraded at a time to maintain the integrity of the IBM Storage Scale ECE storage cluster and the OpenShift® Container Platform control plane. The only exception is when the entire system is powered down to perform upgrades.
  • A Phillips screwdriver is used to turn the locking screw on the cover latches of the compute-only nodes (9155-C10, 9155-C00, 9155-C04) and storage-compute nodes (9155-C14, 9155-C01, 9155-C05).
  • Observe all normal safety precautions. Refer to the safety notices provided in the feature kit.
  • Before you power off a compute node, move it to maintenance mode. For the steps to move to maintenance mode, see Administering the node.
  • On the node details page, on the Overview tab, view the health of events, disks, and ports. A green (normal) tick mark indicates that the status is healthy. If a red or yellow indicator appears, do not proceed; correct the error or warning before you continue with the upgrade. (An optional scripted cross-check of node health is sketched after this list.)
    Note: Record the location of the compute node in the rack so that you can find it easily in later steps.
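The GUI indicators above are the documented health check. As a supplementary cross-check, the following Python sketch (an illustration only, assuming cluster-admin access and the oc CLI on the PATH) confirms that every OpenShift node reports Ready before you place any node in maintenance mode:

# Optional CLI cross-check; the Fusion GUI remains the authoritative view.
import json
import subprocess

out = subprocess.run(
    ["oc", "get", "nodes", "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout
for node in json.loads(out)["items"]:
    # Each node carries a "Ready" condition whose status should be "True".
    ready = next(
        c["status"] for c in node["status"]["conditions"] if c["type"] == "Ready"
    )
    print(f'{node["metadata"]["name"]}: Ready={ready}')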

About this task

For actual memory expansion orders, contact IBM Support. Follow this procedure to install the memory upgrade.

Procedure

After you successfully place a compute node in maintenance mode, follow these steps to upgrade it.

  1. Power down the compute node. For the steps to power down the node, see Administering the node.
  2. Move the compute node into position for the upgrade.
    1. Open the rear door of the IBM Storage Fusion HCI System and identify the compute node that was shut down.
    2. Check that the four optical Ethernet network cables (2x 100GbE and 2x 25GbE), the two RJ45 copper Ethernet cables (one in an OCP port, the other in the IMM), and the two power cables have correct labels. If the labels are damaged or missing, add replacement labels that indicate the compute node and port where each cable is attached. When all cables are correctly labeled, disconnect them from the compute node. Be careful not to dislodge cables or power cords from any other component in the system during this process.
      Note: It is possible that there are no 25GbE optical network cables attached to the compute node.
      Figure 1. Compute node cables location (Rear side)
      Location of cables on the rear of the compute node
    3. Move to the front of the system, open the door, and locate the powered-down compute node.
    4. Unlatch the catches on the sides of the compute node that hold the rails, and then pull the compute node forward until the rails are fully extended.
    5. Remove the compute node from the rails and place it securely on the workbench, following the instructions in the “System Removal” illustration as a guide.
      Note: It is not necessary to remove the compute node fully from the rack for the upgrade, but you can choose to do so if it is more convenient.
      Figure 2. System removal procedure illustration
      System removal procedure illustration
  3. Remove the compute node top cover as follows:
    Note: Ensure that you wear an ESD wrist strap while removing the compute node top cover.
    1. Turn the lock screw on the cover latch to the open position.
    2. Press the blue button.
    3. Lift the cover latch.
    4. Slide the cover toward the back of the compute node until it detaches from the chassis, then remove it and place it in a safe place.
      Figure 3. Compute node with cover latch lifted and cover slid back slightly
      Compute node with cover latch lifted and cover slid back slightly
      To remove the compute node top cover, see https://www.youtube.com/watch?v=Kxk8gZkU6wI.
  4. Install the Feature Code AHJK or AHJN memory module as follows:
    The FC AHJK or AHJN memory upgrade requires only adding more DIMMs to the compute node. The compute node already has 16 x 16GB DIMMs installed. At the end of the installation, ensure that all 32 memory slots of the compute node are filled with 16GB DIMMs.
    To open the server and add more memory, six network cables (two Ethernet RJ45 cables, two 25G split cables, and two 100G fiber cables) and two power cords must be removed. After the DIMMs are added, these eight cables must be rewired:
    • 25G ports: The split cables must be pressed in firmly so that they seat securely in the cage. If the connection becomes loose, the port does not work.
    • Handle the 100G fiber cables with great care because they bend and break easily. Also take care when you work near the existing DIMMs: a DIMM can come loose, and cables inside the server can be disturbed. Sometimes force is needed to seat a DIMM, which can disturb nearby cables as well.
    The following figure shows the 25G and 100G ports:
    Figure 4. 25G and 100G ports
    Picture showing the 25G and 100G ports
    1. Locate the components to be installed. You need 16 of the 16GB DIMMs for each of the compute nodes to be upgraded.
    2. Verify that 16 DIMMs are already in the compute node and occupy the following slots: 1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32
    3. Remove the air baffle as shown in figures 5 and 6.
      Note: Ensure that you wear an ESD wrist strap while removing the air baffle.
      Figure 5. 9155-C00 or 9155-C01 with air baffle in place
      9155-C00 or 9155-C01 with air baffle in place
      Figure 6. 9155-C00 or 9155-C01 with air baffle removed
      9155-C00 or 9155-C01 with air baffle removed
    4. Add 16 more 16GB DIMMs to the empty slots in the following slot order: 13, 29, 15, 31, 4, 20, 2, 18, 9, 25, 11, 27, 8, 24, 6, 22.
      For each slot, open the retaining clips at each end of the slot and then firmly press the new DIMM at both ends down into the slot, causing the tabs to snap to the locked position.
      Refer to the DIMM installation order label shown in the figure.
      Figure 7. DIMM installation order taken from the label on the cover of a 9155-C00 or 9155-C01 compute node
      DIMM Installation order taken from the label on the cover of a 9155-C00 or 9155-C01 compute node
      To install a memory module, see https://www.youtube.com/watch?v=_N2LLsHk7lI.
    5. After adding the 16 DIMMs, verify that all 32 DIMM slots are now populated. (A scripted sanity check of the slot lists is sketched after this step.)
    6. Install the air baffle.
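As a quick sanity check of the slot lists quoted in step 4, the following Python sketch (an illustration only; the slot numbers are copied from this document, nothing is read from the hardware) confirms that the pre-populated slots and the installation order together cover all 32 slots exactly once:

# Slot lists as quoted in step 4 for the FC AHJK or AHJN upgrade.
populated = {1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32}
install_order = [13, 29, 15, 31, 4, 20, 2, 18, 9, 25, 11, 27, 8, 24, 6, 22]

assert len(install_order) == 16, "expected 16 new DIMMs"
assert populated.isdisjoint(install_order), "a listed slot is already populated"
assert populated | set(install_order) == set(range(1, 33)), "slots 1-32 not all covered"
print("All 32 DIMM slots accounted for:", 32 * 16, "GB total")  # 512GB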
  5. Install the RPQ 8S1881 memory module as follows:
    The RPQ 8S1881 memory upgrade removes all 16 of the 16GB DIMMs from the compute node and replaces them with 16 x 64GB DIMMs in the same slots from which the 16GB DIMMs were removed.
    1. Locate the components to be installed. You need 16 of the 64GB DIMMs for each of the compute nodes to be upgraded.
    2. Verify that 16 DIMMs are already in the compute node and occupy the following slots: 1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32.
    3. Remove the air baffle as shown in figures 8 and 9.
      Note: Ensure that you wear an ESD wrist strap while removing the air baffle.
      Figure 8. 9155-C14, 9155-C00 or 9155-C01 with air baffle in place
      9155-C14, 9155-C00 or 9155-C01 with air baffle in place
      Figure 9. 9155-C14, 9155-C00 or 9155-C01 with air baffle removed
      9155-C14, 9155-C00 or 9155-C01 with air baffle removed
    4. Remove the 16 x 16GB DIMMs that are already in the compute node. They occupy the following slots: 1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32.
      Open the retaining clips at each end of the DIMM slot and then lift the DIMM straight up while holding both ends of the DIMM.
      To remove the memory module, see https://www.youtube.com/watch?v=XhMSuvGtEqw.
    5. Add the 16 x 64GB DIMMs to the empty DIMM slots in the following slot order: 14, 30, 16, 32, 3, 19, 1, 17, 10, 26, 12, 28, 7, 23, 5, 21. For each slot, open the retaining clips (if they are not already open) at each end of the slot and then firmly press the new DIMM at both ends down into the slot, causing the tabs to snap to the locked position.
      To install the memory module, see https://www.youtube.com/watch?v=_N2LLsHk7lI.
    6. Refer to the DIMM installation order label to ensure that all the DIMMs are installed in the correct slots in the specified order.
    7. After adding the 16 DIMMs, verify that the following DIMM slots are now populated: 1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32. (A scripted sanity check of the slot lists is sketched after this step.)
    8. Replace the air baffle.
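Similarly, for the RPQ 8S1881 replacement in step 5, the following Python sketch (again an illustration only, with slot numbers copied from this document) confirms that the 64GB DIMMs go back into exactly the slots that the 16GB DIMMs vacated, and shows the resulting capacity:

# Slot lists as quoted in step 5 for the RPQ 8S1881 replacement upgrade.
removed = {1, 3, 5, 7, 10, 12, 14, 16, 17, 19, 21, 23, 26, 28, 30, 32}
install_order = [14, 30, 16, 32, 3, 19, 1, 17, 10, 26, 12, 28, 7, 23, 5, 21]

assert set(install_order) == removed, "install order must reuse the vacated slots"
print(f"Resulting capacity: {len(install_order) * 64}GB")  # 16 x 64GB = 1024GB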
  6. Replace the compute node top cover as follows:
    Note: Ensure that you wear an ESD wrist strap while replacing the compute node top cover.
    1. Ensure that the cover latch is in the open position.
    2. Position the cover over the chassis and then slide the cover forward.
    3. Press the cover latch down until it is closed.
    4. Turn the lock screw on the cover latch to the closed position.
      To replace the compute node top cover, see https://www.youtube.com/watch?v=7SVNkf6FIfU.
  7. Place the compute node back into its operating position.
    If the compute node was removed from the rack, place it back on the rails by using the “System Installation” illustration shown on the label on top of the compute node.
    Figure 10. Cover label illustrating system installation
    Cover label illustrating system installation
    1. Slide the compute node back into position in the rack enclosure and reconnect the power cables. Labels on the power cables indicate the rack position of the compute node where the cables should be installed.
    2. If required, verify the memory by using the rack-mounted console, as described in the next three steps.
    3. Connect the video, keyboard, and mouse of the rack-mounted console to the compute node.
    4. Press F1 while powering up the compute node to go into the system configuration utility.
    5. Select System Summary from the left menu and scroll down to view the DIMM information and verify that the DIMM Total Count and DIMM Total Capacity have the expected values.
    6. Power down the compute node.
    7. Reconnect the network cables. Labels on the network cables indicate the rack position of the compute node and the port where each cable should be installed; use them to ensure that every cable goes back to the correct place.
  8. Power up the compute node as follows:
    After a compute node has been successfully upgraded, power it up by using these steps:
    1. In the Manage resource window, select the Power up option from the Resource action list. (This may take a few minutes.)
    2. Take the compute node out of maintenance mode.
      Ensure that the success notification is displayed. (An optional OS-level cross-check of the new memory total is sketched below.)
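The F1 system-configuration check in step 7 is the documented way to verify the DIMM count and capacity. As an optional OS-level cross-check after the node is powered up and out of maintenance mode, the following Python sketch (assuming oc access; the node name is hypothetical and must be replaced with your own) prints the memory total that the operating system sees:

# Optional OS-level memory check via oc debug (a sketch, not a required step).
import subprocess

NODE = "compute-1-ru5"  # hypothetical node name; substitute the real one
out = subprocess.run(
    ["oc", "debug", f"node/{NODE}", "--", "chroot", "/host", "free", "-g"],
    capture_output=True, text=True, check=True,
).stdout
print(out)  # the "total" column should be close to 512, 1024, or 2048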
  9. Repeat the steps for the other compute nodes in the rack enclosure.
    Repeat from the step “Before powering off the compute node” for each of the compute nodes in the IBM Storage Fusion HCI System that is being upgraded.
  10. Verify the new configuration.
    Check the amount of memory that is shown in the management GUI for each compute node and compare it to the expected amount: 512GB (FC AHJK), 1024GB (RPQ 8S1881), or 2048GB (AHJN). (A scripted cross-check is sketched below.)
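As a scripted complement to the GUI check, the following Python sketch (an illustration only, assuming oc access) compares the memory capacity that OpenShift reports for every node against the expected upgrade totals. Kubernetes reports capacity in KiB and firmware reserves a small amount, so compare approximately:

# Cross-check the final configuration against the expected upgrade totals.
import json
import subprocess

EXPECTED_GB = (512, 1024, 2048)  # FC AHJK, RPQ 8S1881, AHJN

out = subprocess.run(
    ["oc", "get", "nodes", "-o", "json"],
    capture_output=True, text=True, check=True,
).stdout
for node in json.loads(out)["items"]:
    mem = node["status"]["capacity"]["memory"]        # for example "1056290048Ki"
    gb = int(mem.removesuffix("Ki")) / (1024 * 1024)  # KiB -> GiB
    print(f'{node["metadata"]["name"]}: ~{gb:.0f}GB (expected near one of {EXPECTED_GB})')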