Adding building blocks to an existing ESS cluster

A building block is made up of a pair of Power® 8 servers and one or more storage enclosures. To add building blocks to an ESS cluster, perform the following steps. You can add more than one building block at a time.

Attention: Be careful when running this procedure on a production system. Failure to follow the listed procedure might result in re-deployment of a running I/O server node, loss of quorum, and/or data loss. Be especially careful before running gssdeploy -d or gsscheckdisks. It is always good practice to back up the xCAT database before starting this procedure, and for the customer to back up their data if possible.
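For example, the xCAT database can be backed up with the dumpxCATdb command; the backup directory shown here is only an example:
  dumpxCATdb -p /var/tmp/xcatdb.backup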
  1. Power on and boot the new Power 8 servers within the building block(s). Do not power up the associated storage enclosures.

    For example, a GL3C building block is made up of two Power 8 (PPC64LE) servers and three 4U106 enclosures. At this step, power on only the servers but do not apply power to the enclosures.

  2. Add the new building block to the /etc/hosts file.
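    For example, entries similar to the following might be added for the management and high-speed (-hs) interfaces of the new I/O server nodes; the host names and addresses here are examples only:
    192.168.45.23   gssio3.example.com      gssio3
    192.168.45.24   gssio4.example.com      gssio4
    10.10.0.23      gssio3-hs.example.com   gssio3-hs
    10.10.0.24      gssio4-hs.example.com   gssio4-hs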
  3. Find the serial numbers of the new building block. The subnet and mask are typically 10.0.0.1/24 by default; however, verify against the customer's FSP network.
    /var/tmp/gssdeploy -f subnet/mask
  4. Find the rack positions that are used by the GUI.
    /var/tmp/gssdeploy -i
  5. Update gssdeploy.cfg to make the following changes, as shown in the example excerpt after this list.
    • Change DEPLOYMENT_TYPE to ADD_BB.
    • Change GSS_GROUP to something other than gss_ppc64 or ces_ppc64. For example, new_bb.
    • Add the new serial numbers and node names to the gssdeploy.cfg file.
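    For example, after these changes the relevant lines in gssdeploy.cfg might look similar to the following excerpt; the exact quoting and the variable names that carry the serial numbers and node names depend on the gssdeploy.cfg template shipped with your release:
    DEPLOYMENT_TYPE="ADD_BB"
    GSS_GROUP="new_bb"
    # Add the serial numbers and node names of the new building block to the
    # corresponding entries provided by the template.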
  6. Run gssdeploy -o to add the new building block to xCAT.
    /var/tmp/gssdeploy -o
  7. Run gssprecheck on the new node group and address any issues.
    /opt/ibm/gss/tools/samples/gssprecheck -G new_bb --install --file /var/tmp/gssdeploy.cfg
  8. Run gssdeploy -d to deploy the new building block.
    /var/tmp/gssdeploy -d
  9. After about 30 minutes, run health checks on the new building block and address any issues.
    1. gssstoragequickcheck -G new_bb
    2. gssfindmissingdisks -G new_bb
    3. GSSENV=INSTALL gsscheckdisks -G new_bb --encl all --iotest a --write-enable
  10. Perform the steps in this procedure: Set up the high-speed network.
  11. Run gssinstallcheck on the new group. Ignore any GPFS-related settings for now.
    xdsh new_bb "/opt/ibm/gss/tools/bin/gssinstallcheck -N localhost" | xcoll -n
  12. Add the new nodes to the cluster.
    gssaddnode -G new_bb --suffix=-hs --accept-license --nodetype gss --cluster-node low_speed_name_of_cluster_node
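    You can optionally confirm that the new nodes have joined the cluster by running the mmlscluster command:
    mmlscluster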
  13. Create recovery groups.
    mmvdisk server configure --nc new_node_class --recycle one ; mmvdisk rg create --rg newrg1,newrg2 --nc new_node_class
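    You can optionally verify that the new recovery groups were created, for example:
    mmvdisk recoverygroup list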
  14. Create vdisks and NSDs. Make sure that you match the block size and the RAID code.
    mmvdisk vs define --vs new_vdisk_set --rg newrg1,newrg2 --code 8+2p --bs 8M --ss 100% --nsd-usage dataAndMetadata --sp system
    This is an example command. For best practices, see mmvdisk documentation in IBM Spectrum Scale RAID: Administration.
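    If the defined vdisk set has not already been created as part of your workflow, create it before adding it to a file system in a later step, for example:
    mmvdisk vs create --vs new_vdisk_set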
  15. Reboot both new I/O server nodes.
    xdsh new_bb "systemctl reboot"

    When both nodes are back up, you can start GPFS and move on to the next step.
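    For example, GPFS can be started on the new I/O server nodes with the mmstartup command; the node names here are examples:
    mmstartup -N gssio3-hs,gssio4-hs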

  16. Add NSDs to an existing file system. For example:
    mmvdisk fs add --file-system existing_filesystem --vdisk-set new_vdisk_set
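    You can optionally verify the vdisk sets of the file system afterward, for example:
    mmvdisk fs list --file-system existing_filesystem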
  17. Run restripe if needed.
    mmrestripefs existing_filesystem -b
  18. Update the performance monitoring list by using the mmchnode command. For more information, see mmchnode command.
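    For example, the new nodes can be added to performance monitoring as follows; the node names here are examples:
    mmchnode --perfmon -N gssio3-hs,gssio4-hs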
  19. Put the new nodes back in the gss_ppc64 node group and delete the temporary group, and then comment out the GSS_GROUP line in gssdeploy.cfg.
    For example, add nodes to gss_ppc64 as follows.
    chdef -t group gss_ppc64 -o gssio3,gssio4 -p
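    The temporary group can then be deleted, for example:
    rmdef -t group -o new_bb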
  20. Update call home for new building blocks.
    gsscallhomeconf -E ems1 -N ems1,gss_ppc64 --suffix=-hs --register=all
Note: Now that you have additional building blocks, ensure that file system metadata replication is enabled (-m 2). The new pair of NSDs should be in a new failure group. The maximum number of failure groups is 5.
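For example, you can check the current and maximum metadata replication settings and then raise the default. The file system name is a placeholder; enabling -m 2 requires the maximum metadata replicas (-M) to be at least 2, and existing metadata is not replicated until a restripe is run:
  mmlsfs existing_filesystem -m -M
  mmchfs existing_filesystem -m 2
  mmrestripefs existing_filesystem -R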