IBM Storage Scale System 3200 MES upgrade

Beginning with IBM Storage Scale System 3200 release 5.3.2, specific IBM Storage Scale System 3200 building blocks can be grown by adding one or two enclosures. This is called an MES (miscellaneous equipment specification) upgrade, and it is performed with the mmvdisk recoverygroup resize command.

Important: The MES upgrade of an IBM Storage Scale System 3200 building block requires the participation of IBM® Service because it involves the careful coordination of IBM Storage Scale System 3200 hardware and IBM Storage Scale RAID software activity. This topic is limited to outlining how the mmvdisk command facilitates the IBM Storage Scale RAID part of the process. For more information about the integrated hardware and software aspects of an IBM Storage Scale System 3200 MES upgrade, see Deploying the Elastic Storage Server.
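The mmvdisk command accepts unambiguous abbreviations of its verbs and options (for example, recoverygroup as rg, vdiskset as vs, filesystem as fs, and --recovery-group as --rg); the examples in this topic use the abbreviated forms. As a sketch, using the example recovery group pair from this topic, the general form of the resize command is:
# mmvdisk recoverygroup resize --recovery-group BB03L,BB03R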
The five permitted MES upgrade paths are listed in Table 1:
Table 1. MES upgrade paths

Existing building block  MES upgrade building block  Enclosures added
-----------------------  --------------------------  -----------------------
GS1S                     GS2S                        One 5147-024 enclosure
GS2S                     GS4S                        Two 5147-024 enclosures
GL1S                     GL2S                        One 5147-084 enclosure
GL2S                     GL4S                        Two 5147-084 enclosures
GL4S                     GL6S                        Two 5147-084 enclosures

The new enclosures must be the same model as the existing enclosures and must contain disks of the same type and size.

Multiple MES upgrades must be performed in increments. For example, it is not possible to go directly from a GS1S to a GS4S; the intermediate MES upgrade from GS1S to GS2S must be completed before the GS2S to GS4S MES upgrade can be performed.

The following example assumes that there is a GL2S building block with a nearly full file system (fs1) that uses all the GL2S capacity.

The vdisk set sizing for the file system vdisk set and the GL2S recovery groups shows no unclaimed space in the declustered arrays:
# mmvdisk vs list --vs fs1

                     member vdisks
vdisk set       count   size   raw size  created  file system and attributes
--------------  ----- -------- --------  -------  --------------------------
fs1                 2  350 TiB  483 TiB  yes      fs1, DA1, 8+3p, 16 MiB, dataAndMetadata, system


                declustered                 capacity            all vdisk sets defined
recovery group     array     type  total raw  free raw  free%  in the declustered array
--------------  -----------  ----  ---------  --------  -----  ------------------------
BB03L           DA1          HDD     483 TiB    28 GiB     0%  fs1
BB03R           DA1          HDD     483 TiB    28 GiB     0%  fs1

                  vdisk set map memory per server
node class  available  required  required per vdisk set
----------  ---------  --------  ----------------------
BB03          111 GiB  8925 MiB  fs1 (8540 MiB)

It is decided to double the size of the recovery groups with a GL2S to GL4S MES upgrade and to add the new capacity to the existing file system.

Before the new enclosures are connected and verified, the user declustered array DA1 in each of the GL2S recovery groups shows 83 pdisks and spare space equivalent to two pdisks:
# mmvdisk rg list --rg BB03L --da

declustered   needs    vdisks      pdisks     replace        capacity        scrub
   array     service  user log  total spare  threshold  total raw free raw  duration  background task
-----------  -------  ---- ---  ----- -----  ---------  --------- --------  --------  ---------------
NVR          no          0   1      2     0          1          0        0   14 days  scrub (84%)
SSD          no          0   1      1     0          1          0        0   14 days  scrub (4%)
DA1          no          1   1     83     2          2    483 TiB   28 GiB   14 days  scrub (37%)

mmvdisk: Total capacity is the raw space before any vdisk set definitions.
mmvdisk: Free capacity is what remains for additional vdisk set definitions.

IBM Service installs the two new enclosures following the GL2S to GL4S MES upgrade procedure.

When the new enclosures have been verified to be correctly installed and configured on the servers, IBM Service performs the IBM Storage Scale RAID software part of the procedure.

First, it is verified that the GL2S recovery groups do not have any failed pdisks:
# mmvdisk pdisk list --rg BB03L,BB03R --not-ok
mmvdisk: All pdisks of the specified recovery groups are ok.
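If this check reports failed pdisks, they must be replaced before the resize. A minimal sketch of the replacement command, using a hypothetical pdisk name:
# mmvdisk pdisk replace --recovery-group BB03L --pdisk e1s05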
The servers for the GL2S building block must now see a perfect match for the newly connected and upgraded GL4S disk topology:
# mmvdisk server list --disk-topology --nc BB03

 node                                       needs    matching
number  server                            attention   metric   disk topology
------  --------------------------------  ---------  --------  -------------
     1  server05.gpfs.net                 no          100/100  ESS GL4S
     2  server06.gpfs.net                 no          100/100  ESS GL4S
When the new building block configuration is seen perfectly by the servers, IBM Service will "resize" the recovery group pair to use the upgraded GL4S capacity:
# mmvdisk rg resize --rg BB03L,BB03R
mmvdisk: Obtaining pdisk information for recovery group 'BB03L'.
mmvdisk: Obtaining pdisk information for recovery group 'BB03R'.
mmvdisk: Analyzing disk topology for node 'server05.gpfs.net'.
mmvdisk: Analyzing disk topology for node 'server06.gpfs.net'.
mmvdisk: Validating existing pdisk locations for recovery group 'BB03L'.
mmvdisk: Validating existing pdisk locations for recovery group 'BB03R'.
mmvdisk: The resized server disk topology is 'ESS GL4S'.
mmvdisk: Validating declustered arrays for recovery group 'BB03L'.
mmvdisk: Validating declustered arrays for recovery group 'BB03R'.
mmvdisk: Adding new pdisks to recovery group 'BB03L'.
mmvdisk: Updating declustered array attributes for recovery group 'BB03L'.
mmvdisk: Adding new pdisks to recovery group 'BB03R'.
mmvdisk: Updating declustered array attributes for recovery group 'BB03R'.
mmvdisk: Successfully resized recovery groups 'BB03L' and 'BB03R'.
After the recovery group pair is resized, the user declustered array DA1 in each of the recovery groups has 167 pdisks with spare space equivalent to four pdisks, which is the standard for GL4S. IBM Storage Scale RAID also begins rebalancing vdisk data across the new DA1 capacity:
# mmvdisk rg list --rg BB03L --da

declustered   needs    vdisks      pdisks     replace        capacity        scrub
   array     service  user log  total spare  threshold  total raw free raw  duration  background task
-----------  -------  ---- ---  ----- -----  ---------  --------- --------  --------  ---------------
NVR          no          0   1      2     0          1          0        0   14 days  scrub (88%)
SSD          no          0   1      1     0          1          0        0   14 days  scrub (4%)
DA1          no          1   1    167     4          2    972 TiB  489 TiB   14 days  rebalance (0%)

mmvdisk: Total capacity is the raw space before any vdisk set definitions.
mmvdisk: Free capacity is what remains for additional vdisk set definitions.
The doubling in size of the recovery group pair means that there is now 50% free capacity for the creation of additional vdisk sets to increase the size of file system fs1:
# mmvdisk vs list --vs all

                     member vdisks
vdisk set       count   size   raw size  created  file system and attributes
--------------  ----- -------- --------  -------  --------------------------
fs1                 2  350 TiB  483 TiB  yes      fs1, DA1, 8+3p, 16 MiB, dataAndMetadata, system

                declustered                 capacity            all vdisk sets defined
recovery group     array     type  total raw  free raw  free%  in the declustered array
--------------  -----------  ----  ---------  --------  -----  ------------------------
BB03L           DA1          HDD     972 TiB   489 TiB    50%  fs1
BB03R           DA1          HDD     972 TiB   489 TiB    50%  fs1

                  vdisk set map memory per server
node class  available  required  required per vdisk set
----------  ---------  --------  ----------------------
BB03          111 GiB  8925 MiB  fs1 (8540 MiB)

To double the size of file system fs1, a new vdisk set can be defined, created, and added to it. The mmvdisk vdiskset define command has an option to define a new vdisk set by copying the definition of an existing vdisk set. Copying ensures that the new vdisk set has vdisk NSDs of exactly the same size as those already in the file system.
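In outline, the sequence is as follows; each step is shown in detail below, using the vdisk set names from this example:
# mmvdisk vs rename --vs fs1 --new-name fs1.vs1
# mmvdisk vs define --vs fs1.vs2 --copy fs1.vs1 --force-incompatible
# mmvdisk vs create --vs fs1.vs2
# mmvdisk fs add --fs fs1 --vs fs1.vs2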

To prepare for making a copy, the existing vdisk set is first renamed from fs1 to fs1.vs1:
# mmvdisk vs rename --vs fs1 --new-name fs1.vs1
mmvdisk: Vdisk set 'fs1' renamed to 'fs1.vs1'.
Copying a vdisk set definition checks that the declustered array for the copy is compatible with the declustered array from the original definition. This leads to a seemingly paradoxical situation: the attempt to copy vdisk set fs1.vs1 to a new vdisk set fs1.vs2 in the same recovery groups and declustered arrays is found to be incompatible:
# mmvdisk vs define --vs fs1.vs2 --copy fs1.vs1
mmvdisk: Declustered array 'DA1' in recovery group 'BB03L' is incompatible with
mmvdisk: recovery group 'BB03L' in vdisk set 'fs1.vs1' because:
mmvdisk:   Different pdisks per server in the declustered array, 167 versus 83.
mmvdisk: Declustered array 'DA1' in recovery group 'BB03R' is incompatible with
mmvdisk: recovery group 'BB03L' in vdisk set 'fs1.vs1' because:
mmvdisk:   Different pdisks per server in the declustered array, 167 versus 83.
mmvdisk: Command failed. Examine previous error messages to determine cause.
This is because the original vdisk set was defined when there were only 83 pdisks in DA1, and the copy is being made after DA1 was resized to 167 pdisks. The --force-incompatible flag may be used to allow the copy to be made even though the number of pdisks is now different:
# mmvdisk vs define --vs fs1.vs2 --copy fs1.vs1 --force-incompatible
mmvdisk: Vdisk set 'fs1.vs2' has been defined.
mmvdisk: Recovery group 'BB03L' has been defined in vdisk set 'fs1.vs2'.
mmvdisk: Recovery group 'BB03R' has been defined in vdisk set 'fs1.vs2'.

                     member vdisks
vdisk set       count   size   raw size  created  file system and attributes
--------------  ----- -------- --------  -------  --------------------------
fs1.vs2             2  350 TiB  483 TiB  no       -, DA1, 8+3p, 16 MiB, dataAndMetadata, system

                declustered                 capacity            all vdisk sets defined
recovery group     array     type  total raw  free raw  free%  in the declustered array
--------------  -----------  ----  ---------  --------  -----  ------------------------
BB03L           DA1          HDD     972 TiB  6533 GiB     0%  fs1.vs1, fs1.vs2
BB03R           DA1          HDD     972 TiB  6533 GiB     0%  fs1.vs1, fs1.vs2

                  vdisk set map memory per server
node class  available  required  required per vdisk set
----------  ---------  --------  ----------------------
BB03          111 GiB    17 GiB  fs1.vs1 (8540 MiB), fs1.vs2 (8540 MiB)

Together, the original vdisk set fs1.vs1 and the copied vdisk set fs1.vs2 use essentially all of the space in the resized declustered arrays.

The copied vdisk set can then be created and added to file system fs1, doubling its size:
# mmvdisk vs create --vs fs1.vs2
mmvdisk: 2 vdisks and 2 NSDs will be created in vdisk set 'fs1.vs2'.
mmvdisk: (mmcrvdisk) [I] Processing vdisk RG005VS002
mmvdisk: (mmcrvdisk) [I] Processing vdisk RG006VS002
mmvdisk: Created all vdisks in vdisk set 'fs1.vs2'.
mmvdisk: (mmcrnsd) Processing disk RG005VS002
mmvdisk: (mmcrnsd) Processing disk RG006VS002
mmvdisk: Created all NSDs in vdisk set 'fs1.vs2'.
# mmvdisk fs add --fs fs1 --vs fs1.vs2
mmvdisk: The following disks of fs1 will be formatted on node fsmgr01:
mmvdisk:     RG005VS002: size 367003904 MB
mmvdisk:     RG006VS002: size 367003904 MB
mmvdisk: Extending Allocation Map
mmvdisk: Checking Allocation Map for storage pool system
mmvdisk:   47 % complete on Tue Nov  6 22:22:27 2018
mmvdisk:   51 % complete on Tue Nov  6 22:22:32 2018
mmvdisk:   94 % complete on Tue Nov  6 22:22:37 2018
mmvdisk:  100 % complete on Tue Nov  6 22:22:37 2018
mmvdisk: Completed adding disks to file system fs1.
# mmvdisk fs list --fs fs1

                                vdisk     list of       holds    holds  storage
vdisk set       recovery group  count  failure groups  metadata  data    pool
--------------  --------------  -----  --------------  --------  -----  -------
fs1.vs1         BB03L               1               1  yes       yes    system
fs1.vs1         BB03R               1               2  yes       yes    system
fs1.vs2         BB03L               1               1  yes       yes    system
fs1.vs2         BB03R               1               2  yes       yes    system

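The added capacity can also be confirmed from the file system's point of view with standard IBM Storage Scale commands. A sketch, assuming file system fs1: mmdf reports per-disk and total capacity, and mmrestripefs with the -b option can optionally rebalance existing file data across the old and new vdisk NSDs (whether to restripe depends on the installation and workload):
# mmdf fs1
# mmrestripefs fs1 -b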