Procedure for restore

This topic describes the procedure for restoring the data and the configuration on the recovery site.

Before you begin, ensure that the prerequisites are met and the preparation steps for the recovery site are performed. For more information, see Prerequisites for the recovery site.

Perform the following steps:

  1. Securely transfer (by using scp or other means) the cluster configuration backup files that were generated for each file system by the mmbackupconfig command on the primary site to a known location on the recovery site.
    To list the files on the primary site:
    
    [root@primary-site ~]# ls -l /root/powerleBillionBack_gpfs_tctbill*
    -rw-r--r--. 1 root root 19345 Feb 23 11:24 /root/powerleBillionBack_gpfs_tctbill1_02232018
    -rw-r--r--. 1 root root 19395 Feb 23 11:25 /root/powerleBillionBack_gpfs_tctbill3_02232018
    
    1. Transfer the cluster_backup_config files for each file system to the recovery cluster, as follows:
      Note: If NSD servers are used, then transfer the backups to one of them.
      
      [root@primary-site ~]# scp powerleBillionBack_gpfs_tctbill1_02232018 root@recovery-site-nsd-server-node:/temp/
      [root@primary-site ~]# scp powerleBillionBack_gpfs_tctbill3_02232018 root@recovery-site-nsd-server-node:/temp/
  2. File system configuration restore
    1. Create the file system configuration restore file restore_out_file for each file system on the recovery site, as follows:
      mmrestoreconfig <file-system> -i <cluster_backup_config_file> -F <restore_out_file>
      For example,
      
      [root@recovery-site-nsd-server-node ~]# mmrestoreconfig gpfs_tctbill1 -i 
      /temp/powerleBillionBack_gpfs_tctbill1_02232018 -F 
      ./powerleBillionRestore_gpfs_tctbill1_02232018
      
      mmrestoreconfig: Configuration file successfully created in 
      ./powerleBillionRestore_gpfs_tctbill1_02232018
      mmrestoreconfig: Command successfully completed
      
      The <restore_out_file> (powerleBillionRestore_gpfs_tctbill1_02232018 in this example) that the mmrestoreconfig command populates contains detailed stanzas for the NSDs as they exist on the primary site; these stanzas must be modified to match the NSD configuration on the recovery site. It also contains a detailed mmcrfs command that can be used to create the associated file system on the recovery site.
      Note: Disable quotas (remove the -Q yes option from the mmcrfs command) when you run it later in the process.
      Some excerpts from the restore_out_file (powerleBillionRestore_gpfs_tctbill1_02232018):
      ## *************************************************************
      ## Filesystem configuration file backup for file system: gpfs_tctbill1
      ## Date Created: Tue Mar  6 14:15:05 CST 2018
      ##
      ## The '#' character is the comment character.  Any parameter
      ## modified herein should have any preceding '#' removed.
      ## **************************************************************
      
      ######## NSD configuration ##################
      ## Disk descriptor format for the mmcrnsd command.
      ## Please edit the disk and desired name fields to match
      ## your current hardware settings.
      ##
      ## The user then can uncomment the descriptor lines and
      ## use this file as input to the -F option.
      #
      ...
      The file then lists all of the NSDs (15 in this case):
      # %nsd:
      #   device=DiskName
      #   nsd=nsd11
      #   usage=dataOnly
      #   failureGroup=1
      #   pool=system
      #
      # %nsd:
      #   device=DiskName
      #   nsd=nsd12
      #   usage=dataOnly
      #   failureGroup=1
      #   pool=system
      #
      
      etc....
      
      # %pool:
      #   pool=system
      #   blockSize=4194304
      #   usage=dataAndMetadata
      #   layoutMap=scatter
      #   allowWriteAffinity=no
      #
      ######### File system configuration #############
      ## The user can use the predefined options/option values
      ## when recreating the filesystem.  The option values
      ## represent values from the backed up filesystem.
      #
      # mmcrfs FS_NAME NSD_DISKS -i 4096 -j scatter -k nfs4 -n 100 -B 4194304 -Q yes
       --version 5.0.0.0 -L 33554432 -S relatime -T /ibm/gpfs_tctbill1 --inode-limit 
       407366656:307619840
      #
      # When preparing the file system for image restore, quota
      # enforcement must be disabled at file system creation time.
      # If this is not done, the image restore process will fail.
      ...
      ####### Disk Information #######
      ## Number of disks 15
      ## nsd11 991486976
      ## nsd12 991486976
      etc....
      ## nsd76 1073741824
      
      ## Number of Storage Pools 1
      ## system 15011648512
      etc...
      ###### Policy Information ######
      ## /* DEFAULT */
      ## /* Store data in the first data pool or system pool */
      
      ########## Fileset Information #####
      ## NFS_tctbill1 Linked /ibm/gpfs_tctbill1/NFS_tctbill1 off Comment:
      etc...
      
      ###### Quota Information ######
      ## Type Name USR root
      etc...
      
    2. Modify the restore_out_file to match the NSD configuration on the recovery site. An example portion of the modified NSD stanzas in the restore_out_file is as follows:
      %nsd:
         device=/dev/mapper/mpaths
         nsd=nsd47
         servers=nsdServer1,nsdServer2
         usage=dataOnly
         failureGroup=1
         pool=system

      %nsd:
         device=/dev/mapper/mpatht
         nsd=nsd48
         servers=nsdServer2,nsdServer1
         usage=dataOnly
         failureGroup=1
         pool=system

      %nsd:
         device=/dev/mapper/mpathbz
         nsd=nsd49
         servers=nsdServer1,nsdServer2
         usage=metadataOnly
         failureGroup=1
         pool=system
      
  3. Create recovery-site NSDs if necessary.
    1. Use the newly modified restore_out_file (saved as powerleBillionRestore_gpfs_tctbill1_02232018_nsd in this example) to create the NSDs on the recovery cluster. If NSD servers are in use, run this command from an NSD server node:
      
      [root@recovery-site-nsd-server-node ~]# mmcrnsd -F 
      /temp/powerleBillionRestore_gpfs_tctbill1_02232018_nsd
      mmcrnsd: Processing disk mapper/mpathq
      etc...
      mmcrnsd: Processing disk mapper/mpathcb
      mmcrnsd: Propagating the cluster configuration data to all
      affected nodes.  This is an asynchronous process.
      
    2. Repeat the mmcrnsd command appropriately for each file system that you want to recover.
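      For example, a minimal sketch of the repeated command for the second file system, assuming that its modified NSD stanzas were saved as powerleBillionRestore_gpfs_tctbill3_02232018_nsd (a hypothetical name that follows the naming pattern used above):
      
      [root@recovery-site-nsd-server-node ~]# mmcrnsd -F /temp/powerleBillionRestore_gpfs_tctbill3_02232018_nsd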
  4. Create recovery-site file systems if necessary.
    1. Use the same modified restore_out_file (powerleBillionRestore_gpfs_tctbill1_02232018_nsd in this example) as input to the mmcrfs command, which creates the file system. The following example is based on the mmcrfs command included in the <restore_out_file>; note that the -Q yes option has been removed:
      
      [root@recovery-site-nsd-server-node ~]# mmcrfs gpfs_tctbill1 -F 
      /temp/powerleBillionRestore_gpfs_tctbill1_02232018_nsd
      -i 4096 -j scatter -k nfs4 -n 100 -B 4194304  --version 5.0.0.0 -L 33554432 -S
      relatime -T /ibm/gpfs_tctbill1 --inode-limit 407366656:307619840
      
    2. Repeat the mmcrfs command appropriately for each file system that you want to recover.
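      For example, a sketch of the repeated command for the second file system, again assuming the hypothetical stanza file powerleBillionRestore_gpfs_tctbill3_02232018_nsd and a mount point of /ibm/gpfs_tctbill3; take the remaining options (block size, inode limits, and so on) from that file system's own restore_out_file, with the -Q yes option removed:
      
      [root@recovery-site-nsd-server-node ~]# mmcrfs gpfs_tctbill3 \
        -F /temp/powerleBillionRestore_gpfs_tctbill3_02232018_nsd \
        -j scatter -k nfs4 -T /ibm/gpfs_tctbill3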
  5. Cloud services configuration restore (download SOBAR backup from the cloud for the file system).
    1. Securely transfer the Cloud services configuration backup file to the desired location on the recovery site by using scp or another secure method, as in the sketch that follows.
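      A minimal sketch, assuming the Cloud services configuration backup file is TCT_backupConfig_20180306_123302.tar (the file that is used in the download example below) and that it resides in /root on the primary site (an assumed location):
      
      [root@primary-site ~]# scp /root/TCT_backupConfig_20180306_123302.tar root@recovery-site-tct-node:/temp/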
    2. From the appropriate Cloud services server node on the recovery site (a node from the recovery Cloud services node class), download the SOBAR backup tar file by using the mcstore_sobar_download.sh script, which is located in the /opt/ibm/MCStore/scripts folder on your Cloud services node.
      Note: Make sure that your local_backup_dir is mounted and has sufficient space to accommodate the SOBAR backup file. A GPFS file system is recommended.
      
      Usage: mcstore_sobar_download.sh <tct_config_backup_path> <sharing_container_pairset_name> 
      <node-class-name> <sobar_backup_tar_name> <local_backup_dir>
      For example,
      
      [root@recovery-site-tct-node scripts]# ./mcstore_sobar_download.sh 
      /temp/TCT_backupConfig_20180306_123302.tar powerleSOBAR1 TCTNodeClassPowerLE 
      9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar /ibm/gpfs_tct_SOBAR1/
      
      
      You are about to restore the TCT Configuration settings to the CCR.
      Any new settings since the backup was made will be lost.
      The TCT servers should be stopped prior to this operation.
      
      Do you want to continue and restore the TCT cluster configuration?
      Enter "yes" to continue: yes
      
      mmcloudgateway: Unpacking the backup config tar file...
      mmcloudgateway: Completed unpacking the tar file.
      
      Restoring the Files:
      [mmcloudgateway.conf  -  Restored]
      [_tctkeystore.jceks  -  Restored]
      [_tctnodeclasspowerle.settings  -  Restored to version 96]
      
      mmcloudgateway: TCT Config files were restored to the CCR.
      mmcloudgateway: Command completed.
      mmcloudgateway: Sending the command to node recovery-site-tct-node.
      Stopping the Transparent Cloud Tiering service.
      mmcloudgateway: The command completed on node recovery-site-tct-node.
      
      mmcloudgateway: Sending the command to node recovery-site-tct-node2.
      Stopping the Transparent Cloud Tiering service.
      mmcloudgateway: The command completed on node recovery-site-tct-node2.
      mmcloudgateway: Command completed.
      
      etc...
      mmcloudgateway: Sending the command to node recovery-site-tct-node.
      Starting the Transparent Cloud Tiering service...
      mmcloudgateway: The command completed on node recovery-site-tct-node.
      
      etc...
      
      Making sure Transparent Service to start on all nodes.
      Please wait as this will take some time..
      
      Downloading 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar from cloud. 
      This will take some time based on the size of the backup file. 
      Please wait until download completes..
      Download of 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar from cloud
      completed successfully.
      Moving Backup tar 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar under 
      /ibm/gpfs_tct_SOBAR1/
      Note: Before running mcstore_sobar_restore.sh to restore the file system metadata,
      make sure that file system to be restored is clean and never been mounted for write.
  6. File system configuration restore (restore the file system configuration, including filesets and policies, on the recovery site)
    Note: If your temporary restore staging space is on a Cloud services managed file system, then you will have to delete and recreate this Cloud services managed file system at this point.
    1. Restore the policies and filesets for each file system by using the mmrestoreconfig command.
      Usage: mmrestoreconfig Device -i InputFile --image-restore
      For example,
      
      [root@recovery-site-tct-node ]# mmrestoreconfig gpfs_tctbill1 -i 
      /temp/powerleBillionBack_gpfs_tctbill1_02232018 --image-restore
      
      --------------------------------------------------------
      Configuration restore of gpfs_tctbill1 begins at Fri Mar 16 05:48:06 CDT 2018.
      --------------------------------------------------------
      mmrestoreconfig: Checking disk settings for gpfs_tctbill1:
      mmrestoreconfig: Checking the number of storage pools defined for gpfs_tctbill1.
      mmrestoreconfig: Checking storage pool names defined for gpfs_tctbill1.
      mmrestoreconfig: Checking storage pool size for 'system'.
      
      mmrestoreconfig: Checking filesystem attribute configuration for gpfs_tctbill1:
      
      mmrestoreconfig: Checking fileset configurations for gpfs_tctbill1:
      Fileset NFS_tctbill1 created with id 1 root inode 536870915.
      Fileset NFS_tctbill1_bkg created with id 2 root inode 1073741827.
      Fileset NFS_tctbill1_bkg1 created with id 3 root inode 1610612739.
      
      mmrestoreconfig: Checking policy rule configuration for gpfs_tctbill1:
      Restoring backed up policy file.
      Validated policy 'policyfile.backup':
      Policy `policyfile.backup' installed and broadcast to all nodes.
      mmrestoreconfig: Command successfully completed
      
      
  7. Restore the file system metadata by using the mcstore_sobar_restore.sh script, which is located in the /opt/ibm/MCStore/scripts folder.
    1. The mcstore_sobar_restore.sh script does the following:
      • Untars the sobar_backup_file
      • Stops the Cloud services for the specified node class
      • Unmounts the recovery file system and re-mounts read-only
      • Restores the recovery file system image
      • Re-mounts the recovery file system in read/write
      • Enables and restarts Cloud services
      • Executes the file curation policy, which changes objects from Co-resident state to Non-resident state
      • Rebuilds the Cloud services database files if you choose to do so
      Note: If the Cloud directory points to another file system, make sure that the file system is mounted correctly before you run the restore script with the rebuildDB parameter set to yes.
      
      [root@recovery-site-tct-node scripts]# ./mcstore_sobar_restore.sh 
      /ibm/gpfs_tct_SOBAR1/9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar gpfs_tctbill1
      TCTNodeClassPowerLE yes /ibm/gpfs_tct_SOBAR1 >> /root/status.txt
      
      etc...
      
      [I] RESTORE:[I] This task restored 1310720 inodes
      [I] A total of 307 PDRs from filelist /dev/null have been processed; 0 'skipped' records
      and/or errors.
      [I] Finishing restore with conclude operations.
      [I] CONCLUDE:[I] Starting image restore pipeline
      [I] A total of 307 files have been migrated, deleted or processed by an 
      EXTERNAL EXEC/script;
      0 'skipped' files and/or errors.
      Fri Mar 16 17:20:29 CDT 2018: mmumount: Unmounting file systems ...
      Fri Mar 16 17:20:33 CDT 2018: mmmount: Mounting file systems ...
      
      etc.....
      
      Running file curation policy and converting co-resident files to Non resident.
      This will take some time. Please wait until this completes..
      
      [I] GPFS Current Data Pool Utilization in KB and %
      Pool_Name                   KB_Occupied        KB_Total  Percent_Occupied
      system                           327680     12287688704      0.002666734%
      [I] 307944153 of 410512384 inodes used: 75.014583%.
      [I] Loaded policy rules from /opt/ibm/MCStore/samples/CoresToNonres.sobar.template.
      Evaluating policy rules with CURRENT_TIMESTAMP = 2018-03-16@22:24:28 UTC
      Parsed 2 policy rules.
      
      etc...
      Completed file curation policy execution of converting co-resident files to 
      Non resident files.
      running rebuild db for all the tiering containers for the given file system : gpfs_tctbill1
      Running rebuild db for container pairset : powerlebill1spill2 and File System: gpfs_tctbill1
      mmcloudgateway: Command completed.
      Running rebuild db for container pairset : powerlebill1spill1 and File System: gpfs_tctbill1
      mmcloudgateway: Command completed.
      Running rebuild db for container pairset : powerlebill1 and File System: gpfs_tctbill1
      
      etc...
      
    2. Repeat the mcstore_sobar_restore.sh script appropriately for each file system that you want to recover.
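      For example, a sketch for the second file system, assuming that its SOBAR backup tar was already downloaded to /ibm/gpfs_tct_SOBAR1 in step 5 (replace the placeholder with the actual tar file name):
      
      [root@recovery-site-tct-node scripts]# ./mcstore_sobar_restore.sh \
        /ibm/gpfs_tct_SOBAR1/<sobar_backup_tar_for_gpfs_tctbill3> gpfs_tctbill3 \
        TCTNodeClassPowerLE yes /ibm/gpfs_tct_SOBAR1 >> /root/status_tctbill3.txt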
  8. Enable Cloud services maintenance operations on the appropriate node class being restored on the recovery site. For more information, see Configuring the maintenance windows.
  9. Enable all Cloud services migration policies on the recovery site by using the --transparent-recalls {ENABLE} option in the mmcloudgateway containerPairSet update command. For more information, see Binding your file system or fileset to the Cloud service by creating a container pair set.
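    For example, a minimal sketch for one of the container pair sets shown earlier (powerlebill1); the exact option names and ordering are assumptions here and should be verified against the linked topic:
    
    [root@recovery-site-tct-node ~]# mmcloudgateway containerPairSet update \
      --cloud-nodeclass TCTNodeClassPowerLE --container-pair-set-name powerlebill1 \
      --transparent-recalls ENABLE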