Procedure for restore
This topic describes the procedure for restoring data and the configuration.
Before you begin, ensure that the prerequisites are met and the preparation steps for the
recovery site are performed. For more information, see Prerequisites for the recovery site.
Perform the following steps:
- Securely transfer (by using scp or other means) the cluster configuration backup files for each file system that were generated by the mmbackupconfig command on the primary site to a known location on the recovery site.
To list the files on the primary site:
[root@primary-site ~]# ls -l /root/powerleBillionBack_gpfs_tctbill*
-rw-r--r--. 1 root root 19345 Feb 23 11:24 /root/powerleBillionBack_gpfs_tctbill1_02232018
-rw-r--r--. 1 root root 19395 Feb 23 11:25 /root/powerleBillionBack_gpfs_tctbill3_02232018
- Transfer the cluster_backup_config files for each file system to the recovery cluster, as follows:
Note: If NSD servers are used, then transfer the backups to one of them.
[root@primary-site ~]# scp powerleBillionBack_gpfs_tctbill1_02232018 root@recovery-site-nsd-server-node:/temp/powerleBillionBack_gpfs_tctbill1_02232018
[root@primary-site ~]# scp powerleBillionBack_gpfs_tctbill3_02232018 root@recovery-site-nsd-server-node:/temp/powerleBillionBack_gpfs_tctbill3_02232018
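Optionally, you can confirm that each backup file arrived intact by comparing checksums on the two sites. This check is not part of the procedure itself; it is a minimal sketch that assumes the files were copied to /temp/ as in the example above.
# On the primary site
[root@primary-site ~]# sha256sum /root/powerleBillionBack_gpfs_tctbill*
# On the recovery site; the sums must match the primary-site values
[root@recovery-site-nsd-server-node ~]# sha256sum /temp/powerleBillionBack_gpfs_tctbill*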
- File system configuration restore
- Create the file system configuration restore file restore_out_file for each file system on the recovery site, as follows:
mmrestoreconfig <file-system> -i <cluster_backup_config_file> -F <restore_out_file>
For example:
[root@recovery-site-nsd-server-node ~]# mmrestoreconfig gpfs_tctbill1 -i /temp/powerleBillionBack_gpfs_tctbill1_02232018 -F ./powerleBillionRestore_gpfs_tctbill1_02232018
The <restore_out_file> (powerleBillionRestore_gpfs_tctbill1_02232018 in this example) that is populated by the mmrestoreconfig command contains detailed stanzas for the NSDs as they are on the primary site (these must be modified to match the NSD configuration on the recovery site). It also contains a detailed mmcrfs command that can be used to create the associated file system on the recovery site.
mmrestoreconfig: Configuration file successfully created in ./powerleBillionRestore_gpfs_tctbill1_02232018
mmrestoreconfig: Command successfully completed
Note: Disable quotas (remove the -Q yes option from this command) when you run it later in the process.
Some excerpts from the restore_out_file (powerleBillionRestore_gpfs_tctbill1_02232018 in this example) follow:
## *************************************************************
## Filesystem configuration file backup for file system: gpfs_tctbill1
## Date Created: Tue Mar 6 14:15:05 CST 2018
##
## The '#' character is the comment character. Any parameter
## modified herein should have any preceding '#' removed.
## **************************************************************

######## NSD configuration ##################
## Disk descriptor format for the mmcrnsd command.
## Please edit the disk and desired name fields to match
## your current hardware settings.
##
## The user then can uncomment the descriptor lines and
## use this file as input to the -F option.
#
# ... Then it lists all the nsds (15 of them in this case):
# %nsd:
#   device=DiskName
#   nsd=nsd11
#   usage=dataOnly
#   failureGroup=1
#   pool=system
#
# %nsd:
#   device=DiskName
#   nsd=nsd12
#   usage=dataOnly
#   failureGroup=1
#   pool=system
# etc....
#
# %pool:
#   pool=system
#   blockSize=4194304
#   usage=dataAndMetadata
#   layoutMap=scatter
#   allowWriteAffinity=no
#
######### File system configuration #############
## The user can use the predefined options/option values
## when recreating the filesystem. The option values
## represent values from the backed up filesystem.
#
# mmcrfs FS_NAME NSD_DISKS -i 4096 -j scatter -k nfs4 -n 100 -B 4194304 -Q yes --version 5.0.0.0 -L 33554432 -S relatime -T /ibm/gpfs_tctbill1 --inode-limit 407366656:307619840
#
# When preparing the file system for image restore, quota
# enforcement must be disabled at file system creation time.
# If this is not done, the image restore process will fail.
...
####### Disk Information #######
## Number of disks 15
## nsd11 991486976
## nsd12 991486976
etc....
## nsd76 1073741824
## Number of Storage Pools 1
## system 15011648512
etc...
###### Policy Information ######
## /* DEFAULT */
## /* Store data in the first data pool or system pool */
########## Fileset Information #####
## NFS_tctbill1 Linked /ibm/gpfs_tctbill1/NFS_tctbill1 off Comment:
etc...
###### Quota Information ######
## Type Name
## USR root
etc...
- Modify the restore_out_file to match the configuration on the recovery site.
An example portion of the modified NSD stanzas for the restore_out_file is as follows:
%nsd:
  device=/dev/mapper/mpaths
  nsd=nsd47
  servers=nsdServer1,nsdServer2
  usage=dataOnly
  failureGroup=1
  pool=system

%nsd:
  device=/dev/mapper/mpatht
  nsd=nsd48
  servers=nsdServer2,nsdServer1
  usage=dataOnly
  failureGroup=1
  pool=system

%nsd:
  device=/dev/mapper/mpathbz
  nsd=nsd49
  servers=nsdServer1,nsdServer2
  usage=metadataOnly
  failureGroup=1
  pool=system
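To see which device names are available on the recovery site before you edit the stanzas, you can list the local block devices and the NSDs that the cluster already knows about. This is only an optional sketch; the device paths shown above (for example, /dev/mapper/mpaths) are specific to this environment and yours will differ.
# List candidate multipath devices on the recovery-site NSD server
[root@recovery-site-nsd-server-node ~]# ls -l /dev/mapper/
# Show existing NSDs and the local devices behind them, so you do not reuse a device
[root@recovery-site-nsd-server-node ~]# mmlsnsd -X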
- Create recovery-site NSDs if necessary.
- Use the newly modified restore_out_file (powerleBillionRestore_gpfs_tctbill1_02232018_nsd in this example) to create NSDs on the recovery cluster. This command must be run from an NSD server node (if NSD servers are in use):
[root@recovery-site-nsd-server-node ~]# mmcrnsd -F /temp/powerleBillionRestore_gpfs_tctbill1_02232018_nsd
mmcrnsd: Processing disk mapper/mpathq
etc...
mmcrnsd: Processing disk mapper/mpathcb
mmcrnsd: Propagating the cluster configuration data to all affected nodes.
This is an asynchronous process.
- Repeat the mmcrnsd command appropriately for each file system that you want to recover.
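Optionally, you can confirm that the new NSDs were created before you continue; NSDs that are not yet assigned to a file system are listed as free disks. This verification step is not part of the original procedure:
[root@recovery-site-nsd-server-node ~]# mmlsnsd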
- Create recovery-site file systems if necessary.
- Use the same modified restore_out_file (powerleBillionRestore_gpfs_tctbill1_02232018_nsd in this example) as input for the mmcrfs command, which creates the file system. The following example is based on the mmcrfs command that is included in the <restore_out_file> (note that the -Q yes option has been removed):
[root@recovery-site-nsd-server-node ~]# mmcrfs gpfs_tctbill1 -F /temp/powerleBillionRestore_gpfs_tctbill1_02232018_nsd -i 4096 -j scatter -k nfs4 -n 100 -B 4194304 --version 5.0.0.0 -L 33554432 -S relatime -T /ibm/gpfs_tctbill1 --inode-limit 407366656:307619840
- Repeat the mmcrfs command appropriately for each file system that you want to recover.
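Optionally, you can verify that the attributes of the newly created file system match the values recorded in the restore_out_file (block size, mount point, file system version, and so on). This check is not part of the original procedure:
[root@recovery-site-nsd-server-node ~]# mmlsfs gpfs_tctbill1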
- Cloud services configuration restore (download the SOBAR backup from the cloud for the file system).
- Securely transfer the Cloud services configuration file to the desired location by using scp or any other command.
- From the appropriate Cloud services server node on the recovery site (a node from the recovery Cloud services node class), download the SOBAR backup tar file by using the mcstore_sobar_download.sh script. This script is available in the /opt/ibm/MCStore/scripts folder on your Cloud services node.
Note: Make sure that your local_backup_dir is mounted and has sufficient space to accommodate the SOBAR backup file. It is recommended to use a GPFS file system.
Usage: mcstore_sobar_download.sh <tct_config_backup_path> <sharing_container_pairset_name> <node-class-name> <sobar_backup_tar_name> <local_backup_dir>
For example:
[root@recovery-site-tct-node scripts]# ./mcstore_sobar_download.sh /temp/TCT_backupConfig_20180306_123302.tar powerleSOBAR1 TCTNodeClassPowerLE 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar /ibm/gpfs_tct_SOBAR1/
You are about to restore the TCT Configuration settings to the CCR.
Any new settings since the backup was made will be lost.
The TCT servers should be stopped prior to this operation.
Do you want to continue and restore the TCT cluster configuration?
Enter "yes" to continue: yes
mmcloudgateway: Unpacking the backup config tar file...
mmcloudgateway: Completed unpacking the tar file.
Restoring the Files:
[mmcloudgateway.conf - Restored]
[_tctkeystore.jceks - Restored]
[_tctnodeclasspowerle.settings - Restored to version 96]
mmcloudgateway: TCT Config files were restored to the CCR.
mmcloudgateway: Command completed.
mmcloudgateway: Sending the command to node recovery-site-tct-node.
Stopping the Transparent Cloud Tiering service.
mmcloudgateway: The command completed on node recovery-site-tct-node.
mmcloudgateway: Sending the command to node recovery-site-tct-node2.
Stopping the Transparent Cloud Tiering service.
mmcloudgateway: The command completed on node recovery-site-tct-node2.
mmcloudgateway: Command completed.
etc...
mmcloudgateway: Sending the command to node recovery-site-tct-node.
Starting the Transparent Cloud Tiering service...
mmcloudgateway: The command completed on node recovery-site-tct-node.
etc...
Making sure Transparent Service to start on all nodes. Please wait as this will take some time..
Downloading 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar from cloud.
This will take some time based on the size of the backup file. Please wait until download completes..
Download of 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar from cloud completed successfully.
Moving Backup tar 9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar under /ibm/gpfs_tct_SOBAR1/

Note: Before running mcstore_sobar_restore.sh to restore the file system metadata, make sure that the file system to be restored is clean and has never been mounted for write.
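Optionally, before you run the restore script, you can confirm that the downloaded backup tar file is readable and that the file system you are about to restore is not mounted anywhere in read/write mode. This is only a sketch using the names from the example above; it is not part of the original procedure:
# List the contents of the SOBAR backup tar file without extracting it
[root@recovery-site-tct-node ~]# tar -tf /ibm/gpfs_tct_SOBAR1/9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar | head
# Show where (if anywhere) the target file system is currently mounted
[root@recovery-site-tct-node ~]# mmlsmount gpfs_tctbill1 -L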
- File system configuration restore (restore the file system configuration on the recovery site)
Note: If your temporary restore staging space is on a Cloud services managed file system, then you will have to delete and recreate this Cloud services managed file system at this point.
- Restore the policies for each file system by using the mmrestoreconfig command.
Usage: mmrestoreconfig Device -i InputFile --image-restore
For example:
[root@recovery-site-tct-node ~]# mmrestoreconfig gpfs_tctbill1 -i /temp/powerleBillionBack_gpfs_tctbill1_02232018 --image-restore
--------------------------------------------------------
Configuration restore of gpfs_tctbill1 begins at Fri Mar 16 05:48:06 CDT 2018.
--------------------------------------------------------
mmrestoreconfig: Checking disk settings for gpfs_tctbill1:
mmrestoreconfig: Checking the number of storage pools defined for gpfs_tctbill1.
mmrestoreconfig: Checking storage pool names defined for gpfs_tctbill1.
mmrestoreconfig: Checking storage pool size for 'system'.
mmrestoreconfig: Checking filesystem attribute configuration for gpfs_tctbill1:
mmrestoreconfig: Checking fileset configurations for gpfs_tctbill1:
  Fileset NFS_tctbill1 created with id 1 root inode 536870915.
  Fileset NFS_tctbill1_bkg created with id 2 root inode 1073741827.
  Fileset NFS_tctbill1_bkg1 created with id 3 root inode 1610612739.
mmrestoreconfig: Checking policy rule configuration for gpfs_tctbill1:
  Restoring backed up policy file.
  Validated policy 'policyfile.backup':
  Policy `policyfile.backup' installed and broadcast to all nodes.
mmrestoreconfig: Command successfully completed
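As an optional check that is not part of the original procedure, you can confirm that the filesets and the policy were restored before you restore the file system image:
[root@recovery-site-tct-node ~]# mmlsfileset gpfs_tctbill1
[root@recovery-site-tct-node ~]# mmlspolicy gpfs_tctbill1 -L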
- Restore the file system metadata by using the mcstore_sobar_restore.sh script, which is found in the /opt/ibm/MCStore/scripts folder. The mcstore_sobar_restore.sh script does the following:
- Untars the sobar_backup_file
- Stops the Cloud services for the specified node class
- Unmounts the recovery file system and re-mounts read-only
- Restores the recovery file system image
- Re-mounts the recovery file system in read/write
- Enables and restarts Cloud services
- Executes the file curation policy, changing objects from Co-Resident state to Non-Resident state
- Rebuilds the Cloud services database files if you choose to do so
Note: If the cloud directory points to another file system, make sure that the file system is mounted correctly before you run the restore script with the rebuildDB parameter set to yes.
[root@recovery-site-tct-node scripts]# ./mcstore_sobar_restore.sh /ibm/gpfs_tct_SOBAR1/9277128909880390775_gpfs_tctbill1_03-14-18-17-03-01.tar gpfs_tctbill1 TCTNodeClassPowerLE yes /ibm/gpfs_tct_SOBAR1 >> /root/status.txt
etc...
[I] RESTORE:[I] This task restored 1310720 inodes
[I] A total of 307 PDRs from filelist /dev/null have been processed; 0 'skipped' records and/or errors.
[I] Finishing restore with conclude operations.
[I] CONCLUDE:[I] Starting image restore pipeline
[I] A total of 307 files have been migrated, deleted or processed by an EXTERNAL EXEC/script; 0 'skipped' files and/or errors.
Fri Mar 16 17:20:29 CDT 2018: mmumount: Unmounting file systems ...
Fri Mar 16 17:20:33 CDT 2018: mmmount: Mounting file systems ...
etc.....
Running file curation policy and converting co-resident files to Non resident.
This will take some time. Please wait until this completes..
[I] GPFS Current Data Pool Utilization in KB and %
Pool_Name    KB_Occupied    KB_Total       Percent_Occupied
system       327680         12287688704    0.002666734%
[I] 307944153 of 410512384 inodes used: 75.014583%.
[I] Loaded policy rules from /opt/ibm/MCStore/samples/CoresToNonres.sobar.template.
Evaluating policy rules with CURRENT_TIMESTAMP = 2018-03-16@22:24:28 UTC
Parsed 2 policy rules.
etc...
Completed file curation policy execution of converting co-resident files to Non resident files.
running rebuild db for all the tiering containers for the given file system : gpfs_tctbill1
Running rebuild db for container pairset : powerlebill1spill2 and File System: gpfs_tctbill1
mmcloudgateway: Command completed.
Running rebuild db for container pairset : powerlebill1spill1 and File System: gpfs_tctbill1
mmcloudgateway: Command completed.
Running rebuild db for container pairset : powerlebill1 and File System: gpfs_tctbill1
etc...
- Repeat the mcstore_sobar_restore.sh script appropriately for each file system that you want to recover.
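Optionally, you can spot-check a restored file to confirm that it is now in the Non-Resident state. This check is not part of the original procedure, and the file path below is only a placeholder for a file in one of the restored filesets:
[root@recovery-site-tct-node ~]# mmcloudgateway files list /ibm/gpfs_tctbill1/NFS_tctbill1/<some-file>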
- Enable Cloud services maintenance operations on the appropriate node class being restored on the recovery site. For more information, see Configuring the maintenance windows.
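For example, you can review the maintenance windows that are defined for the node class after you enable maintenance operations. This is only a sketch; it assumes that the mmcloudgateway maintenance list subcommand is available in your release, so check the mmcloudgateway command reference for the exact syntax:
[root@recovery-site-tct-node ~]# mmcloudgateway maintenance list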
- Enable all Cloud services migration policies on the recovery site by using the --transparent-recalls {ENABLE} option in the mmcloudgateway containerPairSet update command, as shown in the sketch that follows. For more information, see Binding your file system or fileset to the Cloud service by creating a container pair set.
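A minimal sketch of that update, reusing the node class and container pair set names from the earlier examples; the exact option names can vary by release, so verify them against the mmcloudgateway containerPairSet update syntax for your version:
[root@recovery-site-tct-node ~]# mmcloudgateway containerPairSet update --cloud-nodeclass TCTNodeClassPowerLE --container-pair-set-name powerlebill1 --transparent-recalls ENABLE
Repeat the update for each container pair set that is associated with the recovered file systems.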