Creating new primary-secondary filesets, failover and failback to old primary
This use case describes creating new primary-secondary filesets, checking RPO snapshot creation after RPO timeout, failing over to secondary, and failing back to old primary.
You might choose to run md5sum or any other third-party utility to check consistency of the migrated files.
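As a minimal sketch of such a consistency check, the hypothetical helper below checksums every regular file under two directory trees and compares the results. The name `compare_trees` is illustrative and not part of GPFS. The sketch assumes both filesets are reachable from the node that runs it, for example through the remote mount that the `afmtarget` path in this use case points to.

```shell
# compare_trees DIR_A DIR_B
# Hypothetical helper: succeeds only if both trees contain the same relative
# file names with the same md5 checksums. Returns 2 if a directory is unusable.
compare_trees() {
    ct_a=$(cd "$1" && find . -type f -exec md5sum {} + | sort -k 2) || return 2
    ct_b=$(cd "$2" && find . -type f -exec md5sum {} + | sort -k 2) || return 2
    [ "$ct_a" = "$ct_b" ]
}

# Example call on the primary cluster (paths taken from this use case):
#   compare_trees /fs1/drp12 /gpfs/remotefs1/drs12 && echo "filesets match"
```

Run the comparison only after the replication queue has drained, otherwise files that are still in flight will be reported as differences.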
- Create primary using the mmcrfileset command:
mmcrfileset fs1 drp12 --inode-space=new -p afmtarget=gpfs:///gpfs/remotefs1/drs12 -p afmmode=primary --inode-limit=1024000 -p afmAsyncDelay=15 -p afmRPO=15
Fileset drp12 created with id 23 root inode 7864323.
Primary Id (afmPrimaryID) 14228043454022319638-C0A8037555D2994D-23
Primary:
mmlsfileset fs1 drp12 -L --afm
Filesets in file system 'fs1':
Attributes for fileset drp12:
==============================
Status                                  Unlinked
Path                                    --
Id                                      23
Root inode                              7864323
Parent Id                               --
Created                                 Tue Aug 25 01:16:06 2015
Comment
Inode space                             15
Maximum number of inodes                1024000
Allocated inodes                        500736
Permission change flag                  chmodAndSetacl
afm-associated                          Yes
Target                                  gpfs:///gpfs/remotefs1/drs12
Mode                                    primary
Async Delay                             15
Recovery Point Objective                15 minutes
Last pSnapId                            0
Number of Gateway Flush Threads         4
Primary Id                              14228043454022319638-C0A8037555D2994D-23
- Create secondary on the secondary cluster:
/usr/lpp/mmfs/bin/mmcrfileset fs1 drs12 --inode-space=new --inode-limit=1024000 -p afmMode=secondary -p afmPrimaryID=14228043454022319638-C0A8037555D2994D-23
Fileset drs12 created with id 43 root inode 11010051.
/usr/lpp/mmfs/bin/mmlinkfileset fs1 drs12 -J /fs1/drs12
Fileset drs12 linked at /fs1/drs12
/usr/lpp/mmfs/bin/mmlsfileset fs1 drs12 -L --afm
Filesets in file system 'fs1':
Attributes for fileset drs12:
==============================
Status                                  Linked
Path                                    /fs1/drs12
Id                                      43
Root inode                              11010051
Parent Id                               0
Created                                 Tue Aug 25 01:26:51 2015
Comment
Inode space                             21
Maximum number of inodes                1024000
Allocated inodes                        501504
Permission change flag                  chmodAndSetacl
afm-associated                          Yes
Associated Primary ID                   14228043454022319638-C0A8037555D2994D-23
Mode                                    secondary
Last pSnapId                            0
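The secondary is tied to the primary only through the ID string, so a single mistyped character in afmPrimaryID breaks the relationship. The following sketch shows one way to pull the ID out of saved mmlsfileset output so that the two sides can be compared. The function name `afm_primary_id` is hypothetical, and the sketch assumes the "Primary Id" and "Associated Primary ID" lines look like the ones shown in this use case.

```shell
# afm_primary_id: read mmlsfileset output on stdin and print the last field of
# the first line that mentions "Primary Id" (matched case-insensitively, so
# "Associated Primary ID" on the secondary also matches).
afm_primary_id() {
    awk 'tolower($0) ~ /primary id/ { print $NF; exit }'
}

# Example comparison, with each command run on its own cluster:
#   p=$(mmlsfileset fs1 drp12 -L --afm | afm_primary_id)   # primary
#   s=$(mmlsfileset fs1 drs12 -L --afm | afm_primary_id)   # secondary
#   [ "$p" = "$s" ] || echo "primary ID mismatch: $p vs $s"
```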
- Link primary to create psnap0:
mmlinkfileset fs1 drp12 -J /fs1/drp12
Fileset drp12 linked at /fs1/drp12
First snapshot name is psnap0-rpo-C0A8037555D2994D-23
Flushing dirty data for snapshot drp12::psnap0-rpo-C0A8037555D2994D-23...
Quiescing all file system operations.
Snapshot drp12::psnap0-rpo-C0A8037555D2994D-23 created with id 36.
Primary State:
mmafmctl fs1 getstate -j drp12
Fileset Name    Fileset Target                  Cache State     Gateway Node    Queue Length    Queue numExec
------------    --------------                  -------------   ------------    ------------    -------------
drp12           gpfs:///gpfs/remotefs1/drs12    Active          c3m3n06         0               2
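When this check is scripted, the cache state and queue length are the two values worth extracting from the getstate table. The sketch below is one way to do that. The function name `afm_cache_state` is hypothetical, and it assumes the six-column layout shown in this use case, with a non-blank target and a single-word cache state.

```shell
# afm_cache_state FILESET: read `mmafmctl ... getstate` output on stdin and
# print "STATE QUEUE_LENGTH" for the named fileset. Exits non-zero if the
# fileset row is not found.
afm_cache_state() {
    awk -v name="$1" '
        $1 == name && NF >= 6 { print $3, $5; found = 1 }
        END { exit !found }
    '
}

# Example:
#   mmafmctl fs1 getstate -j drp12 | afm_cache_state drp12
```

A state of Active with a queue length of 0 indicates that nothing is waiting to be sent to the secondary at that moment.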
- Create data from primary and see that it goes to secondary:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_1
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 24599.12 Kbytes/sec, thread utilization 0.979
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_2
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 25954.27 Kbytes/sec, thread utilization 0.999
Primary contents:
ls -l /fs1/drp12
total 204800
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
Secondary contents:
ls -l /fs1/drs12
total 409600
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Check psnap after the time set as RPO interval passes:
mmlssnapshot fs1 -j drp12
Snapshots in file system fs1:
Directory                                          SnapId    Status    Created                     Fileset
psnap0-rpo-C0A8037555D2994D-23                     36        Valid     Tue Aug 25 01:27:45 2015    drp12
psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51    37        Valid     Tue Aug 25 01:41:51 2015    drp12
Create more data at primary:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_3
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 21322.77 Kbytes/sec, thread utilization 0.978
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_4
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 21617.21 Kbytes/sec, thread utilization 1.000
Primary State:
mmafmctl fs1 getstate -j drp12
Fileset Name    Fileset Target                  Cache State     Gateway Node    Queue Length    Queue numExec
------------    --------------                  -------------   ------------    ------------    -------------
drp12           gpfs:///gpfs/remotefs1/drs12    Dirty           c3m3n06         4               40963
Note: The second RPO snapshot is not triggered.
- Unlink primary feigning primary going down:
/usr/lpp/mmfs/bin/mmunlinkfileset fs1 drp12 -f
Fileset drp12 unlinked.
- Failover - convert secondary to acting primary. Run on the secondary.
/usr/lpp/mmfs/bin/mmafmctl fs1 failoverToSecondary -j drs12 --restore
mmafmctl: failoverToSecondary restoring from psnap psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51
[2015-08-25 01:48:07] Restoring fileset "drs12" from snapshot "psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51" of filesystem "/dev/f1"
[2015-08-25 01:48:09] Scanning inodes, phase 1 ...
[2015-08-25 01:48:25] 11511552 inodes have been scanned, 50% of total.
[2015-08-25 01:48:41] 23023104 inodes have been scanned, 100% of total.
[2015-08-25 01:48:41] Constructing operation list, phase 2 ...
[2015-08-25 01:48:41] 0 operations have been added to list.
[2015-08-25 01:48:41] 2 operations have been added to list.
[2015-08-25 01:48:41] Deleting the newly created files, phase 3 ...
[2015-08-25 01:48:42] Deleting the newly created hard links, phase 4 ...
[2015-08-25 01:48:43] Splitting clone files, phase 5 ...
[2015-08-25 01:48:43] Deleting the newly created clone files, phase 6 ...
[2015-08-25 01:48:44] Moving files, phase 7 ...
[2015-08-25 01:48:45] Reconstructing directory tree, phase 8 ...
[2015-08-25 01:48:46] Moving files back to their correct positions, phase 9 ...
[2015-08-25 01:48:46] Re-creating the deleted files, phase 10 ...
[2015-08-25 01:48:47] Re-creating the deleted clone parent files, phase 11 ...
[2015-08-25 01:48:48] Re-creating the deleted clone child files, phase 12 ...
[2015-08-25 01:48:49] Re-creating the deleted hard links, phase 13 ...
[2015-08-25 01:48:50] Restoring the deltas of changed files, phase 14 ...
[2015-08-25 01:48:50] Restoring the attributes of files, phase 15 ...
[2015-08-25 01:48:51] Restore completed successfully.
[2015-08-25 01:48:51] Clean up.
Primary Id (afmPrimaryID) 5802564250705647455-C0A8286F55B0E3EE-43
Fileset drs12 changed.
Promoted fileset drs12 to Primary
Acting primary (Please note that target is blank in mmlsfileset output):
/usr/lpp/mmfs/bin/mmlsfileset fs1 drs12 -L --afm
Filesets in file system 'fs1':
Attributes for fileset drs12:
==============================
Status                                  Linked
Path                                    /fs1/drs12
Id                                      43
Root inode                              11010051
Parent Id                               0
Created                                 Tue Aug 25 01:26:51 2015
Comment
Inode space                             21
Maximum number of inodes                1024000
Allocated inodes                        501504
Permission change flag                  chmodAndSetacl
afm-associated                          Yes
Target                                  --
Mode                                    primary
Async Delay                             15 (default)
Recovery Point Objective                disable
Last pSnapId                            0
Number of Gateway Flush Threads         4
Primary Id                              5802564250705647455-C0A8286F55B0E3EE-43
Acting primary contents after failover:
ls -l /fs1/drs12
total 409600
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
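The failover restores from psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51, and the listing afterwards contains only file_pri_1 and file_pri_2, so anything written on the primary after that snapshot is absent from the acting primary. To see which point in time a given RPO snapshot represents, its name can be decoded as in the sketch below. The function name `psnap_time` is hypothetical. The assumption that the last six dash-separated fields are YY-MM-DD-HH-MM-SS is inferred from the names and log timestamps in this use case and is not a documented format.

```shell
# psnap_time NAME: print "20YY-MM-DD HH:MM:SS" decoded from the last six
# dash-separated fields of an RPO snapshot name (assumed layout). Names with
# fewer than six fields, such as psnap0-rpo-<id>-<n>, print nothing.
psnap_time() {
    printf '%s\n' "$1" | awk -F- 'NF >= 6 {
        printf "20%s-%s-%s %s:%s:%s\n", $(NF-5), $(NF-4), $(NF-3), $(NF-2), $(NF-1), $NF
    }'
}

# Example:
#   psnap_time psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51
```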
- Create sample data from acting primary:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drs12/file_actingpri_1
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 48282.25 Kbytes/sec, thread utilization 0.991
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drs12/file_actingpri_2
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 58104.41 Kbytes/sec, thread utilization 0.999
Contents from acting primary:
ls -l /fs1/drs12/
total 812544
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Link the old primary when it is back:
mmlinkfileset fs1 drp12 -J /fs1/drp12
Fileset drp12 linked at /fs1/drp12
- Run failback start on old primary:
mmafmctl fs1 failbackToPrimary -j drp12 --start
Fileset drp12 changed.
mmafmctl: failbackToPrimary restoring from psnap psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51
[2015-08-25 01:50:52] Restoring fileset "drp12" from snapshot "psnap-rpo-C0A8037555D2994D-23-15-08-25-01-41-51" of filesystem "/dev/f1"
[2015-08-25 01:50:54] Scanning inodes, phase 1 ...
[2015-08-25 01:51:03] 8365056 inodes have been scanned, 50% of total.
[2015-08-25 01:51:12] 16730112 inodes have been scanned, 100% of total.
[2015-08-25 01:51:12] Constructing operation list, phase 2 ...
[2015-08-25 01:51:12] 0 operations have been added to list.
[2015-08-25 01:51:12] 2 operations have been added to list.
[2015-08-25 01:51:12] Deleting the newly created files, phase 3 ...
[2015-08-25 01:51:13] Deleting the newly created hard links, phase 4 ...
[2015-08-25 01:51:13] Splitting clone files, phase 5 ...
[2015-08-25 01:51:14] Deleting the newly created clone files, phase 6 ...
[2015-08-25 01:51:15] Moving files, phase 7 ...
[2015-08-25 01:51:16] Reconstructing directory tree, phase 8 ...
[2015-08-25 01:51:16] Moving files back to their correct positions, phase 9 ...
[2015-08-25 01:51:17] Re-creating the deleted files, phase 10 ...
[2015-08-25 01:51:18] Re-creating the deleted clone parent files, phase 11 ...
[2015-08-25 01:51:18] Re-creating the deleted clone child files, phase 12 ...
[2015-08-25 01:51:19] Re-creating the deleted hard links, phase 13 ...
[2015-08-25 01:51:20] Restoring the deltas of changed files, phase 14 ...
[2015-08-25 01:51:21] Restoring the attributes of files, phase 15 ...
[2015-08-25 01:51:21] Restore completed successfully.
[2015-08-25 01:51:21] Clean up.
Primary contents:
ls -l /fs1/drp12
total 204800
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Create more data on acting primary:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drs12/file_actingpri_3
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 51110.26 Kbytes/sec, thread utilization 0.996
- Run applyUpdates to sync up primary to acting primary:
mmafmctl fs1 applyUpdates -j drp12
[2015-08-25 01:51:39] Getting the list of updates from the acting Primary...
[2015-08-25 01:52:21] Applying the 9 updates...
[2015-08-25 01:52:25] 9 updates have been applied, 100% of total.
mmafmctl: Creating the failback psnap locally. failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38
Flushing dirty data for snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38...
Quiescing all file system operations.
Snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38 created with id 38.
Primary contents:
ls -l /fs1/drp12
total 512000
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
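Each applyUpdates run reports how many updates it pulled from the acting primary, which is a useful number to watch when deciding whether another pass is needed before stopping applications. The sketch below extracts that count from a saved log. The function name `updates_applied` is hypothetical, and the pattern assumes the "N updates have been applied" wording shown in this use case.

```shell
# updates_applied: read applyUpdates output on stdin and print the update
# count from the last "N updates have been applied" progress line.
updates_applied() {
    sed -n 's/^\[[^]]*\] \([0-9][0-9]*\) updates have been applied.*/\1/p' | tail -n 1
}

# Example:
#   mmafmctl fs1 applyUpdates -j drp12 | updates_applied
```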
- Create more data on acting primary to show applications continue and then applications stop:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drs12/file_actingpri_4
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 54504.44 Kbytes/sec, thread utilization 0.991
Acting primary contents:
ls -l /fs1/drs12
total 1227264
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Run last applyUpdates on old primary after applications stop:
mmafmctl fs1 applyUpdates -j drp12
[2015-08-25 01:52:43] Getting the list of updates from the acting Primary...
[2015-08-25 01:53:25] Applying the 3 updates...
[2015-08-25 01:53:26] 3 updates have been applied, 100% of total.
mmafmctl: Creating the failback psnap locally. failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-52-42
Flushing dirty data for snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-52-42...
Quiescing all file system operations.
Snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-52-42 created with id 39.
mmafmctl: Deleting the old failback psnap. failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38
Invalidating snapshot files in drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38...
Deleting files in snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38...
100.00 % complete on Tue Aug 25 01:53:26 2015 ( 500736 inodes with total 0 MB data processed)
Invalidating snapshot files in drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38/F/...
Delete snapshot drp12::failback-psnap-rpo-C0A8037555D2994D-23-15-08-25-01-51-38 complete, err = 0
Primary contents:
ls -l /fs1/drp12
total 614400
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Complete failback process on old primary:
mmafmctl fs1 failbackToPrimary -j drp12 --stop
Fileset drp12 changed.
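The failback steps on the old primary follow a fixed order: link, start, one or more applyUpdates passes, then stop. The sketch below prints that order as a dry-run checklist and does not execute anything. The function name `failback_plan` and its three arguments are hypothetical. The commands and options it prints are the ones used in this use case.

```shell
# failback_plan FS FILESET JUNCTION: print (do not run) the commands issued on
# the old primary, in the order used in this use case.
failback_plan() {
    fp_fs=$1
    fp_fileset=$2
    fp_junction=$3
    cat <<EOF
mmlinkfileset $fp_fs $fp_fileset -J $fp_junction
mmafmctl $fp_fs failbackToPrimary -j $fp_fileset --start
mmafmctl $fp_fs applyUpdates -j $fp_fileset
mmafmctl $fp_fs failbackToPrimary -j $fp_fileset --stop
EOF
}

# Example:
#   failback_plan fs1 drp12 /fs1/drp12
```

The applyUpdates line stands for as many passes as are needed. In this use case it is run twice, with the last pass made after applications on the acting primary have stopped.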
- Convert the acting primary back to secondary and establish the relationship again:
/usr/lpp/mmfs/bin/mmunlinkfileset fs1 drs12 -f
Fileset drs12 unlinked.
/usr/lpp/mmfs/bin/mmchfileset fs1 drs12 -p afmmode=drs,afmPrimaryID=14228043454022319638-C0A8037555D2994D-23
Fileset drs12 changed.
/usr/lpp/mmfs/bin/mmlinkfileset fs1 drs12 -J /fs1/drs12
Fileset drs12 linked at /fs1/drs12
Primary contents:
ls -l /fs1/drp12
total 614400
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
Secondary contents:
ls -l /fs1/drs12
total 1228800
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
- Create data from failed back primary:
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_5
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 21235.12 Kbytes/sec, thread utilization 0.985
/usr/lpp/mmfs/samples/perf/gpfsperf create seq /fs1/drp12/file_pri_6
recSize 10K nBytes 100M fileSize 100M
nProcesses 1 nThreadsPerProcess 1
file cache flushed before test
not using direct I/O
offsets accessed will cycle through the same file segment
not using shared memory buffer
not releasing byte-range token after open
no fsync at end of test
Data rate was 22658.18 Kbytes/sec, thread utilization 1.000
Primary contents:
ls -l /fs1/drp12
total 819200
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:55 file_pri_5
-rw-r--r-- 1 root root 104857600 Aug 25 01:55 file_pri_6
Secondary contents:
ls -l /fs1/drs12
total 1638400
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:49 file_actingpri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:51 file_actingpri_3
-rw-r--r-- 1 root root 104857600 Aug 25 01:52 file_actingpri_4
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_1
-rw-r--r-- 1 root root 104857600 Aug 25 01:29 file_pri_2
-rw-r--r-- 1 root root 104857600 Aug 25 01:55 file_pri_5
-rw-r--r-- 1 root root 104857600 Aug 25 01:55 file_pri_6
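In the listings in this use case, the "total" line differs between primary and secondary even when the file names and sizes are identical, so a plain diff of the two `ls -l` outputs would report a false difference. The sketch below reduces a listing to sorted name and size pairs so that the two sides can be compared directly. The function name `listing_names_sizes` is hypothetical, and it assumes file names without spaces, as in this use case.

```shell
# listing_names_sizes: read `ls -l` output on stdin and print sorted
# "NAME SIZE" pairs, skipping the "total" line.
listing_names_sizes() {
    awk '$1 != "total" && NF >= 9 { print $NF, $5 }' | sort
}

# Example, with each listing captured on its own cluster:
#   ls -l /fs1/drp12 | listing_names_sizes > primary.txt     # primary
#   ls -l /fs1/drs12 | listing_names_sizes > secondary.txt   # secondary
#   cmp primary.txt secondary.txt && echo "listings match"
```

This compares names and sizes only. It does not replace a content check such as md5sum.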