Data migration by using AFM migration enhancements

This use case details the recommended process for migrating data from a legacy appliance or an old GPFS system to the latest IBM Spectrum Scale release by using IBM Spectrum Scale active file management (AFM). The migration is required when you replace or upgrade an old hardware setup. This document also provides the steps to migrate data from one file system to another file system that belongs to the same cluster. The process is updated with the latest options that ease and simplify the overall procedure.

Prerequisites

  • The data source or the old hardware can be either an IBM Spectrum Scale cluster or a non-IBM Spectrum Scale setup.
  • The source cluster can export the source path by using either NFS v3 or GPFS multi-cluster configuration, as applicable.
  • The target or the new cluster must be running IBM Spectrum Scale 5.0.4.3 or later.
  • The user ID namespace between the source site and the target site must be configured identically.

Overview

  1. Prepare the old hardware (system) to export the data source. This site is called the home site (old system).
  2. Prepare a new hardware (system) that runs IBM Spectrum Scale AFM. This is called the cache site (new system), and data is migrated from an old system to a new system.
  3. If required, migrate data from a file system to another file system that belongs to the same IBM Spectrum Scale cluster.
  4. Set up the new system, and configure an AFM RO-mode fileset relationship between the old system and the new system.
  5. Migrate data from the old system to the new system recursively by using the latest prefetch options.
  6. Convert the AFM RO-mode fileset to an AFM LU-mode fileset.
  7. Move the application from the old system to the new system (the AFM LU-mode fileset). Take downtime for the application cutover. During this phase, ensure that the old system does not modify the data.
  8. Prefetch the remaining data. If the data is not available at the new system, AFM pulls the data on demand for the application during the final prefetch from the old system.
  9. Prepare downtime for the application, disconnect the old system, and disable the AFM relationship. This step is optional; the AFM relationship can remain in the stopped state until a planned downtime.
Note:
  • The migration process does not migrate file system-specific parameters, such as quotas, snapshots, file system-level tuning parameters, policies, fileset definitions, and encryption keys, from an old system to a new system.
  • From a GPFS data source home, AFM can migrate all the user extended attributes, ACLs, file sparseness, and pre-allocated files.
  • From a non-GPFS data source home, only the POSIX permissions and the ACLs are migrated. Sparseness and preallocation of files are not maintained.
  • AFM migrates the data as root by bypassing the permission checks. Therefore, the no_root_squash option is required while exporting the data source at an old system by using NFS.
  • Prefetch can be run in parallel from multiple gateway nodes for the same AFM fileset.
  • The data migration from one file system to another file system is similar to the data migration from an old system to a new system. However, for the GPFS (NSD) protocol there is a small difference: the multi-cluster setup is not required. An AFM fileset (f1) is created on the target file system by using the GPFS (NSD) protocol and pointed to the source file system (fs2). For the NFS protocol, you must export the source file system path (fs2) from a node and create an AFM fileset on the target file system (fs1).

At home (old system)

For non-GPFS home site

If the home (old system) is a non-GPFS site, configure an NFS export of the data source path (for example, /home/userData) by adding the following line to the /etc/exports file, and then restart the NFS services. Each export entry must have a unique file system ID (fsid).

Example:
  1. Update the /etc/exports file and add the following line:
    /home/userData GatewayIP(rw,nohide,insecure,no_subtree_check,sync,no_root_squash,fsid=101)
  2. Re-export all the directories by issuing the following command:
    # exportfs -ra
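Because each export entry must carry a unique fsid, a quick sanity check before re-exporting can save a failed mount later. The following sketch is illustrative only (not an IBM Spectrum Scale tool); the sample file stands in for the real /etc/exports on the home system.

```shell
# Illustrative check (not an IBM Spectrum Scale tool): verify that every
# NFS export entry uses a unique fsid, as the prerequisites require.
# The sample file stands in for the real /etc/exports on the home system.
cat > /tmp/exports.sample <<'EOF'
/home/userData 192.0.2.10(rw,nohide,insecure,no_subtree_check,sync,no_root_squash,fsid=101)
/home/projData 192.0.2.10(rw,nohide,insecure,no_subtree_check,sync,no_root_squash,fsid=102)
EOF

# Any fsid printed by 'uniq -d' appears more than once and must be changed.
dup=$(grep -o 'fsid=[0-9]*' /tmp/exports.sample | sort | uniq -d)
if [ -z "$dup" ]; then
  echo "all fsid values are unique"
else
  echo "duplicate fsid values: $dup"
fi
```

Run the same pipeline against the real /etc/exports on the home system before you issue exportfs.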
For GPFS home site
  • If the home (old system) is a GPFS site, do the following steps:
    1. Export a fileset that contains the source data by using NFS, or configure a GPFS multi-cluster between the sites. For more information about using the NFS protocol, see the section for a non-GPFS home site.
    2. If data from a file system needs to be migrated to another file system that belongs to the same IBM Spectrum Scale cluster, you can configure an AFM relationship by using either NFS or GPFS protocol as follows:
      • To set up an AFM relationship by using the NFS protocol, export the source file system data path by using NFS on one of the cluster nodes. That is, modify the /etc/exports file. For more information, see the section for a non-GPFS home site.
      • To set up an AFM relationship by using the GPFS (NSD) protocol on the same cluster, you do not need to set up a multi-cluster configuration because there is no remote cluster. In this case, the AFM fileset target is a file system from the local cluster instead of a remote cluster.
    3. If the home (old system) site is running IBM Spectrum Scale 4.1 or later, issue the following command:
      # /usr/lpp/mmfs/bin/mmafmconfig enable /GPFS_PATH/Fileset
    4. If the source node or cluster is running on IBM GPFS 3.4 or 3.5, issue the following command:
      # /usr/lpp/mmfs/bin/mmafmhomeconfig enable /GPFS_PATH/Fileset

Ensure that the NFS exports from the old system are readable at the AFM cache cluster so that the AFM gateway can mount the NFS exports by using NFS v3 and read data from the exports for the migration.

At cache (new system)

Do the following steps at the cache:
  1. Create an AFM RO-mode fileset at the AFM cache cluster by using either NFS or GPFS (NSD) protocol.
    Issue the following commands to create an AFM RO-mode fileset by using the NFS protocol:
    # mmcrfileset fs1 RO-1 -p afmMode=RO,afmTarget=NFS-Server:/home/userData
    --inode-space new [--inode-limit MaxNumInodes[:NumInodesToPreallocate]]
    # mmlinkfileset fs1 RO-1 -J /gpfs/fs1/RO-1
    Issue the following commands to create an AFM RO-mode fileset by using the GPFS protocol:
    # mmcrfileset fs1 RO-1 -p afmMode=RO,afmTarget=gpfs:///remote-fs-path/userData --inode-space new [--inode-limit MaxNumInodes[:NumInodesToPreallocate]]
    # mmlinkfileset fs1 RO-1 -J /gpfs/fs1/RO-1
    In this case, the remote home is a remote-mounted file system.
  2. If the source file system and the destination file system belong to the same cluster, the steps to configure an AFM relationship by using either the NFS or the GPFS protocol are similar.

    In the case of the NFS protocol, the NFS source export node and the gateway node belong to the same cluster, but the source path belongs to a different file system (fs2) in that cluster.

    Example:
    # mmcrfileset fs1 RO-1 -p afmMode=RO,afmTarget=NFS-Server:/gpfs/fs2/userData
    --inode-space new [--inode-limit  MaxNumInodes[:NumInodesToPreallocate]]
    # mmlinkfileset fs1 RO-1 -J /gpfs/fs1/RO-1

    If the source and target file systems are from the same cluster, AFM can still be configured by using the GPFS protocol.

    Example:
    # mmcrfileset fs1 RO-1 -p afmMode=RO,afmTarget=gpfs:///gpfs/fs2/userData
    --inode-space new [--inode-limit MaxNumInodes[:NumInodesToPreallocate]]
    # mmlinkfileset fs1 RO-1 -J /gpfs/fs1/RO-1
    In this example, fs1 is a new file system and fs2 is an old file system from the same cluster.
  3. Disable the afmEnableAutoEviction parameter at the AFM cache cluster to avoid an inadvertent eviction.
    # mmchfileset fs1 RO-1 -p afmEnableAutoEviction=no
  4. To migrate all the data into the AFM RO-mode fileset at the new system, issue the following command:
    # mmafmctl Device prefetch
    The data can be prefetched by using various options with the mmafmctl command such as --directory, --dir-list-file, --list-file, --home-list-file, --home-inode-file. For more information, see mmafmctl command.

    To simplify the migration process, it is recommended to use the --directory or --list-file option with the mmafmctl prefetch command to recursively generate a list of entries and queue them to the gateway node, which migrates the data to the new system.

  5. The whole data set can be outlined as directories, subdirectories, and files, which can then be prefetched recursively so that most of the data is migrated from the home to the cache. To prefetch data from the home to the cache, issue the mmafmctl command with the --directory or --list-file option as follows:
    Note: While generating a list file, remove any occurrence of a root directory entry such as “.” or “..” from the generated list file. These special entries must not be prefetched; otherwise, prefetch marks them as failed files and logs an error in the /var/adm/ras/mmfs.log file.
    1. To find all the subdirectories and files inside the specified directory recursively, use the --directory option. When this option is used, all the subdirectories and files that belong to the directory are queued to the gateway node, which migrates them to the cache.
      Example:
      # mmafmctl fs1 prefetch -j RO-1 --directory /gpfs/fs1/RO-1/DIR1
      A sample output is as follows:
      mmafmctl: Performing prefetching of fileset: RO-1
      Queued(Total)		Failed	TotalData (approx inBytes)
      0(100000)	     	0	            	 4403600
    2. To specify a list of files, use the --list-file option. The list of files can be generated at the old system or the new system by running a find command or the GPFS mmapplypolicy command. If you run either command at the new system, AFM sends readdir operations to the old system and migrates the directory tree structure to the new system; however, it does not migrate file data. When the list file is available, run the following command:
      Example:
      # mmafmctl FileSystem prefetch -j fileset --enable-failed-file-list --list-file List-file-path
      A sample output is as follows:
      mmafmctl: Performing prefetching of fileset: <fileset>
      Queued (Total) 	Failed		 TotalData (approx in Bytes)
      0 (56324)        	0      		0
      5 (56324)        	2      		1353559
      56322 (56324)    	2      		14119335
      These statistics are shown while the command is running. The command exits after the complete prefetch statistics are shown.
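The Note above warns against leaving “.” or “..” entries in a generated list file. A minimal sketch of building a clean list with find is shown below; the paths and sample data are illustrative only, and whether list-file entries are given as home paths or cache paths depends on the prefetch option you use (see the mmafmctl command). Using -mindepth 1 keeps the starting directory itself out of the list, and find never emits “..”.

```shell
# Illustrative sketch: build a list file for 'mmafmctl ... prefetch
# --list-file'. The sample data below is created only for demonstration;
# point SRC at the real data path instead.
SRC=/tmp/home-view/userData
mkdir -p "$SRC/dir1"
touch "$SRC/file1" "$SRC/dir1/file2"

# -mindepth 1 skips the root directory entry ('.') that must not be
# prefetched, as the Note above requires.
find "$SRC" -mindepth 1 > /tmp/prefetch.list
cat /tmp/prefetch.list
```

The resulting /tmp/prefetch.list can then be passed to the prefetch command, for example: # mmafmctl fs1 prefetch -j RO-1 --list-file /tmp/prefetch.list --enable-failed-file-list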
  6. Specify the --enable-failed-file-list option to generate a list of all files that failed and were not prefetched at the new system during this operation. This option helps when some files were not prefetched because of an error such as a network disconnect or an intermittent failure. You can retry prefetching only the failed files by using the failed-file list, which is generated internally.
    The files from an old system are prefetched in the following two phases:
    • Phase 1: AFM first collects the information of all files that need to be prefetched and queues them on the gateway node.
    • Phase 2: When the files are queued on the gateway node, the gateway node runs the prefetch from the old system to the new system.
    The failed-file list is generated only for files that were successfully queued to the gateway node but failed during the prefetch to the new system, that is, in Phase 2. The failed-file list is not generated during the queuing phase, that is, Phase 1. AFM collects the failed-file list under /tmp and prints its path when the remaining files are queued.
    Example:
    # mmafmctl fs1 prefetch -j RO-1 --list-file /home/list-file --enable-failed-file-list
    # mmafmctl fs1 prefetch -j RO-1 --directory /gpfs/fs1/RO-1/DIR1 --enable-failed-file-list
  7. Prefetch the failed files by using the --retry-failed-file-list option.
    During the prefetch operation, if any file failed to prefetch from the old system, the failed file entry is added to a special file. This special file is created under the AFM fileset, for example, /gpfs/fs1/RO-1/.afm/.prefetchedfailed. You can retry the prefetch operation to prefetch only the failed files by using the following command:
    # mmafmctl fs1 prefetch -j RO-1 --retry-failed-file-list
  8. If the list file is generated by running the GPFS mmapplypolicy command, specify the --policy option with the mmafmctl command so that escape sequences in path names, such as '\' written as '\\' or a newline written as '\n', are handled correctly. If this option is specified, it is assumed that the input file list already contains escaped path names. The path of each file is unescaped before the file is queued to the gateway node for the prefetch operation.
    # mmafmctl fs1 prefetch -j RO-1 --list-file List-file-path --enable-failed-file-list --policy
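A list file suitable for the --policy option can be produced with a simple mmapplypolicy LIST rule. The fragment below is a sketch; the rule name, list name, and file names are arbitrary placeholders, not IBM-supplied values.

```
/* Sketch of a policy file (rule and list names are arbitrary) that
   lists every file in the source file system for prefetch. */
RULE EXTERNAL LIST 'migrate' EXEC ''
RULE 'listAll' LIST 'migrate'
```

Running, for example, # mmapplypolicy fs2 -P policy.rules -f /tmp/prefetch -I defer writes the matched files to /tmp/prefetch.list.migrate. Because mmapplypolicy escapes special characters in path names, pass this file to mmafmctl prefetch with --list-file together with --policy.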
  9. Check the prefetch status and ensure that all the prefetch requests are completed and no request is pending in the queue before you go to the next step. To check the status of the prefetch command, issue the following command:
    # mmafmctl fs1 prefetch -j RO-1
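Confirming that nothing is pending or failed can be scripted against the statistics lines shown in the sample outputs above. The sketch below is illustrative: the here-document stands in for the real command output, so in practice you would pipe the output of mmafmctl fs1 prefetch -j RO-1 into the awk filter instead.

```shell
# Illustrative check (output format as in the samples above): read the
# statistics printed by the prefetch command and report the final
# queued/failed counts. The here-document replaces the real output.
cat > /tmp/prefetch.out <<'EOF'
mmafmctl: Performing prefetching of fileset: RO-1
Queued (Total)  Failed  TotalData (approx in Bytes)
0 (56324)       0       0
5 (56324)       2       1353559
56322 (56324)   2       14119335
EOF

# Keep the last statistics line; a nonzero 'failed' count means some
# files must be retried with --retry-failed-file-list.
awk '/^[0-9]+ \(/ {
        queued = $1; gsub(/[()]/, "", $2); total = $2; failed = $3
     }
     END { printf "queued=%s total=%s failed=%s\n", queued, total, failed }' \
    /tmp/prefetch.out
```

In this sample, the final line reports 2 failed files, so a retry run would be needed before the fileset mode conversion.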
  10. After most of the data is migrated to the new system, prepare the AFM fileset for the fileset mode conversion at the new system. Before the conversion, check the prefetch status and ensure that all operations are completed successfully. The conversion to the AFM LU mode makes the fileset locally writable, which means that data written to the AFM LU-mode fileset is not synced back to the old system. The AFM fileset must be readable and writable because, after the application is moved to the AFM LU-mode fileset, the application must be able to modify the data; data modification is not possible in the AFM RO-mode fileset. The data in the AFM LU mode becomes read/write, but changes are not queued to the old system. New data is available only at the LU-mode fileset, whereas the remaining old data can still be prefetched. The conversion of the AFM RO-mode fileset requires unlinking and relinking of the AFM fileset. For a short time, the data is not available while the AFM fileset is unlinked. When the fileset mode is converted and the fileset is relinked, the data is available to the application again. Therefore, the downtime is minimal.
    Example:
    1. Unlink the AFM RO-mode fileset by issuing the following command:
      # mmunlinkfileset fs1 RO-1 -f
    2. Convert the AFM RO-mode fileset into the AFM LU-mode fileset by issuing the following command:
      # mmchfileset fs1 RO-1 -p afmMode=lu
    3. Relink the fileset by issuing the following command:
      # mmlinkfileset fs1 RO-1 -J /gpfs/fs1/RO-1
    4. Confirm the conversion from the RO mode to the LU mode by issuing the following command. Check the mode that is listed in the Mode row.
      # mmlsfileset fs1 RO-1 -L --afm
      A sample output is as follows:
      Filesets in file system 'fs1':
      Attributes for fileset RO-1:
      =============================
      Status                                  Linked
      Path                                    /gpfs/fs1/RO-1
      Id                                      17
      Root inode                              5767171
      Parent Id                               0
      Created                                 Sat Feb 29 11:44:20 2020
      Comment
      Inode space                             11
      Maximum number of inodes                100096
      Allocated inodes                        100096
      Permission change flag                  chmodAndSetacl
      afm-associated                          Yes
      Target                                  nfs://<Home>/gpfs/fs1/RO-1
      Mode                                    local-updates
      File Lookup Refresh Interval            30 (default)
      File Open Refresh Interval              60 (default)
      Dir Lookup Refresh Interval             60 (default)
      Dir Open Refresh Interval               60 (default)
      Async Delay                             disable
      Last pSnapId                            0
      Display Home Snapshots                  yes (default)
      Number of Gateway Flush Threads         4
      Prefetch Threshold                      0 (default)
      Eviction Enabled                        yes (default)
      IO Flags                                41984 (refreshOnce,readdirOnce)
      
  11. After the AFM RO-mode fileset conversion, set the following parameters on the AFM LU-mode fileset.
    During the migration, if the old system has many files and some of them still need to be migrated to the new system, the lookup and readdir operations might take more time to complete. AFM has the following tunable parameters that reduce the migration time. Set these parameters on the fileset at the new system before the application is moved from the old system.
    afmCheckRefreshDisable
    This parameter controls revalidation of the file entries inside a directory with the old system. For the AFM LU mode, if a file is dirty, it does not need to be revalidated with the home. Set this parameter to ‘no’. This parameter is set at the cluster level.

    To disable this parameter at the cluster level, issue the following command:

    # mmchconfig afmCheckRefreshDisable=no -i
    afmRefreshOnce
    After the cutover, when the application is moved to the new system (a later step), it is expected that the home is not modified. This parameter enables revalidation with the old system only a single time, which improves the application performance. This parameter is set on an AFM fileset.
    To set this parameter on an AFM fileset, issue the following command:
    # mmchfileset device fileset -p afmRefreshOnce=yes
    afmReaddirOnce
    After the cutover, it is expected that the home is not modified. This parameter enables performing readdir of the directory entries only a single time, which improves the application performance. This parameter is set on an AFM fileset.
    To set this parameter on an AFM fileset, issue the following command:
    # mmchfileset device fileset -p afmReaddirOnce=yes

    The data that is required to run the application is now migrated to the new system (for more information, see step 9). Therefore, the application can be moved from the old system to the new system and its operations can be restarted. At this point, the prefetch operation has migrated most of the data from the old system.

  12. If not all the data is migrated to the new system yet, rerun the prefetch commands to migrate the remaining data from the old system to the new system. In some cases, data might have been created recently at the old system. This data must also be prefetched from the old system.
    1. Restart the prefetch operation to bring all the remaining data from the old system. For more information, see steps 5 and 9.
    2. After all the data is migrated to the new system, you can stop the migration and break the AFM relationship.
    Note: If the remaining old data from the old system is not prefetched, AFM prefetches the old data from the home on demand during the final prefetch operation.
  13. At this stage, all the data from the old system must be migrated to the new system (the AFM cache site). Do the following steps to check whether all the data is migrated to the new system and to prefetch the remaining data:
    1. Issue the following command to check whether any data is not yet migrated to the new system:
      # mmafmctl fs1 checkUncached -j RO-1
      If any data is still not migrated to the new system, this command generates list files that can be used to run the prefetch command. A sample output is as follows:
      Verifying if all the data is cached. This may take a while...
      mmchfileset: [E] Uncached files present, run prefetch first
      Directories list file: /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/dir-file.mmchfileset.3241
      Orphans list file: /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/orphan-file.mmchfileset.3241
      
    2. To prefetch the remaining data by using the generated list files, issue the following commands:
      # mmafmctl device prefetch -j RO-1 --dir-list-file /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/dir-file.mmchfileset.3241
      
      # mmafmctl device prefetch -j RO-1 --list-file /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/orphan-file.mmchfileset.3241
      
    3. To check the prefetch status, issue the following command:
      # mmafmctl device prefetch -j RO-1
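The two list-file paths printed by checkUncached can be captured in a script and fed straight to the prefetch commands above. This sketch is illustrative: the here-document stands in for the real checkUncached output, and on a live system you would capture the command's output instead.

```shell
# Illustrative parsing (output format as in the sample above): extract
# the directories and orphans list-file paths printed by
# 'mmafmctl fs1 checkUncached -j RO-1'. The here-document replaces the
# real command output for demonstration.
cat > /tmp/checkUncached.out <<'EOF'
Verifying if all the data is cached. This may take a while...
mmchfileset: [E] Uncached files present, run prefetch first
Directories list file: /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/dir-file.mmchfileset.3241
Orphans list file: /var/mmfs/tmp/cmdTmpDir.mmchfileset.3241/orphan-file.mmchfileset.3241
EOF

dirlist=$(sed -n 's/^Directories list file: //p' /tmp/checkUncached.out)
orphanlist=$(sed -n 's/^Orphans list file: //p' /tmp/checkUncached.out)
echo "dir-list:    $dirlist"
echo "orphan-list: $orphanlist"

# On a real system, the extracted paths would then be used as follows:
# mmafmctl fs1 prefetch -j RO-1 --dir-list-file "$dirlist"
# mmafmctl fs1 prefetch -j RO-1 --list-file "$orphanlist"
```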
  14. Plan downtime to stop the application. During the downtime, the application that uses the AFM fileset data stops for some time. You can now disable the AFM relationship of the fileset and link the fileset back. After AFM is disabled, the fileset becomes a GPFS independent fileset; all data is available locally, and the fileset no longer has any connection to the home. To disable the AFM relationship, do the following steps:
    1. Unlink the fileset by issuing the following command:
      # mmunlinkfileset fs1 RO-1 -f
    2. Disable the AFM relationship by issuing the following command:
      # mmchfileset fs1 RO-1 -p afmTarget=disable
    3. Relink the fileset, which is now used as a GPFS independent fileset, by issuing the following command:
      # mmlinkfileset fs1 RO-1 -J FS_PATH/AFM-fileset
    For more information about disabling the AFM relationship, see Disabling AFM.
  15. The following examples show the prefetch operation with additional features that ease the migration process.
    1. Run a prefetch operation on a non-default gateway. You can run the prefetch operation on a selected gateway instead of the default assigned gateway. This option also enables you to run multiple prefetch operations that belong to the same AFM fileset.
      # mmafmctl fs1 prefetch -j fileset1 --directory /gpfs/fs1/fileset1/dir1 --gateway NewGateway1
    2. To display the statistics of a prefetch operation that runs on a specific gateway, specify the --gateway option.
      # mmafmctl fs1 prefetch -j fileset1 --gateway NewGateway1
    3. Run the prefetch operation by using a user-defined number of prefetch threads. Increasing the number of prefetch threads can improve performance. Set this option based on your setup and configuration.
      # mmafmctl fs prefetch -j fileset --list-file listfile_path --prefetch-threads 8
    4. Run the prefetch operation on an AFM fileset to perform the readdir operation after the cutover. The prefetch operation performs readdir of a directory only a single time at the old system and brings the directory contents to the new system if a modification took place after the cutover.
      # mmafmctl fs prefetch -j fileset --list-file listfile_path --readdir-only
    5. To forcefully prefetch data into a dirty directory at the new system after the cutover, issue the following command. A directory at the AFM LU-mode fileset becomes dirty when data inside it is changed at the new system. When data inside an already migrated directory is modified at the new system, AFM marks this directory as a ‘dirty’ directory at the AFM LU-mode fileset. When the ‘dirty’ flag is set on a directory at the new system, the local data in the AFM LU-mode fileset is never revalidated with the old system. The --force option allows prefetching data that might be modified at the old system, even if the ‘dirty’ flag is set on the directory at the AFM LU-mode fileset.
      Example:
      # mmafmctl fs prefetch -j fileset --list-file listfile_path --force
  16. If the old system is a non-IBM Spectrum Scale site, ACLs are not maintained at the new system. If ACLs can be obtained from the old system, they can be applied to the AFM fileset root path at the new system by using an input file in the NFSv4 ACL format. To apply ACLs to the AFM fileset root path, issue the following commands:
    # mmafmlocal mmputacl -i inputFile /gpfs/fs1/RO-1
    # mmchfileset fs1 RO-1 -p afmSkipHomeAcl=yes