Configuring AFM to cloud object storage

IBM Storage Scale can use AFM to cloud object storage for files and objects. An IBM Storage Scale cluster can be configured to create AFM to cloud object storage filesets that connect to a cloud object storage such as Amazon S3, IBM Cloud® Object Storage, Microsoft Azure Blob, or services that use S3 APIs.

Ensure that the following conditions are met before you configure the AFM to cloud object storage:
  • IBM Storage Scale cluster is up and running with IBM Storage Scale 5.1.0 or later. For more information, see Steps for establishing and starting your IBM Storage Scale cluster and GPFS cluster creation considerations.
  • Gateway nodes are provided in the IBM Storage Scale cluster to manage the AFM to cloud object storage replication.
  • A cloud object storage is available with endpoints, the required storage capacity, and the required access and secret keys.
  • HTTP or HTTPS protocols are configured on a cloud object storage.
  • Security and firewall settings are configured properly for seamless connectivity between an IBM Storage Scale cluster and a cloud object storage.
Considerations
  • The IBM Storage Scale cluster that hosts an AFM to cloud object storage fileset and the cloud object storage are two different entities. The fileset and the cloud object storage are connected over the internet or a WAN.
  • There are no special considerations on the cluster setup for using the AFM to cloud object storage functions. AFM to cloud object storage functions are available in all editions of IBM Storage Scale. You can install IBM Storage Scale and deploy protocols either manually or by using the installation toolkit. For more information, see Steps for establishing and starting your IBM Storage Scale cluster.
  • After all nodes are upgraded to the new code, you must finalize the upgrade. For more information about the upgrade, see Completing the upgrade to a new level of IBM Storage Scale.
  • Nodes that can act as gateway nodes must be identified on the AFM to cloud object storage cluster. Gateway nodes can be configured in the cache cluster before you create filesets and start applications.
  • With AFM to cloud object storage, backend services can be accessible even though most endpoints do not respond to ping from the AFM cluster. A user with administrator privileges can set the afmSkipHomePing parameter to yes (mmchconfig afmSkipHomePing=yes). After this configuration is completed, AFM to cloud object storage does not ping the endpoints and the fileset does not go into the disconnected state.

For the AFM to cloud object storage configuration, you can use the mmafmcosaccess, mmafmcosctl, mmafmcosconfig, and mmafmcoskeys commands. After the access and secret keys are provided on the cloud object storage endpoints, these commands can be directly used to configure the AFM to cloud object storage. For more information about the cloud object storage commands, see mmafmcosaccess command, mmafmcosconfig command, mmafmcosctl command, and mmafmcoskeys command.

  1. Ensure that all nodes in the cluster are up and running with IBM Storage Scale 5.1.0 or later and that the file system level is 5.1.0 or later.
    1. To get the cluster information, issue the following command:
      # mmlscluster
      A sample output is as follows:
      GPFS cluster information
      ========================
      GPFS cluster name: afm2cos.cluster
      GPFS cluster id: 6470607479877415271
      GPFS UID domain: afm2cos.cluster
      Remote shell command: /usr/bin/ssh
      Remote file copy command: /usr/bin/scp
      Repository type: CCR
      
      GPFS cluster configuration servers:
      -----------------------------------
      Primary server:   Node3
      Secondary server: Node4
      
      Node Daemon node  IP address      Admin node  Designation
           name                         name
      ----------------------------------------------------------------------
      1   Node2         192.168.105.62  Node2      quorum-manager
      2   Node5         192.168.105.65  Node5      quorum-gateway
      3   Node4         192.168.105.64  Node4      quorum-gateway
      4   Node3         192.168.105.63  Node3      quorum-manager
      Note: In this cluster, the Node4 and Node5 gateway nodes are provided.
    2. To check the state of the nodes, issue the following command:
      # mmgetstate -a
      A sample output is as follows:
      
      Node number Node name GPFS state
      -------------------------------------------
      1           Node2     active
      2           Node5     active
      3           Node4     active
      4           Node3     active
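The two checks above can also be scripted. The following is a minimal sketch, assuming that the column layout of the mmlscluster and mmgetstate output matches the samples shown above. The function names inactive_nodes and gateway_nodes are hypothetical helpers and are not part of IBM Storage Scale.

```shell
#!/usr/bin/env bash
# Assumption: the column layout matches the sample outputs in this step.

# Print the names of nodes whose GPFS state is not "active".
# Reads `mmgetstate -a` output on stdin.
inactive_nodes() {
  awk '$1 ~ /^[0-9]+$/ && $3 != "active" { print $2 }'
}

# Print the names of nodes whose designation includes "gateway".
# Reads `mmlscluster` output on stdin.
gateway_nodes() {
  awk '$1 ~ /^[0-9]+$/ && $NF ~ /gateway/ { print $2 }'
}

# Usage on a cluster node (not run here):
#   mmgetstate -a | inactive_nodes
#   mmlscluster   | gateway_nodes
```

An empty result from inactive_nodes means that every listed node is active. A non-empty result from gateway_nodes names the nodes that can serve AFM to cloud object storage filesets.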
    3. To get the file system information, issue the following command:
      # mmlsfs fs1 -V -T
      A sample output is as follows:
      
      flag                value                    description
      ------------------- ------------------------ -----------------------------------
      -V                  24.00 (5.1.0.0)          File system version
      -T                  /gpfs/fs1                Default mount point
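The file system level requirement can be checked mechanically. The following sketch assumes that the -V line has the form shown in the sample output, with the release in parentheses. The helpers fs_release and version_ge are hypothetical names that are introduced here for illustration.

```shell
#!/usr/bin/env bash
# Assumption: the "-V" line looks like "-V  24.00 (5.1.0.0)  File system version".

# Print the release from the first "-V" line, for example "5.1.0.0".
# Reads `mmlsfs <device> -V` output on stdin.
fs_release() {
  awk '$1 == "-V" { gsub(/[()]/, "", $3); print $3; exit }'
}

# Return success if dotted version $1 is greater than or equal to $2.
# Missing components are treated as 0, so 5.1.0.0 >= 5.1.0 holds.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    for (i = 1; i <= (n > m ? n : m); i++) {
      if ((x[i] + 0) > (y[i] + 0)) exit 0
      if ((x[i] + 0) < (y[i] + 0)) exit 1
    }
    exit 0
  }'
}

# Usage on a cluster node (not run here):
#   version_ge "$(mmlsfs fs1 -V | fs_release)" 5.1.0 || echo "file system level is too old"
```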
  2. Configure cloud object storage endpoints. These endpoints can be configured by using the management or configuration system that each cloud object storage provides.
    Cloud Object Server Endpoint : http://192.168.118.121
    ACCESS_KEY : myexampleaccesskey1234$
    SECRET_KEY : myexamplesecretkey1234567890#
    Endpoint Port Number : 80
    A bucket can be created on a cloud object storage before you configure a fileset or it can be created from the IBM Storage Scale cluster.
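The endpoint, port, and bucket combine into the target URL that AFM later reports for the fileset (http://192.168.118.121:80/afmtocos1 in this example). The following sketch shows that composition. The function target_url is a hypothetical helper, and the curl call in the usage comment is only one possible way to confirm HTTP-level reachability when ping is blocked.

```shell
#!/usr/bin/env bash

# Join an endpoint, a port, and a bucket name into "scheme://host:port/bucket".
# A trailing slash on the endpoint is dropped first.
target_url() {
  local endpoint="${1%/}" port="$2" bucket="$3"
  printf '%s:%s/%s\n' "$endpoint" "$port" "$bucket"
}

# Usage (not run here; requires network access to the endpoint):
#   url=$(target_url http://192.168.118.121 80 afmtocos1)
#   curl --silent --output /dev/null --connect-timeout 5 \
#        --write-out '%{http_code}\n' "$url"
```

Any HTTP status code in the curl output, including an authorization error, indicates that the endpoint answers over HTTP.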
  3. Configure the AFM to cloud object storage by using the keys that are created or generated on the cloud object storage.
    1. To store the keys for the bucket in the IBM Storage Scale cluster, issue the following command:
      # mmafmcoskeys afmtocos1 set myexampleaccesskey1234$ myexamplesecretkey1234567890#
      Note: The bucket can exist on the cloud object storage or can be created from the AFM to cloud object storage setup.
      Where:
      afmtocos1
      Name of the bucket
      myexampleaccesskey1234$
      Access key
      myexamplesecretkey1234567890#
      Secret key
    2. To check that the correct keys are set on the bucket, issue the following command:
      # mmafmcoskeys afmtocos1 get
      A sample output is as follows:
      myexampleaccesskey1234$:myexamplesecretkey1234567890#
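The example keys end in $ and #, which are special characters in most shells. The following sketch keeps the keys in single-quoted variables and compares them with what the get operation returns. It assumes that get prints the keys as a single access:secret line, as the sample above suggests. The function keys_match is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Single quotes keep "$" and "#" literal.
ACCESS_KEY='myexampleaccesskey1234$'
SECRET_KEY='myexamplesecretkey1234567890#'

# Return success if the first line on stdin equals "<access>:<secret>".
# Assumption: `mmafmcoskeys <bucket> get` prints the keys in that form.
keys_match() {
  local expected="$1:$2" actual
  IFS= read -r actual
  [ "$actual" = "$expected" ]
}

# Usage on a cluster node (not run here):
#   mmafmcoskeys afmtocos1 set "$ACCESS_KEY" "$SECRET_KEY"
#   mmafmcoskeys afmtocos1 get | keys_match "$ACCESS_KEY" "$SECRET_KEY" && echo "keys OK"
```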
  4. After the keys are set, to configure the AFM to cloud object storage, issue the following command:
    # mmafmcosconfig fs1 afmtocos1 --endpoint http://192.168.118.121 --uid 0 --gid 0 --new-bucket afmtocos1 --mode iw --object-fs
    Where:
    fs1
    Name of a file system.
    afmtocos1
    Name of a fileset.
    iw
    The independent writer mode of the AFM to cloud object storage fileset.
    Note: This command is run on the Node5 node, which becomes a gateway node for the afmtocos1 fileset.
    After the AFM to cloud object storage setup is done, you can see information about the fileset and its AFM relation by issuing the following command:
    # mmlsfileset fs1 afmtocos1 --afm -L
    A sample output is as follows:
    Filesets in file system 'fs1':
    Attributes for fileset afmtocos1:
    ======================================================================
    Status                          Linked
    Path                            /gpfs/fs1/afmtocos1
    Id                               1
    Root inode                       524291
    Parent Id                        0
    Created                          Wed Sep 9 05:21:15 2020
    Comment
    Inode space                      1
    Maximum number of inodes         100352
    Allocated inodes                 100352
    Permission change flag           chmodAndSetacl
    afm-associated                   Yes
    Target                           http://192.168.118.121:80/afmtocos1
    Mode                             independent-writer
    File Lookup Refresh Interval     120
    File Open Refresh Interval       120
    Dir Lookup Refresh Interval      120
    Dir Open Refresh Interval        120
    Async Delay                      15 (default)
    Last pSnapId                     0
    Display Home Snapshots           no
    Parallel Read Chunk Size         0
    Number of Gateway Flush Threads  4
    Prefetch Threshold               0 (default)
    Eviction Enabled                 yes (default)
    Parallel Write Chunk Size        0
    IO Flags                         0x4000000 (afmObjectSubdir)
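To confirm the target and mode from a script, a single attribute can be pulled out of the listing. The following sketch assumes the "name, whitespace, value" layout of the sample output above. The function fileset_attr is a hypothetical helper.

```shell
#!/usr/bin/env bash

# Print the value of one attribute from `mmlsfileset ... --afm -L` output.
# $1 is the exact attribute name, for example "Target" or "Async Delay".
fileset_attr() {
  awk -v name="$1" '
    { sub(/^[ \t]+/, "") }                 # ignore leading indentation
    index($0, name " ") == 1 {             # line starts with the attribute name
      value = substr($0, length(name) + 1)
      sub(/^[ \t]+/, "", value)
      print value
      exit
    }'
}

# Usage on a cluster node (not run here):
#   mmlsfileset fs1 afmtocos1 --afm -L | fileset_attr Target
#   mmlsfileset fs1 afmtocos1 --afm -L | fileset_attr Mode
```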
  5. Create some objects by using the AFM to cloud object storage fileset.
    # dd if=/dev/urandom of=/gpfs/fs1/afmtocos1/object1 count=4 bs=256K
    # dd if=/dev/urandom of=/gpfs/fs1/afmtocos1/object2 count=8 bs=256K
    # dd if=/dev/urandom of=/gpfs/fs1/afmtocos1/object3 count=12 bs=256K
    These objects are replicated to the cloud object storage asynchronously. The cache state is Dirty while they are being replicated.

    To check the cache state, issue the following command:

    # mmafmctl fs1 getstate
    A sample output is as follows:
    
    Fileset Name Fileset Target                      Cache State   Gateway Node Queue Length Queue numExec
    ------------ ----------------------------------- ------------- ------------ ------------ -------------
    afmtocos1    http://192.168.118.121:80/afmtocos1 Dirty         c7f2n05       3            6
    When all the operations or objects that are created are synced to a cloud object storage, the cache state becomes Active. To check the cache state, issue the following command:
    # mmafmctl fs1 getstate
    A sample output is as follows:
    
    Fileset Name Fileset Target                      Cache State   Gateway Node Queue Length Queue numExec
    ------------ ----------------------------------- ------------- ------------ ------------ -------------
    afmtocos1    http://192.168.118.121:80/afmtocos1 Active        c7f2n05       0            9
    To get the fileset contents, issue the following command:
    # ls -lsh /gpfs/fs1/afmtocos1
    A sample output is as follows:
    total 5.0M
    1.0M -rw-r--r-- 1 root root 1.0M Sep 9 05:27 object1
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object2
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object3
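Because replication is asynchronous, a script that depends on the objects being on the cloud object storage can poll until the cache state becomes Active. The following sketch assumes the getstate column layout shown above, where the cache state is the third column. The functions cache_state and wait_for_active and the POLL_INTERVAL variable are hypothetical names.

```shell
#!/usr/bin/env bash

# Print the cache state of fileset $1.
# Reads `mmafmctl <device> getstate` output on stdin.
cache_state() {
  awk -v fs="$1" '$1 == fs { print $3; exit }'
}

# Poll until fileset $1 reports Active, trying at most $2 times.
# The remaining arguments are the command that prints the getstate output.
wait_for_active() {
  local fileset="$1" tries="$2"
  shift 2
  while [ "$tries" -gt 0 ]; do
    if [ "$("$@" | cache_state "$fileset")" = "Active" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-5}"   # seconds between polls
  done
  return 1
}

# Usage on a cluster node (not run here):
#   wait_for_active afmtocos1 60 mmafmctl fs1 getstate || echo "still not Active"
```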
  6. Check that objects are replicated and synchronized with a cloud object storage by using the APIs or the GUI that the cloud object storage provides.
    Example:
    Name : afmtocos1/
    Date : 2020-09-09 05:20:17 EDT
    Size : 0 B
    Type : Bucket
    Name : object1
    Date : 2020-09-09 05:24:17 EDT
    Size : 1.0 MiB
    ETag : 36b30c1b8016f0cc4ca41bec0d12f588
    Type : file
    Metadata :
    Content-Type: application/octet-stream
    Name : object2
    Date : 2020-09-09 05:24:34 EDT
    Size : 2.0 MiB
    ETag : 9d17d1fd443287a83445c4616864eb72
    Type : file
    Metadata :
    Content-Type: application/octet-stream
    Name : object3
    Date : 2020-09-09 05:24:44 EDT
    Size : 2.0 MiB
    ETag : e8b894cf47871a56f0a9c48bc99bfea6
    Type : file
    Metadata :
    Content-Type: application/octet-stream
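One way to cross-check a replicated object is to compare the ETag in the listing with a local digest. The following sketch rests on an assumption that is not stated in this procedure: many S3-compatible stores report the MD5 digest of the content as the ETag for objects that are uploaded in a single part. Multipart uploads and some encryption settings produce different ETags, so a mismatch is not proof of corruption. The function etag_matches is a hypothetical helper and relies on the md5sum utility.

```shell
#!/usr/bin/env bash

# Return success if the MD5 digest of file $1 equals ETag $2.
# Surrounding double quotes on the ETag, as some APIs return them, are removed.
# Assumption: the object was uploaded in a single part, so ETag == MD5.
etag_matches() {
  local etag="${2#\"}" digest
  etag="${etag%\"}"
  digest=$(md5sum -- "$1" | awk '{ print $1 }')
  [ "$digest" = "$etag" ]
}

# Usage on a cluster node (not run here):
#   etag_matches /gpfs/fs1/afmtocos1/object1 36b30c1b8016f0cc4ca41bec0d12f588 && echo "match"
```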
  7. Read objects that are created on the cloud object storage from the AFM to cloud object storage fileset.
    In the following example, objectcreatedonCOS1, objectcreatedonCOS2, and objectcreatedonCOS3 are new objects that are created on the cloud object storage:
    1.0MiB object1
    2.0MiB object2
    2.0MiB object3
    1.0MiB objectcreatedonCOS1
    2.0MiB objectcreatedonCOS2
    3.0MiB objectcreatedonCOS3
    To get contents of a fileset on the IBM Storage Scale cluster, issue the following command:
    # ls -lsh /gpfs/fs1/afmtocos1
    A sample output is as follows:
    total 5.0M
    1.0M -rw-r--r-- 1 root root 1.0M Sep 9 05:27 object1
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object2
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object3
    0    -rwx------ 1 root root 1.0M Sep 9 2020 objectcreatedonCOS1
    0    -rwx------ 1 root root 2.0M Sep 9 2020 objectcreatedonCOS2
    0    -rwx------ 1 root root 3.0M Sep 9 2020 objectcreatedonCOS3
    The object metadata is now visible in the AFM to cloud object storage fileset, but the object contents are not cached yet. When applications that are hosted on IBM Storage Scale read the objects, the data is pulled in on demand from the cloud object storage.
    An example of reading the objects by the applications is as follows:
    # cat /gpfs/fs1/afmtocos1/objectcreatedonCOS1 > /dev/null
    # cat /gpfs/fs1/afmtocos1/objectcreatedonCOS2 > /dev/null
    # cat /gpfs/fs1/afmtocos1/objectcreatedonCOS3 > /dev/null
    
    To get the contents of the fileset, issue the following command:
    # ls -lsh /gpfs/fs1/afmtocos1
    A sample output is as follows:
    total 11M
    1.0M -rw-r--r-- 1 root root 1.0M Sep 9 05:27 object1
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object2
    2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object3
    1.0M -rwx------ 1 root root 1.0M Sep 9 2020 objectcreatedonCOS1
    2.0M -rwx------ 1 root root 2.0M Sep 9 2020 objectcreatedonCOS2
    3.0M -rwx------ 1 root root 3.0M Sep 9 2020 objectcreatedonCOS3
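In the listings above, an object whose data is not yet cached shows 0 in the allocated-size column while its file size is nonzero. The following sketch uses that observation to list uncached objects. It assumes GNU stat, whose %b and %s format sequences print the allocated block count and the size in bytes. The functions filter_uncached and list_uncached are hypothetical helpers.

```shell
#!/usr/bin/env bash

# Read "blocks size name" records on stdin and print the names of entries
# that have a nonzero size but no allocated blocks.
filter_uncached() {
  awk '$1 == 0 && $2 > 0 { sub(/^[0-9]+ [0-9]+ /, ""); print }'
}

# List entries directly under directory $1 that look uncached.
# Assumption: GNU stat is available.
list_uncached() {
  stat -c '%b %s %n' -- "$1"/* | filter_uncached
}

# Usage on a cluster node (not run here):
#   list_uncached /gpfs/fs1/afmtocos1
```

Empty files are skipped on purpose, because a zero-length object has no data to fetch.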
  8. Download objects that are created on a cloud object storage on a priority or preference basis by using the mmafmcosctl download command.
    1. Create new objects with a .imp extension on a cloud object storage.
      1.0MiB object1
      2.0MiB object2
      2.0MiB object3
      1.0MiB objectcreatedonCOS1
      1.0MiB objectcreatedonCOS1.imp
      2.0MiB objectcreatedonCOS2
      2.0MiB objectcreatedonCOS2.imp
      3.0MiB objectcreatedonCOS3
      3.0MiB objectcreatedonCOS3.imp
    2. On the IBM Storage Scale cluster, create an object list that names the objects to download or prefetch. To view the list, issue the following command:
      # cat ObjectList
      A sample output is as follows:
      /gpfs/fs1/afmtocos1/objectcreatedonCOS1.imp
      /gpfs/fs1/afmtocos1/objectcreatedonCOS2.imp
      /gpfs/fs1/afmtocos1/objectcreatedonCOS3.imp
    3. To download the objects, issue the mmafmcosctl command:
      # mmafmcosctl fs1 afmtocos1 /gpfs/fs1/afmtocos1/ download --object-list ObjectList --data
      A sample output is as follows:
      
      Queued (Total) Failed TotalData
                           (approx in Bytes)
      0      (0)     0       0
      3      (0)     0       6291456
      Object Downloads successfully queued at the gateway.
    4. To check the cache state, issue the following command:
      # mmafmctl fs1 getstate
      A sample output is as follows:
      
      Fileset Name Fileset Target                      Cache State   Gateway Node Queue Length Queue numExec
      ------------ ----------------------------------- ------------- ------------ ------------ -------------
      afmtocos1    http://192.168.118.121:80/afmtocos1 Active        c7f2n05       0            71
    5. To get the contents of the fileset, issue the following command:
      # ls -lsh /gpfs/fs1/afmtocos1
      A sample output is as follows:
      total 17M
      1.0M -rw-r--r-- 1 root root 1.0M Sep 9 05:27 object1
      2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object2
      2.0M -rw-r--r-- 1 root root 2.0M Sep 9 05:28 object3
      1.0M -rwx------ 1 root root 1.0M Sep 9 2020 objectcreatedonCOS1
      1.0M -rwx------ 1 root root 1.0M Sep 9 2020 objectcreatedonCOS1.imp
      2.0M -rwx------ 1 root root 2.0M Sep 9 2020 objectcreatedonCOS2
      2.0M -rwx------ 1 root root 2.0M Sep 9 2020 objectcreatedonCOS2.imp
      3.0M -rwx------ 1 root root 3.0M Sep 9 2020 objectcreatedonCOS3
      3.0M -rwx------ 1 root root 3.0M Sep 9 2020 objectcreatedonCOS3.imp
    Note: With the ObjectFS mode, objects can be read on demand from a cloud object storage, whereas with the ObjectOnly mode, the download and upload operations can be used for priority data sync, depending upon the mode of the AFM to cloud object storage fileset.
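The object list for the download step is a plain text file with one full path per line. The following sketch generates such a list and estimates the amount of data that the download should queue. The functions make_object_list and total_bytes_from_mib are hypothetical helpers. The mmafmcosctl invocation in the usage comment is the one shown in the procedure.

```shell
#!/usr/bin/env bash

# Print one full path per object name. $1 is the fileset path and the
# remaining arguments are object names.
make_object_list() {
  local dir="${1%/}" name
  shift
  for name in "$@"; do
    printf '%s/%s\n' "$dir" "$name"
  done
}

# Sum object sizes that are given in whole MiB and print the total in bytes.
total_bytes_from_mib() {
  local sum=0 n
  for n in "$@"; do
    sum=$((sum + n * 1048576))
  done
  printf '%s\n' "$sum"
}

# Usage on a cluster node (not run here):
#   make_object_list /gpfs/fs1/afmtocos1 objectcreatedonCOS1.imp \
#       objectcreatedonCOS2.imp objectcreatedonCOS3.imp > ObjectList
#   mmafmcosctl fs1 afmtocos1 /gpfs/fs1/afmtocos1/ download --object-list ObjectList --data
```

For the three .imp objects of 1 MiB, 2 MiB, and 3 MiB, total_bytes_from_mib 1 2 3 prints 6291456, which agrees with the TotalData value in the sample download output.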