
Listing files exported to the cloud

This topic describes how to parse a manifest file and how to list files from the cloud.

Although files are exported to the cloud from an IBM Spectrum Scale™ environment, the files can be imported by a non-IBM Spectrum Scale application. While you export files to the cloud, a manifest file is built. The manifest file includes a list of these exported files and the metadata associated with native object storage.

When data is exported to the cloud, the manifest file is not automatically pushed to the cloud. You must decide when and where to export the manifest file.

When to transfer: If you are using a policy to export data, a good time to export the manifest is immediately after the policy has run successfully. Waiting too long can result in a manifest that is too big and that does not notify applications watching for new data on the cloud often enough. Pushing out new manifests constantly creates a different problem: applications have to deal with many small manifests and work out which ones to use.

Where to transfer: Unlike transparent cloud tiering, cloud data sharing allows data to be transferred to any container at any time. This freedom can be very useful, especially when setting up multiple tenants. A centralized manifest is useful in a single-tenant environment, but when there are multiple tenants with different access privileges to different files, it may be better to split up your manifest destinations accordingly. Export all data targeted to a particular tenant and then send the manifest. Then export the data for the next tenant, and so forth.
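The per-tenant approach can be sketched as a small planning step that builds one export command per tenant, each with its own manifest file. This is only an illustration: the tenant records, container names, tags, and paths are hypothetical, and the options are the ones that the mmcloudgateway files export examples in this topic use. The sketch builds the argument lists and does not run them.

```python
def build_export_commands(tenants):
    """Build one 'mmcloudgateway files export' argument list per tenant.

    Each tenant record (a hypothetical structure) names its own container,
    manifest file, tag, and the files that are destined for that tenant.
    """
    commands = []
    for tenant in tenants:
        argv = [
            "mmcloudgateway", "files", "export",
            "--container", tenant["container"],
            "--manifest-file", tenant["manifest"],
            "--tag", tenant["tag"],
        ]
        # File names are passed as positional arguments after the options.
        argv.extend(tenant["files"])
        commands.append(argv)
    return commands


# Hypothetical tenants, for illustration only.
tenants = [
    {"container": "tenant-a-container", "manifest": "manifest-a.txt",
     "tag": "tenant-a", "files": ["/gpfs/data/a1.csv", "/gpfs/data/a2.csv"]},
    {"container": "tenant-b-container", "manifest": "manifest-b.txt",
     "tag": "tenant-b", "files": ["/gpfs/data/b1.csv"]},
]
for argv in build_export_commands(tenants):
    print(" ".join(argv))
```

Keeping each tenant's files and manifest in one record makes it harder to mix entries from different tenants into the same manifest by accident.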

The manifest file is a text file whose entry format is as follows:
<File/Object Name> <CloudContainerName> <TagID> <TimeStamp><Newline>

Typically, this file is not accessed directly but rather is accessed using the manifest utility.

The manifest utility produces a CSV stream whose entry format is as follows:
<TagID>,<CloudContainerName>,<TimeStamp>,<File/Object Name><newline>
where,
  • TagID is an optional identifier the object is associated with.
  • CloudContainerName is the name of the container the object was exported into.
  • TimeStamp follows the format "DD MON YYYY HH:MM:SS GMT".
  • File/Object Name can contain commas, but not new line characters.
An example entry in the manifest utility stream output is as follows:
0, imagecontainer, 6 Sep 2016 20:31:45 GMT, images/a/cat.scan
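An application that consumes this stream has to allow for commas inside the file or object name. The following is a minimal sketch of one way to do that, not the utility's own parser. It assumes that the tag ID, container name, and time stamp never contain commas, and that the spaces after the commas in the example entry are padding that can be stripped. The ManifestEntry class and parse_entry function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ManifestEntry:
    tag_id: str
    container: str
    timestamp: datetime
    name: str


def parse_entry(line: str) -> ManifestEntry:
    """Parse one '<TagID>,<CloudContainerName>,<TimeStamp>,<File/Object Name>' line."""
    # Split on the first three commas only: the name is the last field
    # and may itself contain commas.
    parts = line.rstrip("\r\n").split(",", 3)
    if len(parts) != 4:
        raise ValueError(f"expected 4 comma-separated fields: {line!r}")
    # Assumption: whitespace around each field is padding, not data.
    tag_id, container, stamp, name = (p.strip() for p in parts)
    # Time stamps are documented as "DD MON YYYY HH:MM:SS GMT".
    when = datetime.strptime(stamp, "%d %b %Y %H:%M:%S GMT")
    return ManifestEntry(tag_id, container, when.replace(tzinfo=timezone.utc), name)


entry = parse_entry("0, imagecontainer, 6 Sep 2016 20:31:45 GMT, images/a/cat.scan")
print(entry.container, entry.name)
```

Limiting the split to three commas is what keeps a name such as data/a,b.csv in one piece. A general-purpose CSV reader would split such a name unless the stream quotes it, and the format described here does not mention quoting.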
You can use the mmcloudmanifest tool to parse the manifest file that is created by the mmcloudgateway files export command or by any other means. By looking at the manifest files, an application can download the desired files from the cloud.
The mmcloudmanifest tool is automatically installed on your cluster along with Transparent cloud tiering rpms. However, you must install the following packages for the tool to work:
The syntax of the tool is as follows:
mmcloudmanifest
ManifestName [--cloud --properties-file PropertiesFile --manifest-container ManifestContainer 
[--persist-path PersistPath]]
[--tag-filter TagFilter] [--container-filter ContainerFilter]
[--from-time FromTime] [--path-filter PathFilter]
[--help]
where,
  • ManifestName: Specifies the name of the manifest object that resides on the cloud. To use a local manifest file, specify the full path name of the manifest file.
  • --properties-file PropertiesFile: Specifies the location of the properties file to be used when retrieving the manifest file from the cloud. A template properties file is located at /opt/ibm/MCStore/scripts/provider.properties. This file includes details such as the name of the cloud storage provider, credentials, and URL.
  • --persist-path PersistPath: Stores a local copy of the manifest file that is retrieved from the cloud in the specified location.
  • --manifest-container ManifestContainer: Name of the container in which the manifest is located.
  • --tag-filter TagFilter: Lists only the entries whose tag ID matches the specified regular expression (regex).
  • --container-filter ContainerFilter: Lists only the entries whose container name matches the specified regex.
  • --from-time FromTime: Lists only the entries whose time stamp is at or after the specified time stamp. The time stamp must be enclosed within quotation marks, and it must be in the 'DD MON YYYY HH:MM:SS GMT' format. Example: '21 Aug 2016 06:23:59 GMT'
  • --path-filter PathFilter: Lists only the entries whose path name matches the specified regex.
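The way the filter options combine can be illustrated with a small sketch that applies the same kinds of tests to entries that are already parsed. This is not the tool's implementation. It assumes that an entry must pass every filter that is given, and that a regex matches when it is found anywhere in the field (re.search); the tool might anchor its matches differently. The select_entries function and the sample entries are hypothetical.

```python
import re
from datetime import datetime, timezone

TIME_FORMAT = "%d %b %Y %H:%M:%S GMT"  # 'DD MON YYYY HH:MM:SS GMT'


def parse_time(stamp):
    """Convert a manifest time stamp string to an aware UTC datetime."""
    return datetime.strptime(stamp, TIME_FORMAT).replace(tzinfo=timezone.utc)


def select_entries(entries, tag_filter=None, container_filter=None,
                   from_time=None, path_filter=None):
    """Return the (tag, container, stamp, path) tuples that pass every given filter."""
    cutoff = parse_time(from_time) if from_time else None
    selected = []
    for tag, container, stamp, path in entries:
        if tag_filter and not re.search(tag_filter, tag):
            continue
        if container_filter and not re.search(container_filter, container):
            continue
        # "At or after": an entry stamped exactly at the cutoff is kept.
        if cutoff and parse_time(stamp) < cutoff:
            continue
        if path_filter and not re.search(path_filter, path):
            continue
        selected.append((tag, container, stamp, path))
    return selected


# Hypothetical entries, for illustration only.
entries = [
    ("us-weather", "c1", "21 Aug 2016 06:23:59 GMT", "weather/us/a.csv"),
    ("uk-weather", "c1", "22 Aug 2016 10:00:00 GMT", "weather/uk/b.csv"),
    ("us-weather", "c2", "20 Aug 2016 23:59:59 GMT", "weather/us/c.csv"),
]
for row in select_entries(entries, tag_filter="us-weather",
                          from_time="21 Aug 2016 06:23:59 GMT"):
    print(row)
```

Comparing parsed time stamps rather than raw strings matters here, because a textual comparison of 'DD MON YYYY' values does not sort chronologically.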
The following command exports five CSV files tagged with "us-weather", along with the manifest file, "manifest.txt", to the cloud:
mmcloudgateway files export --container arn8781724981111500553 --manifest-file manifest.txt 
--tag us-weather /gpfs/weather_data/MetData_Oct06-2016-Oct07-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct07-2016-Oct08-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct08-2016-Oct09-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct09-2016-Oct10-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct10-2016-Oct11-2016-ALL.csv
The following command exports five CSV files tagged with "uk-weather", along with the manifest file, "manifest.txt", to the cloud:
mmcloudgateway files export --container arn8781724981111500553 --manifest-file manifest.txt 
--tag uk-weather /gpfs/weather_data/MetData_Oct06-2016-Oct07-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct07-2016-Oct08-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct08-2016-Oct09-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct09-2016-Oct10-2016-ALL.csv 
/gpfs/weather_data/MetData_Oct10-2016-Oct11-2016-ALL.csv

So, the container "arn8781724981111500553" contains both US and UK weather data.

The following command parses the manifest file and imports the files that are tagged with "us-weather" to the local file system under the /gpfs directory:
mmcloudmanifest parse-manifest manifest.txt --tag-filter us-weather |
xargs mmcloudgateway files import --directory /gpfs --container arn8781724981111500553
You can verify these files by using the following command:
ls -l /gpfs
The system displays output similar to this:
total 64
drwxr-xr-x. 2 root root  4096 Oct  5 07:09 automountdir
-rw-r--r--. 1 root root  7859 Oct 18 02:15 MetData_Oct06-2016-Oct07-2016-ALL.csv
-rw-r--r--. 1 root root  7859 Oct 18 02:15 MetData_Oct07-2016-Oct08-2016-ALL.csv
-rw-r--r--. 1 root root 14461 Oct 18 02:15 MetData_Oct08-2016-Oct09-2016-ALL.csv
-rw-r--r--. 1 root root 14382 Oct 18 02:15 MetData_Oct09-2016-Oct10-2016-ALL.csv
-rw-r--r--. 1 root root 14504 Oct 18 02:15 MetData_Oct10-2016-Oct11-2016-ALL.csv
drwxr-xr-x. 2 root root  4096 Oct 17 14:12 weather_data
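When the selected entries span more than one container, an application might group them by container before importing. The following sketch shows that grouping step under stated assumptions: build_import_commands is a hypothetical helper, the second container name and the file names in the sample data are made up, and the --directory and --container options are used as in the import example above. The sketch only builds the argument lists.

```python
from collections import defaultdict


def build_import_commands(entries, directory):
    """Group (container, name) pairs by container and build one
    'mmcloudgateway files import' argument list per container."""
    by_container = defaultdict(list)
    for container, name in entries:
        by_container[container].append(name)
    return [
        ["mmcloudgateway", "files", "import",
         "--directory", directory, "--container", container, *names]
        for container, names in by_container.items()
    ]


# Hypothetical selection, for illustration only.
selected = [
    ("arn8781724981111500553", "MetData_Oct06-2016-Oct07-2016-ALL.csv"),
    ("other-container", "report 2016.csv"),
    ("arn8781724981111500553", "MetData_Oct07-2016-Oct08-2016-ALL.csv"),
]
for argv in build_import_commands(selected, "/gpfs"):
    print(argv)
```

Each list can be passed to subprocess.run(argv, check=True) on a node where mmcloudgateway is installed. Passing an argument list rather than a shell string keeps names that contain spaces intact.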