Transparent cloud tiering issues

This topic describes the common issues (along with workarounds) that you might encounter while using Transparent cloud tiering.

Migration/Recall failures

If a migration or recall fails, clear the condition that caused the failure, and then retry the policy or CLI command that failed, up to two times. Retrying is safe because the Transparent cloud tiering service is idempotent.
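The retry guidance can be sketched as a small shell helper. This is a minimal illustration and not part of the product: the function name retry_tct and the RETRY_PAUSE_SECONDS variable (with its 30-second default) are assumptions made for this example.

```shell
# Hypothetical helper: run a command and, if it fails, retry it up to two more times.
# RETRY_PAUSE_SECONDS is an assumed variable; the pause defaults to 30 seconds.
retry_tct() {
    _retries_left=2
    while :; do
        _status=0
        "$@" || _status=$?
        if [ "$_status" -eq 0 ]; then
            return 0
        fi
        if [ "$_retries_left" -le 0 ]; then
            # Both retries are used up: report the last exit status.
            return "$_status"
        fi
        _retries_left=$((_retries_left - 1))
        sleep "${RETRY_PAUSE_SECONDS:-30}"
    done
}
```

Pass the policy or CLI command that failed, with its original arguments, as the arguments of retry_tct. The helper does not clear the underlying failure condition, so do that first.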

mmcloudgateway: Internal Cloud services returned an error: MCSTG00098I: Unable to reconcile /ibm/fs1 - probably not a space managed file system.

This typically happens if an administrator ran the mmcloudgateway account delete command earlier and did not restart the service before invoking the migrate, reconcile, or other similar commands. If a migration, reconcile, or any other Cloud services command fails with this message, restart Cloud services once by using the mmcloudgateway service restart {-N node-class} command, and then retry the command.
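The restart-and-retry sequence might be scripted roughly as follows. This is a sketch under stated assumptions: the function names and the TCT_NODE_CLASS variable are hypothetical, and only the mmcloudgateway service restart -N invocation is taken from the text above.

```shell
# Return 0 if the captured command output contains the MCSTG00098I message.
needs_service_restart() {
    case "$1" in
        *MCSTG00098I*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: run a command; if it fails with MCSTG00098I,
# restart Cloud services once and run the command one more time.
# TCT_NODE_CLASS is an assumed variable that holds the node class name.
run_with_restart() {
    _rc=0
    _out=$("$@" 2>&1) || _rc=$?
    if [ "$_rc" -ne 0 ] && needs_service_restart "$_out"; then
        mmcloudgateway service restart -N "$TCT_NODE_CLASS" || return $?
        _rc=0
        _out=$("$@" 2>&1) || _rc=$?
    fi
    printf '%s\n' "$_out"
    return "$_rc"
}
```

The wrapper restarts the service at most once, which matches the guidance to restart Cloud services once before retrying.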

Starting or stopping Transparent cloud tiering service fails with the Transparent cloud tiering seems to be in startup phase message

This typically occurs when the Gateway service is killed manually by using the kill command instead of being shut down gracefully by using the mmcloudgateway service stop command.

Adding a cloud account to configure IBM Cloud Object Storage fails with the following error: 56: Cloud Account Validation failed. Invalid credential for Cloud Storage Provider. Details: Endpoint URL Validation Failed, invalid username or password.

Ensure that the appropriate user role is set through the IBM Cloud® Object Storage dsNet Manager GUI.

HTTP Error 401 Unauthorized exception while you configure a cloud account

This issue happens when the clocks on the object storage server and the Gateway node are not synchronized. Synchronize the time with an NTP server and retry the operation.
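A quick way to reason about clock drift is to compare two epoch timestamps, one from the Gateway node and one from the object storage server. The helpers below are a hypothetical sketch: the function names are invented for this example, and the acceptable tolerance is an assumption that you must choose for your environment.

```shell
# Print the absolute difference, in seconds, between two epoch timestamps.
clock_skew_seconds() {
    _diff=$(( $1 - $2 ))
    if [ "$_diff" -lt 0 ]; then
        _diff=$(( 0 - _diff ))
    fi
    printf '%s\n' "$_diff"
}

# Return 0 if the two timestamps differ by no more than the tolerance (third argument).
clocks_in_sync() {
    [ "$(clock_skew_seconds "$1" "$2")" -le "$3" ]
}
```

For example, clocks_in_sync "$(date +%s)" "$storage_epoch" 300 checks the local clock against a timestamp obtained from the object storage server. Both the storage_epoch variable and the 300-second tolerance are assumptions for illustration.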

Account creation command fails after a long wait and IBM Cloud Object Storage displays an error message saying that the vault cannot be created; but the vault is created

When you look at the IBM Cloud Object Storage manager UI, you see that the vault exists. This problem can occur if Transparent cloud tiering does not receive a successful return code from IBM Cloud Object Storage for the vault creation request.

The most common reason for this problem is that the threshold setting on the vault template is incorrect. If you have 6 IBM Cloud Object Storage slicestors and the write threshold is 6, then IBM Cloud Object Storage expects that all the slicestors are healthy. Check the IBM Cloud Object Storage manager UI. If any slicestors are in a warning or error state, update the threshold of the vault template.
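The threshold reasoning can be expressed as a simple comparison. This is a simplified model for illustration only; the function name is hypothetical, and it assumes that a write succeeds only when the number of healthy slicestors is at least the write threshold.

```shell
# Return 0 if the number of healthy slicestors meets the write threshold.
# Simplified model: arguments are <healthy_slicestors> <write_threshold>.
vault_write_possible() {
    _healthy=$1
    _write_threshold=$2
    [ "$_healthy" -ge "$_write_threshold" ]
}
```

With 6 slicestors and a write threshold of 6, vault_write_possible 6 6 succeeds, but vault_write_possible 5 6 fails as soon as one slicestor is in a warning or error state.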

Account creation command fails with error MCSTG00065E, but the data vault and the metadata vault exist

The full error message is as follows:
MCSTG00065E: Command Failed with following reason: Error checking existence of, or creating, 
cloud container container_name or cloud metadata container container_name.meta.
But the data vault and the metadata vault are visible on the IBM Cloud Object Storage UI.
This error can occur if the metadata vault was created but its name index is disabled. To resolve this problem, do one of the following actions:
  • Enter the command again with a new vault name and vault template.
  • Delete the vault on the IBM Cloud Object Storage UI and run the command again with the correct --metadata-location.
Note: It is a good practice to disable the name index of the data vault. The name index of the metadata vault must be enabled.

File or metadata transfer fails with koffLimitedRetryHandler:logError - Cannot retry after server error, command has exceeded retry limit, followed by RejectingHandler:exceptionCaught - Caught an exception com.ibm.gpfsconnector.messages.GpfsConnectorException: Unable to migrate

This is most likely caused by a network connectivity or bandwidth issue. Make sure that the network is functioning properly and retry the operation. For policy-initiated migrations, the IBM Storage Scale policy scan might automatically retry the migration of the affected files on a subsequent run.

gpfs.snap: An Error was detected on node XYZ while invoking a request to collect the snap file for Transparent cloud tiering: (return code: 137).

If the gpfs.snap command fails with this error, increase the value of the timeout parameter by using the gpfs.snap --timeout Seconds option.

Note: If the Transparent cloud tiering log collection fails after the default timeout period expires, you can increase the timeout value and collect the TCT logs. The default timeout is 300 seconds (or 5 minutes).
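As a sketch, a wrapper can guard against passing a timeout that is no larger than the default. The function name and the SNAP_CMD variable (used here only to allow a dry run) are assumptions; the --timeout option and the 300-second default come from the text above.

```shell
# Default gpfs.snap timeout in seconds, as stated in this topic.
DEFAULT_SNAP_TIMEOUT=300

# Hypothetical wrapper: run gpfs.snap with a timeout larger than the default.
# SNAP_CMD is an assumed override for dry runs (for example, SNAP_CMD=echo).
run_snap_with_timeout() {
    _snap_timeout=$1
    if [ "$_snap_timeout" -le "$DEFAULT_SNAP_TIMEOUT" ]; then
        echo "Timeout must be greater than $DEFAULT_SNAP_TIMEOUT seconds" >&2
        return 2
    fi
    ${SNAP_CMD:-gpfs.snap} --timeout "$_snap_timeout"
}
```

For example, run_snap_with_timeout 600 runs gpfs.snap --timeout 600, which doubles the default timeout.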

Migration fails with error: MCSTG00008E: Unable to get fcntl lock on inode. Another MCStore request is running against this inode.

This happens when another application has the file open while Cloud services is trying to migrate it.

Connect: No route to host Cannot connect to the Transparent Cloud Tiering service. Please check that the service is running and that it is reachable over the network. Could not establish a connection to the MCStore server

If this error occurs during any data command, it is caused by an abrupt shutdown of Cloud services on one of the nodes. This happens when Cloud services is not explicitly stopped on a node by using the mmcloudgateway service stop command, but the node loses power or the IBM Storage Scale daemon is taken down. As a result, the IP address of the node is still considered to belong to an active Cloud services node, and data commands that are routed to it fail with this error.

"Generic_error" in the mmcloudgateway service status output

This error indicates that the cloud object storage is unreachable. Ensure that there is outbound connectivity to the cloud object storage. The logs contain an exception that describes the failure.

An unexpected exception occurred during directory processing : Input/output error

You might encounter this error while migrating files to the cloud storage tier. To fix this, check the status of NSDs and ensure that the database/journal files are not corrupted and can be read from the file system.

It is marked for use by Transparent Cloud Tiering

You might encounter this error when you try to remove a Cloud services node from a cluster. To resolve this, use the --force option with the mmchnode command as follows:
mmchnode --cloud-gateway-disable -N nodename --cloud-gateway-nodeclass nodeclass --force

Container deletion fails

You might encounter this error when you try to remove a container pairset that is marked for Cloud services. To resolve this, use the --force option with the mmcloudgateway command as follows:
mmcloudgateway containerpairset delete --container-pair-set-name x13 --cloud-nodeclass cloud --force