This blog post shows how to migrate all buckets from one IBM Cloud Object Storage (COS) instance to another within the US region.

The migration is performed by a Python script. The source COS instance is the instance you are migrating from, and the target (destination) COS instance is the instance you are migrating to. The script uses the ibm-cos-sdk and ibm-platform-services SDKs.

Prerequisites

  • Make sure you have at least two COS instances on the same IBM Cloud account
  • Install Python
  • Make sure you have the necessary permissions to do the following:
    • Create buckets
    • Modify buckets
    • Create IAM policy for COS instances
  • Install the required Python libraries:
    • ibm-cos-sdk for Python: pip3 install ibm-cos-sdk
    • ibm-platform-services: pip3 install ibm-platform-services

Set environment variables for the script

The following are the environment variables that the script uses (a quick validation sketch follows the list):

  • IBMCLOUD_API_KEY=<ibmcloud_api_key>
  • SERVICE_INSTANCE_ID=<source_cos_instance_guid>
  • DEST_SERVICE_INSTANCE_ID=<target_cos_instance_guid>
  • US_GEO=<us_cos_endpoint>
  • IAM_POLICY_MANAGEMENT_URL="https://iam.cloud.ibm.com"
  • IAM_POLICY_MANAGEMENT_AUTHTYPE="iam"
  • IAM_POLICY_MANAGEMENT_APIKEY=<ibmcloud_api_key>
  • IAM_ACCOUNT_ID=<iam_account_id>
  • SUFFIX=<target_instance_suffix>
  • DISABLE_RULES=false
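Before running the script, it can help to verify that every variable is actually set. The following is a minimal sketch (not part of the migration script itself) that checks the list above:

import os

# Names mirror the environment variable list above
REQUIRED_VARS = [
    "IBMCLOUD_API_KEY", "SERVICE_INSTANCE_ID", "DEST_SERVICE_INSTANCE_ID",
    "US_GEO", "IAM_POLICY_MANAGEMENT_URL", "IAM_POLICY_MANAGEMENT_AUTHTYPE",
    "IAM_POLICY_MANAGEMENT_APIKEY", "IAM_ACCOUNT_ID", "SUFFIX", "DISABLE_RULES",
]

missing = [name for name in REQUIRED_VARS if not os.environ.get(name)]
if missing:
    raise SystemExit("Missing environment variables: " + ", ".join(missing))
print("All required environment variables are set.")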

You can create and download your IBM Cloud API key in the IBM Cloud console at Manage > Access (IAM) > API keys.

You can find the GUID for the source and target instances in the IBM Cloud console resource list. Type in the name of each COS instance and click the white part of the instance's row to retrieve its GUID.
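If you prefer to look up the GUIDs programmatically, the following is a minimal sketch using the Resource Controller API from the ibm-platform-services SDK; the instance name my-cos-instance is a placeholder you would replace with your own.

import os
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_platform_services import ResourceControllerV2

# Authenticate with the same IBM Cloud API key the migration script uses
authenticator = IAMAuthenticator(os.environ['IBMCLOUD_API_KEY'])
client = ResourceControllerV2(authenticator=authenticator)

# "my-cos-instance" is a placeholder; filter instances by name and print their GUIDs
result = client.list_resource_instances(name="my-cos-instance").get_result()
for instance in result.get("resources", []):
    print(instance["name"], instance["guid"])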

To find your US COS endpoint, click your source COS instance in the Resource List in the navigation menu. Then, click Endpoints and make sure the location dropdown says us-geo. Select the region that your buckets are in, and make sure to prepend https:// to the value in the environment variable.

Leave the values for the IAM_POLICY_MANAGEMENT_URL, IAM_POLICY_MANAGEMENT_AUTHTYPE and DISABLE_RULES as is.

The IAM_POLICY_MANAGEMENT_APIKEY takes the same value as your ibmcloud_api_key, and the iam_account_id is your IBM Cloud account ID, which you can find in the console under Manage > Account > Account settings.

The suffix is appended to the name of each newly created bucket, since bucket names must be globally unique. For example, if SUFFIX is set to dr, a source bucket named my-bucket is created as my-bucket-dr in the target instance.

Run the script

After the environment variables have been set, you can run the script. The full code of the script is below; save it to a file (for example, migrate.py) and run it with python3.

import os
import ibm_boto3
from ibm_botocore.client import Config
from ibm_platform_services import IamPolicyManagementV1


# this is the suffix used for the new naming convention of buckets
suffix="-"+os.environ['SUFFIX']

iamAccountID=os.environ.get('IAM_ACCOUNT_ID')
# function to get region of a bucket from its LocationConstraint
def getBucketRegion(locationConstraint):
    if locationConstraint in ("us-smart", "us-standard", "us-vault", "us-cold"):
        return "us-geo"
    if locationConstraint in ("us-east-smart", "us-east-standard", "us-east-vault", "us-east-cold"):
        return "us-east"
    if locationConstraint in ("us-south-smart", "us-south-standard", "us-south-vault", "us-south-cold"):
        return "us-south"
    return ""

# function to get region of the URL endpoint (public, private and direct endpoints map to the same region)
def getUrlRegion():
    endpoint_url=os.environ['US_GEO']
    if endpoint_url in ("https://s3.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.us.cloud-object-storage.appdomain.cloud"):
        return "us-geo"
    if endpoint_url in ("https://s3.dal.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.dal.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.dal.us.cloud-object-storage.appdomain.cloud"):
        return "dallas"
    if endpoint_url in ("https://s3.wdc.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.wdc.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.wdc.us.cloud-object-storage.appdomain.cloud"):
        return "washington"
    if endpoint_url in ("https://s3.sjc.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.sjc.us.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.sjc.us.cloud-object-storage.appdomain.cloud"):
        return "san jose"
    if endpoint_url in ("https://s3.us-east.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.us-east.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.us-east.cloud-object-storage.appdomain.cloud"):
        return "us-east"
    if endpoint_url in ("https://s3.us-south.cloud-object-storage.appdomain.cloud",
                        "https://s3.private.us-south.cloud-object-storage.appdomain.cloud",
                        "https://s3.direct.us-south.cloud-object-storage.appdomain.cloud"):
        return "us-south"
    return ""

# function to list buckets in the configured region
def get_buckets1(type,cos):
    bucketNames=[]
    try:
        buckets=cos.list_buckets()["Buckets"]
    except Exception as e:
        print("Error: Unable to get COS Buckets.",e)
        return bucketNames
    for bucket in buckets:
        try:
            request=cos.get_bucket_location(Bucket=bucket["Name"])
            bucketLocation=request["LocationConstraint"]
        except:
            #this except accounts for when the bucket is not in the targeted region
            bucketLocation=""

        if getUrlRegion()==getBucketRegion(bucketLocation):
            if type == "target":
                bucketNames.append(bucket["Name"])
            else:
                # source bucket names get the suffix appended so the migrated
                # copies have globally unique names
                bucketNames.append(bucket["Name"]+suffix)
    return bucketNames


# function to create buckets
def create_buckets(targetBucketNames):
    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    location = getUrlRegion()+"-smart"
    for bucketName in targetBucketNames:
        try:
            destCos.create_bucket(Bucket=bucketName,  CreateBucketConfiguration={
                'LocationConstraint': location
            })
            print("Created bucket:",bucketName)
        except Exception as e:
            print("ERROR: Unable to create bucket.",e)

def migrateBuckets():
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    # Getting all source buckets 
    sourceBucketNames=get_buckets1("source",cos)
    print("All buckets from source instance from "+getUrlRegion()+" region:",sourceBucketNames)
    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )

    # Getting all target buckets to avoid duplicates
    targetBucketNames=get_buckets1("target",destCos)
    print("All buckets from target instance from "+getUrlRegion()+" region:",targetBucketNames)
    # excluding buckets that already exist on the target instance
    targetBucketNames=[x for x in sourceBucketNames if x not in targetBucketNames]
    print("Buckets to create on target instance:",targetBucketNames)

    # creating buckets on target cos instance
    create_buckets(targetBucketNames)

# function to list buckets in the configured region (names without the suffix)
def get_buckets2(type,cos):
    bucketNames=[]
    try:
        buckets=cos.list_buckets()["Buckets"]
    except Exception as e:
        print("Error: Unable to get COS Buckets.",e)
        return bucketNames
    for bucket in buckets:
        try:
            request=cos.get_bucket_location(Bucket=bucket["Name"])
            bucketLocation=request["LocationConstraint"]
        except:
            #this except accounts for when the bucket is not in the targeted region
            bucketLocation=""
        if getUrlRegion()==getBucketRegion(bucketLocation):
            bucketNames.append(bucket["Name"])
    return bucketNames

#function to add replication rules to buckets
def addReplicationRules(buckets,targetID,cos):
    status='Enabled'
    if os.environ['DISABLE_RULES']=="true":
        status='Disabled'
    # this is the suffix used for the new naming convention of buckets
    suffix="-"+os.environ['SUFFIX']
    for bucket in buckets:
        try:
            cos.put_bucket_replication(Bucket=bucket, ReplicationConfiguration={
                'Rules': [
                    {
                        'Priority': 0,
                        'Status': status,
                        'Filter': {},
                        'Destination': {
                            # CRN of the matching bucket in the target instance
                            'Bucket': 'crn:v1:bluemix:public:cloud-object-storage:global:a/'+iamAccountID+':'+targetID+':bucket:'+bucket+suffix,
                        },
                        'DeleteMarkerReplication': {
                            'Status': 'Enabled'
                        }
                    },
                ]
            })
            if os.environ['DISABLE_RULES']!="true":
                print("Added replication rule to bucket",bucket)
            else:
                print("Disabled replication rule for bucket",bucket)
        except Exception as e:
            print("Error: Unable to add replication rule to bucket",bucket,e)

# function to enable versioning on buckets (versioning is required for replication)
def enableVersioning(buckets,cos):
    for bucket in buckets:
        try:
            cos.put_bucket_versioning(
                Bucket=bucket,
                VersioningConfiguration={
                    'Status': 'Enabled'
                }
            )
            print("Versioning enabled for bucket",bucket)
        except Exception as e:
            print("Error: Unable to enable versioning for bucket",bucket,e)

#function to create an IAM policy that allows the source cos instance to write data to the target instance
def addAuthorization(sourceID,targetID):
    try:
        #Create IAM client
        service_client = IamPolicyManagementV1.new_instance()
        service_client.create_policy(
            type="authorization",
            subjects=[{"attributes":[
                {"name": "accountId","value":iamAccountID},
                {"name": "serviceName", "value": "cloud-object-storage"},
                {"name": "serviceInstance", "value":sourceID}]}],
            roles=[{"role_id": "crn:v1:bluemix:public:iam::::serviceRole:Writer"}],
            resources=[{"attributes":[
                {"name": "accountId","value":iamAccountID},
                {"name": "serviceName","value": "cloud-object-storage"},
                {"name": "serviceInstance", "value":targetID}]}])
        print("Created authorization policy")
    except Exception as e:
        print("Warning: Unable to create policy. Please ignore this if the policy already exists.",e)

def addReplicationRulesToMigratedBuckets():
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    sourceCosInstanceID=os.environ['SERVICE_INSTANCE_ID']

    # Getting all source buckets 
    sourceBucketNames=get_buckets2("source",cos)


    #enable versioning for both cos instances
    print("enable versioning for source instances")
    enableVersioning(sourceBucketNames,cos)

    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    targetCosInstanceId=os.environ['DEST_SERVICE_INSTANCE_ID']
    targetBucketNames = get_buckets2("target",destCos)
    print("enable versioning for target instances")
    enableVersioning(targetBucketNames,destCos)

    #add authorization from source cos instance to target cos instance
    addAuthorization(sourceCosInstanceID,targetCosInstanceId)

    #add replication rules to buckets
    addReplicationRules(sourceBucketNames,targetCosInstanceId,cos)





def copy_in_place(bucket):
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    cosObjects=cos.list_objects(Bucket=bucket)
    if "Contents" not in cosObjects:
        print("Source bucket",bucket,"is empty, skipping.")
        return

    print("Priming existing objects in " + bucket + " for replication...")


    paginator = cos.get_paginator('list_objects_v2')
    pages = paginator.paginate(Bucket=bucket)

    
    for page in pages:
        for obj in page.get('Contents', []):
            key = obj['Key']
            print("  * Copying " + key + " in place...")
            try:            
                headers = cos.head_object(
                    Bucket=bucket,
                    Key=key
                    )
                
                md = headers["Metadata"]
                
                cos.copy_object(
                    CopySource={
                        'Bucket': bucket,
                        'Key': key
                        },
                    Bucket=bucket,
                    Key=key,
                    TaggingDirective='COPY',
                    MetadataDirective='REPLACE',
                    Metadata=md
                    )
                print("    Success!")
            except Exception as e:
                print("    Unable to copy object: {0}".format(e))
    print("Existing objects in " + bucket + " are now subject to replication rules.")

def replicateExistingFiles():

    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )

    # Getting all source buckets 
    sourceBucketNames=get_buckets2("source",cos)
    print("All source buckets to replicate",sourceBucketNames)

    # Copy data from source to target bucket
    for bucket in sourceBucketNames:
        copy_in_place(bucket)

# main
migrateBuckets()
addReplicationRulesToMigratedBuckets()
if os.environ['DISABLE_RULES']!="true":
    replicateExistingFiles()

COS instance migration script

This script was designed to help users migrate one COS instance to another instance on the same account within a US region. The functions called in the main section execute in the following order:

  • migrateBuckets function: This function gathers all buckets from the source COS instance and creates them in the target COS instance. Each newly created target bucket has the configured suffix appended to its name.
  • addReplicationRulesToMigratedBuckets function: This function adds replication rules to the source buckets so that data written to or removed from them is mirrored to the target buckets after the rule is applied. To make this possible, the script enables versioning on both the source and target buckets; versioning keeps a history of every object in a bucket and is required for replication. The script also creates an IAM policy across the entire source and destination instances so that the source buckets are allowed to write to their respective target buckets. Make sure DISABLE_RULES is set to false.
  • replicateExistingFiles function: As mentioned above, replication only applies to objects added or deleted after the rule has been set. If you want to transfer files that existed before the rule was applied, make sure DISABLE_RULES is set to false to activate this function, which copies each existing object in place so the replication rules pick it up. A spot-check sketch follows this list.
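Once the script has run, you can spot-check that objects have landed in a migrated bucket. Below is a minimal sketch, assuming the same environment variables as the main script; the bucket name my-bucket is a placeholder for one of your source bucket names.

import os
import ibm_boto3
from ibm_botocore.client import Config

# Client for the target (destination) COS instance
destCos = ibm_boto3.client("s3",
                    ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                    ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                    config=Config(signature_version="oauth"),
                    endpoint_url=os.environ['US_GEO'])

# "my-bucket" is a placeholder; the migrated copy carries the configured suffix
targetBucket = "my-bucket-" + os.environ['SUFFIX']
response = destCos.list_objects_v2(Bucket=targetBucket)
for obj in response.get("Contents", []):
    print(obj["Key"], obj["Size"])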

Disable replication rules

If you want to disable the replication rules for the buckets, set DISABLE_RULES to true and run the script again.
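To confirm what state a rule is in after re-running the script, you can read the replication configuration back from a source bucket. Below is a minimal sketch; the bucket name my-bucket is again a placeholder for one of your source bucket names.

import os
import ibm_boto3
from ibm_botocore.client import Config

# Client for the source COS instance
cos = ibm_boto3.client("s3",
                    ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                    ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                    config=Config(signature_version="oauth"),
                    endpoint_url=os.environ['US_GEO'])

# "my-bucket" is a placeholder; prints each rule's destination CRN and status
config = cos.get_bucket_replication(Bucket="my-bucket")
for rule in config["ReplicationConfiguration"]["Rules"]:
    print(rule["Destination"]["Bucket"], rule["Status"])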

Conclusion

By following these steps, you can migrate buckets from one IBM Cloud Object Storage (COS) instance to another in a US region, one region at a time.

If you have any questions, you can reach out to me on LinkedIn.
