April 14, 2023 By Daroush Renoit 6 min read

This blog post shows how to migrate all buckets from one IBM Cloud Object Storage (COS) instance to another within the US region.

This is done with a Python script. The source IBM Cloud Object Storage (COS) instance is the instance from which you are migrating, and the target (destination) COS instance is the one to which you are migrating. The script uses the ibm-cos-sdk and ibm-platform-services SDKs. An example of the architecture is below:

Prerequisites

  • Make sure you have at least two COS instances on the same IBM Cloud account
  • Install Python
  • Make sure you have the necessary permissions to do the following:
    • Create buckets
    • Modify buckets
    • Create IAM policy for COS instances
  • Install libraries for Python
    • ibm-cos-sdk for python: pip3 install ibm-cos-sdk
    • ibm-platform-services: pip3 install ibm-platform-services

Set environment variables for the script

The following are the environment variables that the scripts use:

  • IBMCLOUD_API_KEY=<ibmcloud_api_key>
  • SERVICE_INSTANCE_ID=<source_cos_instance_guid>
  • DEST_SERVICE_INSTANCE_ID=<target_cos_instance_guid>
  • US_GEO=<us_cos_endpoint>
  • IAM_POLICY_MANAGEMENT_URL="https://iam.cloud.ibm.com"
  • IAM_POLICY_MANAGEMENT_AUTHTYPE="iam"
  • IAM_POLICY_MANAGEMENT_APIKEY=<ibmcloud_api_key>
  • IAM_ACCOUNT_ID=<iam_account_id>
  • SUFFIX=<target_instance_suffix>
  • DISABLE_RULES=false
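For local experimentation, the variables above can also be set from Python itself before the script runs. The values below are placeholders copied from the list above, not working credentials; substitute your own:

```python
import os

# Placeholder values for illustration only -- replace each with your own.
os.environ.update({
    "IBMCLOUD_API_KEY": "<ibmcloud_api_key>",
    "SERVICE_INSTANCE_ID": "<source_cos_instance_guid>",
    "DEST_SERVICE_INSTANCE_ID": "<target_cos_instance_guid>",
    "US_GEO": "https://s3.us.cloud-object-storage.appdomain.cloud",
    "IAM_POLICY_MANAGEMENT_URL": "https://iam.cloud.ibm.com",
    "IAM_POLICY_MANAGEMENT_AUTHTYPE": "iam",
    "IAM_POLICY_MANAGEMENT_APIKEY": "<ibmcloud_api_key>",
    "IAM_ACCOUNT_ID": "<iam_account_id>",
    "SUFFIX": "<target_instance_suffix>",
    "DISABLE_RULES": "false",
})
```

Setting them in your shell with `export` works equally well; the script only reads them through `os.environ`.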

You can create and download your IBM Cloud API key in the IBM Cloud console at Manage > Access (IAM) > API keys.

You can find the GUIDs for the source and target instances in the cloud console resource list. Type the name of each COS instance into the filter and click the blank area of the instance's row to open a panel showing the GUID.

To find your US COS endpoint, click on your source COS instance from the Resource List in the navigation menu. Then, click on Endpoints and make sure the Select location dropdown says us-geo. Select the region that your buckets are in and make sure to prepend https:// in the environment variable.

Leave the values for IAM_POLICY_MANAGEMENT_URL, IAM_POLICY_MANAGEMENT_AUTHTYPE, and DISABLE_RULES as they are.

The iam_account_id is the ID of the IBM Cloud account that owns both instances; you can find it in the console under Manage > Account > Account settings.

The suffix is appended to the name of each newly created bucket, since bucket names must be globally unique.
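As a quick illustration of this naming convention (using a hypothetical suffix value and bucket name), the script builds the target bucket name like so:

```python
import os

os.environ["SUFFIX"] = "v2"  # hypothetical example value
suffix = "-" + os.environ["SUFFIX"]

# A source bucket named "my-app-logs" becomes "my-app-logs-v2" on the target instance.
target_name = "my-app-logs" + suffix
print(target_name)  # my-app-logs-v2
```

Pick a suffix that keeps the combined name within COS bucket naming rules (lowercase, 3 to 63 characters).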

Run the script

After the environment variables have been set, you can run the script. The full script is shown below.

import os
import ibm_boto3
from ibm_botocore.client import Config
from ibm_platform_services import IamPolicyManagementV1


# this is the suffix used for the new naming convention of buckets
suffix="-"+os.environ['SUFFIX']

iamAccountID=os.environ.get('IAM_ACCOUNT_ID')
# function to get region of a bucket
def getBucketRegion(locationConstraint):
    if locationConstraint == "us-smart" or locationConstraint == "us-standard" or locationConstraint == "us-vault" or locationConstraint == "us-cold":
        return "us-geo"
    if locationConstraint == "us-east-smart" or locationConstraint == "us-east-standard" or locationConstraint == "us-east-vault" or locationConstraint == "us-east-cold":
        return "us-east"
    if locationConstraint == "us-south-smart" or locationConstraint == "us-south-standard" or locationConstraint == "us-south-vault" or locationConstraint == "us-south-cold":
        return "us-south"
    return ""

# function to get region of the URL endpoint
def getUrlRegion():
    endpoint_url=os.environ['US_GEO']
    if endpoint_url=="https://s3.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.us.cloud-object-storage.appdomain.cloud":
        return "us-geo"
    if endpoint_url=="https://s3.dal.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.dal.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.dal.us.cloud-object-storage.appdomain.cloud":
        return "dallas"
    if endpoint_url=="https://s3.wdc.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.wdc.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.wdc.us.cloud-object-storage.appdomain.cloud":
        return "washington"
    if endpoint_url=="https://s3.sjc.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.sjc.us.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.sjc.us.cloud-object-storage.appdomain.cloud":
        return "san jose"
    if endpoint_url=="https://s3.us-east.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.us-east.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.us-east.cloud-object-storage.appdomain.cloud":
        return "us-east"
    if endpoint_url=="https://s3.us-south.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.private.us-south.cloud-object-storage.appdomain.cloud" or endpoint_url=="https://s3.direct.us-south.cloud-object-storage.appdomain.cloud":
        return "us-south"
    return ""

# function to list buckets in the configured region
def get_buckets1(instanceType,cos):
    bucketNames=[]
    try:
        buckets=cos.list_buckets()["Buckets"]
    except Exception as e:
        print("Error: Unable to get COS Buckets.",e)
        return bucketNames
    for bucket in buckets:
        try:
            request=cos.get_bucket_location(Bucket=bucket["Name"])
            bucketLocation=request["LocationConstraint"]
        except Exception:
            # this except accounts for when the bucket is not in the targeted region
            bucketLocation=""

        if instanceType=="target" and getUrlRegion()==getBucketRegion(bucketLocation):
            bucketNames.append(bucket["Name"])
        elif getUrlRegion()==getBucketRegion(bucketLocation):
            # source bucket names get the suffix appended for the target instance
            bucketNames.append(bucket["Name"]+suffix)
    return bucketNames


# function to create buckets
def create_buckets(targetBucketNames):
    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    location =getUrlRegion()+"-smart"
    for bucketName in targetBucketNames:
        try:
            destCos.create_bucket(Bucket=bucketName,  CreateBucketConfiguration={
                'LocationConstraint': location
            })
            print("Created bucket:",bucketName)
        except Exception as e:
            print("ERROR: Unable to create bucket.",e)

def migrateBuckets():
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    # Getting all source buckets 
    sourceBucketNames=get_buckets1("source",cos)
    print("All buckets from source instance from "+getUrlRegion()+" region:",sourceBucketNames)
    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )

    # Getting all target buckets to avoid duplicates
    targetBucketNames=get_buckets1("target",destCos)
    print("All buckets from target instance from "+getUrlRegion()+" region:",targetBucketNames)
    # excluding buckets that already exists
    targetBucketNames=[x for x in sourceBucketNames if x not in targetBucketNames]
    print("All buckets from target instance without duplicates:",targetBucketNames)

    # creating buckets on target cos instance
    create_buckets(targetBucketNames)



# function to list buckets in the configured region
def get_buckets2(instanceType,cos):
    bucketNames=[]
    try:
        buckets=cos.list_buckets()["Buckets"]
    except Exception as e:
        print("Error: Unable to get COS Buckets.",e)
        return bucketNames
    for bucket in buckets:
        try:
            request=cos.get_bucket_location(Bucket=bucket["Name"])
            bucketLocation=request["LocationConstraint"]
        except Exception:
            # this except accounts for when the bucket is not in the targeted region
            bucketLocation=""
        if getUrlRegion()==getBucketRegion(bucketLocation):
            bucketNames.append(bucket["Name"])

    return bucketNames

#function to add replication rules to buckets
def addReplicationRules(buckets,targetID,cos):
    status='Enabled'
    if os.environ['DISABLE_RULES']=="true":
        status='Disabled'
    # this is the suffix used for the new naming convention of buckets
    suffix="-"+os.environ['SUFFIX']
    for bucket in buckets:
        try:
            cos.put_bucket_replication(Bucket=bucket, ReplicationConfiguration={
                'Rules': [
                    {
                        'Priority': 0,
                        'Status': status,
                        'Filter': {},
                        'Destination': {
                            'Bucket': 'crn:v1:bluemix:public:cloud-object-storage:global:a/'+iamAccountID+':'+targetID+':bucket:'+bucket+suffix,
                        },
                        'DeleteMarkerReplication': {
                            'Status': 'Enabled'
                        }
                    },
                ]
            })
            if os.environ['DISABLE_RULES']!="true":
                print("added replication rule to bucket",bucket)
            else:
                print("disabled replication rule to bucket",bucket)
        except Exception as e:
            print("Error: Unable to add replication rule to bucket",bucket,e)

# function to enable versioning on buckets
def enableVersioning(buckets,cos):
    for bucket in buckets:
        try:
            cos.put_bucket_versioning(
                Bucket=bucket,
                VersioningConfiguration={
                    'Status': 'Enabled'
                }
            )
            print("versioning enabled for bucket",bucket)
        except Exception as e:
            print("Error: Unable to enable versioning for bucket",bucket,e)

# function to create an IAM policy allowing the source COS instance to write data to the target instance
def addAuthorization(sourceID,targetID):
    try:
        # Create IAM policy management client
        service_client = IamPolicyManagementV1.new_instance()
        service_client.create_policy(
            type="authorization",
            subjects=[{"attributes":[
                {"name": "accountId","value": iamAccountID},
                {"name": "serviceName","value": "cloud-object-storage"},
                {"name": "serviceInstance","value": sourceID}]}],
            roles=[{"role_id": "crn:v1:bluemix:public:iam::::serviceRole:Writer"}],
            resources=[{"attributes":[
                {"name": "accountId","value": iamAccountID},
                {"name": "serviceName","value": "cloud-object-storage"},
                {"name": "serviceInstance","value": targetID}]}])
        print("created authorization policy")
    except Exception as e:
        print("Warning: Unable to create policy. Please ignore if the policy already exists.",e)

def addReplicationRulesToMigratedBuckets():
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    sourceCosInstanceID=os.environ['SERVICE_INSTANCE_ID']

    # Getting all source buckets 
    sourceBucketNames=get_buckets2("source",cos)


    #enable versioning for both cos instances
    print("enable versioning for source instances")
    enableVersioning(sourceBucketNames,cos)

    # Destination cos client connection
    destCos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['DEST_SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    targetCosInstanceId=os.environ['DEST_SERVICE_INSTANCE_ID']
    targetBucketNames = get_buckets2("target",destCos)
    print("enable versioning for target instances")
    enableVersioning(targetBucketNames,destCos)

    #add authorization from source cos instance to target cos instance
    addAuthorization(sourceCosInstanceID,targetCosInstanceId)

    #add replication rules to buckets
    addReplicationRules(sourceBucketNames,targetCosInstanceId,cos)



# function to list buckets in the configured region
def get_buckets3(instanceType,cos):
    bucketNames=[]
    try:
        buckets=cos.list_buckets()["Buckets"]
    except Exception as e:
        print("Error: Unable to get COS Buckets.",e)
        return bucketNames
    for bucket in buckets:
        try:
            request=cos.get_bucket_location(Bucket=bucket["Name"])
            bucketLocation=request["LocationConstraint"]
        except Exception:
            # this except accounts for when the bucket is not in the targeted region
            bucketLocation=""
        if getUrlRegion()==getBucketRegion(bucketLocation):
            bucketNames.append(bucket["Name"])

    return bucketNames


def copy_in_place(bucket):
    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )
    cosObjects=cos.list_objects(Bucket=bucket)
    if "Contents" not in cosObjects:
        print("source bucket is empty")
        return

    print("Priming existing objects in " + bucket + " for replication...")

    paginator = cos.get_paginator('list_objects_v2')
    pages = paginator.paginate(Bucket=bucket)

    for page in pages:
        for obj in page['Contents']:
            key = obj['Key']
            print("  * Copying " + key + " in place...")
            try:            
                headers = cos.head_object(
                    Bucket=bucket,
                    Key=key
                    )
                
                md = headers["Metadata"]
                
                cos.copy_object(
                    CopySource={
                        'Bucket': bucket,
                        'Key': key
                        },
                    Bucket=bucket,
                    Key=key,
                    TaggingDirective='COPY',
                    MetadataDirective='REPLACE',
                    Metadata=md
                    )
                print("    Success!")
            except Exception as e:
                print("    Unable to copy object: {0}".format(e))
    print("Existing objects in " + bucket + " are now subject to replication rules.")

def replicateExistingFiles():

    # Create client connection
    cos = ibm_boto3.client("s3",
                        ibm_api_key_id=os.environ.get('IBMCLOUD_API_KEY'),
                        ibm_service_instance_id=os.environ['SERVICE_INSTANCE_ID'],
                        config=Config(signature_version="oauth"),
                        endpoint_url=os.environ['US_GEO']
                        )

    # Getting all source buckets 
    sourceBucketNames=get_buckets3("source",cos)
    print("All source buckets to replicate",sourceBucketNames)

    # Copy data from source to target bucket
    for bucket in sourceBucketNames:
        copy_in_place(bucket)

# main
migrateBuckets()
addReplicationRulesToMigratedBuckets()
if os.environ['DISABLE_RULES']!="true":
    replicateExistingFiles()

COS instance migration script

This script was designed to help users migrate one COS instance to another instance on the same account within a US region. The function calls in the main section are executed in the following order:

  • migrateBuckets function: This function gathers all buckets from the source COS instance and creates them in the target COS instance. Each newly created target bucket has the suffix appended to its name.
  • addReplicationRulesToMigratedBuckets function: This function adds replication rules to the source buckets so that data is written to the corresponding target buckets whenever data is added or removed after the rule is applied. To make this possible, the script enables versioning on both the source and target buckets; versioning, which keeps a history of every object in a bucket, is required for replication. The script also creates an IAM policy between the source and destination instances to allow the source buckets to write to their respective target buckets. Make sure DISABLE_RULES is set to false.
  • replicateExistingFiles function: As mentioned above, replication only applies to objects added or deleted after the rule has been set. If you want to transfer files that existed before the rule was applied, make sure DISABLE_RULES is set to false to activate this function, which copies every existing object in place so that it becomes subject to the replication rules.

Disable replication rules

If you want to disable the replication rules for the buckets, set DISABLE_RULES to true and run the script again.
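The DISABLE_RULES flag maps directly onto the replication rule's Status field; this small sketch mirrors the toggle logic in addReplicationRules:

```python
import os

os.environ["DISABLE_RULES"] = "true"  # set to "true" on the second run to disable the rules

status = 'Enabled'
if os.environ['DISABLE_RULES'] == "true":
    status = 'Disabled'
print(status)  # Disabled
```

Running the script with DISABLE_RULES=true therefore re-applies every rule with Status set to Disabled, and also skips the replicateExistingFiles step.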

Conclusion

By following these steps, you can migrate buckets from one IBM Cloud Object Storage (COS) instance to another within a US region.

If you have any questions, you can reach out to me on LinkedIn.
