Migrating project and space assets (Watson Studio)

Use cpd-cli commands to export and import project and space assets within a single Cloud Pak for Data cluster or between different clusters.

These commands use the project and space export functionality.

Attention:

You cannot import the assets of one source container into multiple target containers. You can, however, import multiple source containers into one target container. This restriction ensures that cross-container relationships are handled properly.

Prerequisites

Required roles

To complete this task, you must have one of the following roles on Red Hat OpenShift:

  • Cluster administrator
  • Instance administrator

In addition, you must have the following permission in Cloud Pak for Data:

  • Platform administration

Supported for: The migration of project and space assets is supported for these Cloud Pak for Data versions:

  • Export assets from Cloud Pak for Data
  • Import assets to Cloud Pak for Data

Before you begin:

  • Create an installation environment variables file and source the environment variables before you run the commands in this task. See Setting up environment variables.

  • Make sure that the prerequisite tasks are completed and the export and import utility is initialized. See Migrating data between CPD installations.

  • Check whether the catalog-api-aux auxiliary module is available. Run the following command to determine which auxiliary modules are installed:

    cpd-cli export-import list aux-modules \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --arch=${CPU_ARCH}
    
  • Review the restrictions for project export available at Requirements and restrictions in the Cloud Pak for Data documentation.
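
The catalog-api-aux check in the list above can be scripted. The following is a minimal sketch, assuming only that the listing printed by cpd-cli export-import list aux-modules contains the module name as plain text. The has_aux_module function name is hypothetical, and no particular output layout is assumed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the module name given as $1 appears
# anywhere in the text read from standard input (fixed-string match).
has_aux_module() {
  grep -q -F -- "$1"
}

# Usage (assumes the environment variables for this task are sourced):
#   cpd-cli export-import list aux-modules \
#     --namespace=${PROJECT_CPD_INST_OPERANDS} \
#     --profile=${CPD_PROFILE_NAME} \
#     --arch=${CPU_ARCH} | has_aux_module catalog-api-aux \
#     && echo "catalog-api-aux is available"
```

Because the helper only searches for the name, it also matches longer names that contain catalog-api-aux, so review the full listing if the result is unexpected.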

Procedure

Exporting project and space assets

To export project and space assets from the Watson Studio service, complete the following steps:

  1. Log in to Red Hat OpenShift as a user with sufficient permissions to complete the task.

    ${OC_LOGIN}
    
    Remember: The ${OC_LOGIN} variable is an alias for the oc login command. For more information, see Setting up environment variables.
  2. Set your profile based on the instance of Cloud Pak for Data that you want to export data from:

    cpd-cli config profiles set ${CPD_PROFILE_NAME} \
    --user ${LOCAL_USER} \
    --url ${CPD_PROFILE_URL}
    
  3. Determine the scope of the export. If you want to export all assets from all projects and spaces, go to step 4. If you want to select which assets to export, create an export.yaml file that contains the export specification, with the following structure:

    catalog-api-aux:
      # exportspec specifies which assets to export
      exportspec: '<EXPORT_JSON_STRING>'
    

    Specify which assets to export by replacing the <EXPORT_JSON_STRING> variable with a JSON string.

    • Export all assets from the list of specified projects or spaces:
    '{
      "project": { <<---- this is the type of asset container: project or space
        "container_specs": [
          {
            "guids": [
              "00cf5b102-17b3-4638-95df-309a2a443137",  <<---- these are the ids of asset containers
              "e14a187d-4fc8-4a5d-aee1-e2ad4ec60b29"
            ],
            "all_assets": true
          }
        ]
      },
      "catalog": { <<---- this is the type of asset container: project or space
        "container_specs": [
          {
            "guids": [
              "ab3d18a4-015f-4bb1-ae13-4ee9faf0382a"  <<---- these are the ids of asset containers
            ],
            "all_assets": true
          }
        ]
      }
    }'
    
    • Export specific assets from the list of specified projects or spaces:
    '{
      "space": { <<---- this is the type of asset container: project or space
        "container_specs": [
          {
            "guids": [
              "4b014869-fc0e-4919-bc93-3fa8c40bcd37"  <<---- this is the id of asset container
            ],
            "asset_specs": {
              "asset_ids": [
                "e14a187d-4fc8-4a5d-aee1-e2ad4ec60b29",  <<---- these are the ids of assets
                "e2395865-5503-49da-b708-cfbf2fb1b646"
              ]
            }
          }
        ]
      }
    }'
    
    • Export assets of specific types from the list of specified projects or spaces:
    '{
      "space": { <<---- this is the type of asset container: project or space
        "container_specs": [
          {
            "guids": [
              "4b014869-fc0e-4919-bc93-3fa8c40bcd37",  <<---- these are the ids of asset containers
              "866613f8-4102-4cd2-be6e-ee46ea6c468b"
            ],
            "asset_specs": {
              "asset_types": [
                "data_asset",  <<---- these are the types of assets
                "notebook"
              ]
            }
          }
        ]
      }
    }'
    
  4. Change to the directory where the cpd-cli command-line interface is installed.

  5. Set the following environment variables:

    1. Set the EXPORT_NAME environment variable to the name that you want to use to identify the export job:
    export EXPORT_NAME=<name>
    
    2. If you want to export specific assets, set the EXPORT_YAML_FILE_LOCATION environment variable to the fully qualified path of the YAML file that contains the export specification:
    export EXPORT_YAML_FILE_LOCATION=<fully-qualified-path-to-YAML-file>
    
  6. Run the appropriate export command for your environment:

    • To export all of the assets, run:
    cpd-cli export-import export create ${EXPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --component=catalog-api \
    --arch=${CPU_ARCH}
    
    • To export specific assets, run:
    cpd-cli export-import export create ${EXPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --component=catalog-api \
    --values=${EXPORT_YAML_FILE_LOCATION} \
    --arch=${CPU_ARCH}
    
  7. Wait for the export to complete. To check the export status, run:

    cpd-cli export-import export status ${EXPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --arch=${CPU_ARCH}
    
  8. If you plan to import the data to a different cluster, download the result of the export:

    cpd-cli export-import export download ${EXPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --arch=${CPU_ARCH}
    

    The name of the exported package has the format: cpd-exports-<export_name>-<timestamp>-data.tar
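
The export specification file that is described in step 3 can also be generated from the shell. The following is a minimal sketch, assuming a POSIX shell. The write_export_yaml function name and the output location under ${TMPDIR:-/tmp} are hypothetical, and the GUID is the sample value from the examples in this task. The <<---- annotations from the examples are explanatory only and are omitted here so that the string is valid JSON.

```shell
#!/bin/sh
# Hypothetical helper: writes an export specification that exports all
# assets from a single container.
#   $1 = container type ("project" or "space")
#   $2 = container GUID
#   $3 = output file path
write_export_yaml() {
  cat > "$3" <<EOF
catalog-api-aux:
  # exportspec specifies which assets to export
  exportspec: '{"$1": {"container_specs": [{"guids": ["$2"], "all_assets": true}]}}'
EOF
}

# Sample GUID taken from the examples in this task; replace it with your own.
EXPORT_YAML_FILE_LOCATION="${TMPDIR:-/tmp}/export.yaml"
write_export_yaml space 4b014869-fc0e-4919-bc93-3fa8c40bcd37 "${EXPORT_YAML_FILE_LOCATION}"
export EXPORT_YAML_FILE_LOCATION

# Print the generated file so that it can be reviewed before the export runs.
cat "${EXPORT_YAML_FILE_LOCATION}"
```

After the file is reviewed, the export command for specific assets can read it through the --values option.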

Importing project and space assets

To import assets to the Watson Studio service, complete the following steps:

  1. Follow the appropriate guidance based on where you want to import the data:

    • If you are importing the data to a Watson Studio service instance on a different instance of Cloud Pak for Data on the same cluster:
    1. Ensure that you source the correct environment variables file.
    2. Set your profile based on the instance of Cloud Pak for Data that you want to import data to:
      cpd-cli config profiles set ${CPD_PROFILE_NAME} \
      --user ${LOCAL_USER} \
      --url ${CPD_PROFILE_URL}
      
    • If you are importing the data to a Watson Studio service instance on a different cluster:
    1. Ensure that you source the correct environment variables file.

    2. Log in to the Red Hat OpenShift Container Platform cluster that you want to import data to as a user with sufficient permissions to complete the task.

      ${OC_LOGIN}
      
      Remember: The ${OC_LOGIN} variable is an alias for the oc login command. For more information, see Setting up environment variables.
    3. Set your profile based on the Cloud Pak for Data instance:

      cpd-cli config profiles set ${CPD_PROFILE_NAME} \
      --user ${LOCAL_USER} \
      --url ${CPD_PROFILE_URL}
      
  2. Determine the scope of the import. If you want to import all assets from all projects and spaces, go to step 3. If you want to select which assets to import, create an import.yaml file that contains the import specification, with the following structure:

    catalog-api-aux:
      admin_username: cpadmin
      # importspec specifies which assets to import
      importspec: '<IMPORT_JSON_STRING>'
    

    Replace the <IMPORT_JSON_STRING> variable with a JSON string that specifies assets to import.

    The admin_username that you specify will be the owner of the imported assets, and must be set to the instance administrator.

    Add the following import specifications to your YAML file to select which assets to import:

    • containerIds: import specific source assets into specific existing containers in the new cluster. The value is a JSON map of source: target.

    Example:

    '{ "containerIds": { "default": "<target-id1>", "<original-id>": "<target-id2>" } }'
    

    In the user interface, you can find the container ID in the URL of the project.

    You can also find the IDs with the following cpd-cli commands:

    • cpd-cli project list, which returns a list of projects with their IDs
    • cpd-cli asset search, which returns a list of assets with their IDs

    For more information about the cpd-cli command line tool, see cpd-cli command reference.

    • newContainerNameOverrides: import containers into new containers. You can give specific names to containers before they have been created. Otherwise, new containers will be given default names. The value is a JSON map of source: target.

      Example:

      { "newContainerNameOverrides": { "<source-id1>": "<target-name1>", "<source-id2>": "<target-name2>" } }
      
    • container_suffix: import containers into new containers. You can give a suffix to the container names before they are created. Otherwise, new containers are given default names.

      Example:

      { "container_suffix": "-migration" }
      
    • duplicate_action: specify which action to take when a duplicate asset is found during asset creation. Choose one of four values: "UPDATE", "REJECT", "REPLACE", or "IGNORE".

      Note:

      duplicate_action is not supported for the connection asset type.

      Example:

      { "duplicate_action": "REPLACE" }
      
  3. Change to the directory where the cpd-cli command-line interface is installed.

  4. Set the IMPORT_NAME environment variable to the name that you want to use to identify the import job:

    export IMPORT_NAME=<name>

    If you created an import.yaml file, also set the IMPORT_YAML_FILE_LOCATION environment variable to the fully qualified path of that file. The import command in the next step reads the file through the --values option.
    
  5. Run the following command to import the data to the Watson Studio service:

    cpd-cli export-import import create ${IMPORT_NAME} \
    --from-export=${EXPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --values=${IMPORT_YAML_FILE_LOCATION} \
    --arch=${CPU_ARCH} \
    --backoff-limit=0
    
  6. Wait for the import to complete. To check the import status, run:

    cpd-cli export-import import status ${IMPORT_NAME} \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --arch=${CPU_ARCH}
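
The import specification file that is described in step 2 can be generated in the same way. The following is a minimal sketch, assuming a POSIX shell and a single containerIds mapping. The write_import_yaml function name and the output location under ${TMPDIR:-/tmp} are hypothetical, and the IDs are the placeholders from the containerIds example, not real values.

```shell
#!/bin/sh
# Hypothetical helper: writes an import specification that maps one source
# container to one existing target container.
#   $1 = source container ID
#   $2 = target container ID
#   $3 = output file path
write_import_yaml() {
  cat > "$3" <<EOF
catalog-api-aux:
  admin_username: cpadmin
  # importspec specifies which assets to import
  importspec: '{ "containerIds": { "$1": "$2" } }'
EOF
}

IMPORT_YAML_FILE_LOCATION="${TMPDIR:-/tmp}/import.yaml"
# Placeholder IDs; replace them with real source and target container IDs.
write_import_yaml "<original-id>" "<target-id2>" "${IMPORT_YAML_FILE_LOCATION}"
export IMPORT_YAML_FILE_LOCATION

# Print the generated file so that it can be reviewed before the import runs.
cat "${IMPORT_YAML_FILE_LOCATION}"
```

Exporting IMPORT_YAML_FILE_LOCATION here makes the path available to the --values option of the import command.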