Migrating catalog assets (IBM Knowledge Catalog)

Use cpd-cli commands to export and import catalog assets within a single Cloud Pak for Data cluster or between different clusters.

Prerequisites

Required roles
To complete this task, you must have one of the following roles:
  • Cluster administrator
  • Instance administrator
Limitations
  • Catalog access control and settings cannot be exported or imported.
  • Asset notes cannot be migrated.
  • Asset reviews and ratings cannot be migrated.

Before you begin

Before you can use the export import utility, complete the following tasks:

  1. Verify that you created a cpd-cli profile on your workstation. For more information, see Creating a profile to use the cpd-cli management commands.

  2. Verify that the required persistent volume claim (PVC) exists in the operands project for the instance by running the following command:

    oc get pvc export-import-pvc --namespace=${PROJECT_CPD_INST_OPERANDS}

    If the PVC doesn't exist, see Preparing to use the Cloud Pak for Data export and import utility.

  3. Initialize the export import utility on your workstation. For more information, see Initializing the export import utility.

  4. Verify that the catalog-api-aux auxiliary module is available by running the following command:

    cpd-cli export-import list aux-modules \
    --namespace=${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME} \
    --arch=${CPU_ARCH}
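
The two command-line checks in this list can be combined into one helper. The following is a minimal sketch, not an official script: the function name check_migration_prereqs is hypothetical, and it assumes that the module name catalog-api-aux appears verbatim in the output of the list aux-modules command.

```shell
# check_migration_prereqs: return 0 only if the PVC and the aux module are found.
# Assumes PROJECT_CPD_INST_OPERANDS, CPD_PROFILE_NAME, and CPU_ARCH are set.
check_migration_prereqs() {
  # The PVC name and the oc command come from step 2.
  if ! oc get pvc export-import-pvc \
      --namespace="${PROJECT_CPD_INST_OPERANDS}" >/dev/null 2>&1; then
    echo "export-import-pvc not found" >&2
    return 1
  fi

  # Assumption: the module name appears verbatim in the command output.
  if ! cpd-cli export-import list aux-modules \
      --namespace="${PROJECT_CPD_INST_OPERANDS}" \
      --profile="${CPD_PROFILE_NAME}" \
      --arch="${CPU_ARCH}" 2>/dev/null | grep -q "catalog-api-aux"; then
    echo "catalog-api-aux module not listed" >&2
    return 1
  fi

  echo "prerequisites look satisfied"
}
```

Call check_migration_prereqs before you start an export. A nonzero return code means that one of the checks failed, and the message on standard error names the check.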
    

Supported asset types

You can export and import the following catalog assets:

  • connection
  • data_asset
  • BI reports
  • Logical data models (LDMs)
  • Physical data models (PDMs)
  • Custom asset types
Warning:

Migrating data source definitions is currently not supported. For more information, see Known issues for IBM Knowledge Catalog.

Selecting the scope of the export

If you want to export all assets from all catalogs, projects, and spaces, go to Exporting catalog assets.

If you want to select which assets to export, use the following steps:

  1. Define the name of the YAML file:

    export OVERRIDE_YAML_FILE=<fully-qualified-file-name>
    
  2. Create an export.yaml file with the following structure:

    catalog-api-aux:
      # exportspec specifies which assets to export
       exportspec: 'EXPORT_JSON_STRING'
    
  3. Specify which assets to export by replacing the EXPORT_JSON_STRING variable with a JSON string. If you want to export all assets from a certain workspace, your string would resemble this example:

    catalog-api-aux:
       exportspec: '{"catalog": {    "container_specs": [{    "guids": ["08f248d3-0c5f-4193-a16d-e3c0cc7f7d94"],    "all_assets": true}  ]} }'
    
    Attention: The JSON string must be on a single line and enclosed in single quotation marks. For clarity, the following examples show JSON strings on separate lines. The <<---- annotations in the examples are explanatory and are not part of the JSON.

    If you want to export certain assets, use the code in the following examples:

    • Export all assets from the list of specified catalogs, projects, or spaces:

      '{
        "project": { <<---- this is the type of asset container: catalog, project, or space
          "container_specs": [
            {
              "guids": [
                "00cf5b102-17b3-4638-95df-309a2a443137",  <<---- these are the ids of asset containers
                "e14a187d-4fc8-4a5d-aee1-e2ad4ec60b29"
              ],
              "all_assets": true
            }
          ]
        },
        "catalog": { <<---- this is the type of asset container: catalog, project, or space
          "container_specs": [
            {
              "guids": [
                "ab3d18a4-015f-4bb1-ae13-4ee9faf0382a"  <<---- these are the ids of asset containers
              ],
              "all_assets": true
            }
          ]
        }
      }'
      
    • Export specific assets from the list of specified catalogs, projects, or spaces:

      '{
        "space": { <<---- this is the type of asset container: catalog, project, or space
          "container_specs": [
            {
              "guids": [
                "4b014869-fc0e-4919-bc93-3fa8c40bcd37"  <<---- this is the id of asset container
              ],
              "asset_specs": {
                "asset_ids": [
                  "e14a187d-4fc8-4a5d-aee1-e2ad4ec60b29",  <<---- these are the ids of assets
                  "e2395865-5503-49da-b708-cfbf2fb1b646"
                ]
              }
            }
          ]
        }
      }'
      
    • Export assets of specific types from the list of specified catalogs, projects, or spaces:

      '{
        "space": { <<---- this is the type of asset container: catalog, project, or space
          "container_specs": [
            {
              "guids": [
                "4b014869-fc0e-4919-bc93-3fa8c40bcd37",  <<---- these are the ids of asset containers
                "866613f8-4102-4cd2-be6e-ee46ea6c468b"
              ],
              "asset_specs": {
                "asset_types": [
                  "data_asset",  <<---- these are the types of assets
                  "notebook"
                ]
              }
            }
          ]
        }
      }'
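
Because the JSON string must end up on one line, it can be convenient to keep a readable multi-line JSON file and collapse it when you generate the YAML file. The following is a minimal sketch under that assumption. The function name write_export_yaml and the file names are hypothetical, and the input file must contain plain JSON without the explanatory annotations that are shown in the examples.

```shell
# write_export_yaml SPEC_JSON_FILE OUTPUT_YAML_FILE
# Collapses a readable multi-line JSON file into one line and wraps it in
# single quotation marks under the exportspec key.
write_export_yaml() {
  spec_file=$1
  out_file=$2
  # JSON string values cannot contain raw line breaks, so deleting them
  # does not change the meaning. Single quotation marks are doubled
  # because that is how YAML escapes them inside a single-quoted scalar.
  one_line=$(tr -d '\r\n' < "$spec_file" | sed "s/'/''/g")
  printf "catalog-api-aux:\n  exportspec: '%s'\n" "$one_line" > "$out_file"
}
```

For example, write_export_yaml spec.json "${OVERRIDE_YAML_FILE}" writes a two-line YAML file with the structure that is shown in step 2.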
      

Exporting catalog assets

To export catalog assets from the IBM Knowledge Catalog service, complete the following tasks:

  1. Define the name of the export task:

    export EXPORT_NAME=<name>
    
  2. Go to the directory where the cpd-cli command-line interface is installed and run the appropriate export command for your scope:

    • If you want to export all assets from all catalogs, projects, and spaces, use the following command:
      cpd-cli export-import export create ${EXPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
        --component catalog-api \
        --profile=${CPD_PROFILE_NAME}
      
    • If you want to select which assets to export, reference the YAML file created in Selecting the scope of the export in the following command:
      cpd-cli export-import export create ${EXPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
        --component catalog-api \
        --values ${OVERRIDE_YAML_FILE} \
        --profile=${CPD_PROFILE_NAME}
      
  3. Wait for the export to complete. You can check the export status by using this command:

    cpd-cli export-import export status ${EXPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME}
    
  4. Download the export result:

    cpd-cli export-import export download ${EXPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME}
    

    The name of the exported package has this format: cpd-exports-<export_name>-<timestamp>-data.tar

  5. Set an environment variable to define the export package name:

    export WKC_ASSET_EXPORT=<export_file_name>
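
Because the package name includes a timestamp, you might prefer to look up the file name instead of typing it. The following is a minimal sketch that relies only on the name format shown in step 4. The function name latest_export_archive is hypothetical, and the sketch assumes that you pass the directory where the download command saved the package.

```shell
# latest_export_archive EXPORT_NAME [DIRECTORY]
# Prints the most recently modified archive that matches the documented
# name format, or returns 1 if no matching file is found.
latest_export_archive() {
  name=$1
  dir=${2:-.}
  # ls -t sorts by modification time, newest first.
  archive=$(ls -t "$dir"/cpd-exports-"$name"-*-data.tar 2>/dev/null | head -n 1)
  [ -n "$archive" ] || return 1
  printf '%s\n' "$archive"
}
```

A possible use is export WKC_ASSET_EXPORT=$(latest_export_archive "${EXPORT_NAME}"). Note that the pattern also matches exports whose names merely begin with the same text followed by a hyphen, so check the result if your export names share a prefix.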
    

Selecting the scope of the import

If you want to import all assets from all catalogs, projects, and spaces, go to Importing catalog assets.

If you want to select which assets to import, create a YAML file with the import specification by using the following steps:

  1. Create an import.yaml file with the following structure:

    catalog-api-aux:
      # importspec specifies which assets to import
      importspec: 'IMPORT_JSON_STRING'
    
  2. Replace the IMPORT_JSON_STRING variable with a JSON string that specifies the assets to import. The admin_username that you specify will be the owner of the imported assets, and must be set to the instance administrator.

    Example:

    catalog-api-aux:
      admin_username: cpadmin
      importspec: '{"catalog": {    "container_specs": [{    "guids": ["08f248d3-0c5f-4193-a16d-e3c0cc7f7d94"],    "all_assets": true}  ]} }'
    
    

    You can add the following import specifications to your YAML file to select which assets to import or to change the asset ownership:

    • containerIds: import specific source catalogs into specific existing containers in the new cluster. The value is a JSON map of source container IDs to target container IDs.

      Example:

      '{ "containerIds": { "default": "<target-id1>", "<original-id>": "<target-id2>" } }'
      
    • newContainerNameOverrides: import containers into new containers. You can give specific names to containers before they are created. Otherwise, new containers are given default names. The value is a JSON map of source container IDs to target container names.

      Example:

      '{ "newContainerNameOverrides": { "<source-id1>": "<target-name1>", "<source-id2>": "<target-name2>" } }'
      
    • members: set the default membership of the imported assets when a catalog is imported. The catalog's own membership settings are updated automatically to achieve the specified asset membership. Previous membership information is removed, and the provided groups are set as members. The admin_username must be set to the instance administrator.

      Example:

      catalog-api-aux:
       admin_username: test-user-dd
       importspec: '{ "containerIds": {"default": "7345d4bc-b377-41e1-98b2-1e79767f8d97"}, "members":    { "default": [{"access_group_id": "10010", "roles": ["VIEWER"]}, {"access_group_id": "20010", "roles":   ["EDITOR"]}, {"access_group_id": "30010", "roles": ["OWNER"]}]} }'
      
    • duplicate_action: specify which action to take during asset creation. If no action is specified, "IGNORE" is used by default, and duplicates are saved as new assets. The following values are available:

      • "UPDATE" - Update the duplicate with the incoming changes.
      • "REJECT" - Do not import the asset if a duplicate is found.
      • "REPLACE" - Overwrite the duplicate with the new asset.
      • "IGNORE" - Ignore the duplicates and create a new asset.
      Attention:

      duplicate_action is not supported for the connection asset type, connection asset type with a dynamic IP address, asset reviews, ratings, and notes.

      Example:

       importspec: '{ "duplicate_action": "REPLACE" }'
      

      In the following example, the duplicate action for the specified source container is set to "UPDATE" through the duplicate_actions map. All other containers use the default "IGNORE" handling, which allows duplicates.

        importspec: '{ "containerIds": { "default": "${GUID}" }, "duplicate_actions": {"<source_container_id>": "UPDATE","default":"IGNORE"}}'
      
    • default_user: set the default user of the imported assets when a catalog is imported.

    • default_group: set the default group of the imported assets when a catalog is imported.

    • fail_on_user_mismatch: controls what happens when a user does not exist on the target system. If the flag is set to false, the default_user that is provided with the import is used instead. If the flag keeps its default value of true, the import fails.

    • fail_on_group_mismatch: controls what happens when a group does not exist on the target system. If the flag is set to false, the default_group that is provided with the import is used instead. If the flag keeps its default value of true, the import fails.

      Example:

       importspec: '{ "containerIds": {"default": "7345d4bc-b377-41e1-98b2-1e79767f8d97"}, "fail_on_user_mismatch": false, "fail_on_group_mismatch": false, "default_user":  "1000331093", "default_group": "10075" }'
      

      The user and group matching is based on the user name and the group name. If you do not want to migrate asset ownership, you can set the retain_asset_member parameter to false and provide the members in the import specification.

    • user_map: specify this parameter if you want to change the asset ownership. If user_map is not specified, the imported assets still belong to the same users and user groups, matched by user name and group name.

      Example:

      You want to export a catalog from SystemA and import it to SystemB.

      • SystemA has the following three users and two user groups:

      users:
      admin: 1000330992
      user1: 1000330991
      user2: 1000331004

      groups:
      group1: 10013
      group2: 10014

      There are three assets with the following ownership in the catalog:

      asset1 - belongs to admin and group1
      asset2 - belongs to user1 and group2
      asset3 - belongs to user2 and group1

      • SystemB has the following four users and three user groups:

      users:
      admin: 1000330993
      user1: 1000330995
      user2: 1000331006
      user3: 1000331001

      groups:
      group1: 10011
      group2: 10014
      group3: 10018

      If user_map is not specified, the asset ownership after the import remains as outlined earlier in the example, matched by user name and group name:

      asset1 - belongs to admin and group1
      asset2 - belongs to user1 and group2
      asset3 - belongs to user2 and group1

      However, if SystemB no longer has user1 and you want asset2 to be owned by user3 on SystemB after the import, you can specify the following mapping:

       "user_map":{"users":{"1000330992":"1000330993","1000330991":"1000331001","1000331004":"1000331006"},"groups":{"10013":"10011","10014":"10014"}}
      

      After the import is finished, the asset ownership changes to:

      asset1 - belongs to admin and group1
      asset2 - belongs to user3 and group2
      asset3 - belongs to user2 and group1
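
When user names match on both systems, the users part of a user_map can be derived mechanically and then edited for the exceptions. The following is a minimal sketch, assuming two hypothetical text files that each list one "name id" pair per line for the source and target systems. The function name build_users_map and the file format are illustrative and are not part of cpd-cli.

```shell
# build_users_map SOURCE_FILE TARGET_FILE
# Each file holds "name id" pairs, one per line (a hypothetical format).
# Prints a JSON object that maps each source ID to the target ID of the
# user with the same name. Names that are missing on the target are skipped.
build_users_map() {
  awk '
    BEGIN { printf "{" }
    NR == FNR { target[$1] = $2; next }   # first file read: target name to ID
    ($1 in target) {                       # second file read: source name to ID
      printf "%s\"%s\":\"%s\"", sep, $2, target[$1]
      sep = ","
    }
    END { printf "}\n" }
  ' "$2" "$1"
}
```

With the SystemA and SystemB users from the example, the function prints {"1000330992":"1000330993","1000330991":"1000330995","1000331004":"1000331006"}. To give asset2 to user3 as in the example, change the target of 1000330991 to 1000331001 before you add the map to the import specification.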

Importing catalog assets

To import catalog assets to the IBM Knowledge Catalog service, complete the following tasks. If you want to import from an export created on a different cluster, copy the exported archive to the current cluster, and log in to the current cluster.

Attention:

When you are importing a container package that was exported from another cluster, you might need to configure permissions to access the files within the archive. To verify whether configuration is needed, see Configuring permissions for cross cluster import operations.

Importing assets of one source container to multiple target containers is not permitted, while importing multiple source containers to one target container is allowed. This is to ensure that cross-container relationships are handled properly.

If the import fails, or if you want to reimport because you accidentally deleted some assets after the import, you must manually remove the previously imported assets from the catalog before you run the import again. Reimporting locally uploaded files, for example, a data asset with a local attachment, causes the assets to lose their attachments, and the assets won't work properly.

  1. Go to the directory where the cpd-cli command-line interface is installed.

  2. Upload the exported package:

    cpd-cli export-import export upload -n ${PROJECT_CPD_INST_OPERANDS} \
    -f ${WKC_ASSET_EXPORT} \
    --profile=${CPD_PROFILE_NAME}
    
  3. Set the IMPORT_NAME environment variable to the name that you want to use to identify the import job:

    export IMPORT_NAME=<name>
    
  4. Run the appropriate import command for your scope:

    • If you want to import all assets from the export file, use the following command:
      cpd-cli export-import import create ${IMPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
       --from-export ${EXPORT_NAME} \
       --backoff-limit=0 \
       --profile=${CPD_PROFILE_NAME}
      
    • If you want to select which assets to import, set OVERRIDE_YAML_FILE to the fully qualified name of the YAML file created in Selecting the scope of the import, then use the following command:
      cpd-cli export-import import create ${IMPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
        --from-export ${EXPORT_NAME} \
        --values ${OVERRIDE_YAML_FILE} \
        --backoff-limit=0 \
        --profile=${CPD_PROFILE_NAME}
      
  5. Wait for the import to complete. You can check the import status by using this command:

    cpd-cli export-import import status ${IMPORT_NAME} -n ${PROJECT_CPD_INST_OPERANDS} \
    --profile=${CPD_PROFILE_NAME}
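
The two import commands in step 4 differ only in the --values option, so a small wrapper can choose between them. The following is a minimal sketch that uses only the options shown in this topic. The function name run_catalog_import is hypothetical, and treating an unset or empty OVERRIDE_YAML_FILE as a request to import all assets is an assumption of this sketch.

```shell
# run_catalog_import: add --values only when an override file is given.
# Assumes IMPORT_NAME, EXPORT_NAME, PROJECT_CPD_INST_OPERANDS, and
# CPD_PROFILE_NAME are set. OVERRIDE_YAML_FILE is optional.
run_catalog_import() {
  # Build the argument list in the positional parameters of the function.
  set -- export-import import create "${IMPORT_NAME}" \
    -n "${PROJECT_CPD_INST_OPERANDS}" \
    --from-export "${EXPORT_NAME}"

  if [ -n "${OVERRIDE_YAML_FILE:-}" ]; then
    # Stop early if the import specification file is missing.
    if [ ! -f "${OVERRIDE_YAML_FILE}" ]; then
      echo "override file not found: ${OVERRIDE_YAML_FILE}" >&2
      return 1
    fi
    set -- "$@" --values "${OVERRIDE_YAML_FILE}"
  fi

  cpd-cli "$@" --backoff-limit=0 --profile="${CPD_PROFILE_NAME}"
}
```

Failing before cpd-cli runs when the file is missing avoids creating an import job with an unintended scope.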
    
