Upgrading IBM Knowledge Catalog from Version 4.8 to Version 5.1

An instance administrator can upgrade IBM Knowledge Catalog from IBM Cloud Pak® for Data Version 4.8 to IBM Software Hub Version 5.1.

Who needs to complete this task?

Instance administrator

To upgrade IBM Knowledge Catalog, you must be an instance administrator. An instance administrator has permission to manage software in the following projects:

The operators project for the instance

The operators for this instance of IBM Knowledge Catalog are installed in the operators project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERATORS} environment variable refers to the operators project.

The operands project for the instance

The custom resources for the control plane and IBM Knowledge Catalog are installed in the operands project. In the upgrade commands, the ${PROJECT_CPD_INST_OPERANDS} environment variable refers to the operands project.

When do you need to complete this task?

Review the following options to determine whether you need to complete this task:

  • If you want to upgrade the IBM Software Hub control plane and one or more services at the same time, follow the process in Upgrading an instance of IBM Software Hub instead.
  • If you didn't upgrade IBM Knowledge Catalog when you upgraded the IBM Software Hub control plane, complete this task to upgrade IBM Knowledge Catalog.

    Repeat as needed: If you are responsible for multiple instances of IBM Software Hub, you can repeat this task to upgrade more instances of IBM Knowledge Catalog on the cluster.

Information you need to complete this task

Review the following information before you upgrade IBM Knowledge Catalog:

Version requirements

All the components that are associated with an instance of IBM Software Hub must be installed at the same release. For example, if the IBM Software Hub control plane is at Version 5.1.3, you must upgrade IBM Knowledge Catalog to Version 5.1.3.
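
Because this task starts from a Version 4.8 environment, the VERSION variable in an older environment variables script might still point at a 4.8 release. The following is a minimal sketch, not part of the product tooling, of a guard that accepts only 5.1.x values. The function name is_51_release is hypothetical.

```shell
# Hypothetical helper: succeed only if the argument looks like a 5.1.x release.
is_51_release() {
  case "$1" in
    5.1.[0-9] | 5.1.[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: warn if VERSION is unset or is not a 5.1.x release.
if ! is_51_release "${VERSION:-}"; then
  echo "VERSION='${VERSION:-}' is not a 5.1.x release" >&2
fi
```

The check only inspects the format of the value. It does not confirm that the value matches the release of the control plane, which you still need to verify yourself.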

Environment variables
The commands in this task use environment variables so that you can run the commands exactly as written.
  • If you do not have the script that defines the environment variables, see Setting up installation environment variables.
  • To use the environment variables from the script, you must source the environment variables before you run the commands in this task. For example, run:
    source ./cpd_vars.sh
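
After you source the script, it can help to confirm that the variables used in this task are actually set. The following sketch assumes a POSIX shell. The function name check_upgrade_vars is hypothetical, and the variable names passed to it are the ones that appear in the commands in this task.

```shell
# Hypothetical helper: report any named environment variables that are unset or empty.
check_upgrade_vars() {
  missing=""
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible indirection).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing variables:$missing" >&2
    return 1
  fi
  echo "All required variables are set."
}
```

For example, you might run check_upgrade_vars PROJECT_CPD_INST_OPERATORS PROJECT_CPD_INST_OPERANDS VERSION before you continue.
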
Common core services
IBM Knowledge Catalog requires the IBM Software Hub common core services.

If the common core services are not at the correct version in the operands project for the instance, the common core services are automatically upgraded when you upgrade IBM Knowledge Catalog. The common core services upgrade increases the amount of time the upgrade takes to complete.

Before you begin

This task assumes that the following prerequisites are met:

Prerequisite Where to find more information
The cluster meets the minimum requirements for IBM Knowledge Catalog. If this task is not complete, see System requirements.
The workstation from which you will run the upgrade is set up as a client workstation and has the following command-line interfaces:
  • IBM Software Hub CLI: cpd-cli
  • OpenShift® CLI: oc
If this task is not complete, see Updating client workstations.
The IBM Software Hub control plane is upgraded. If this task is not complete, see Upgrading an instance of IBM Software Hub.
For environments that use a private container registry, such as air-gapped environments, the IBM Knowledge Catalog software images are mirrored to the private container registry. If this task is not complete, see Mirroring images to a private container registry.
For environments that use a private container registry, such as air-gapped environments, the cpd-cli is configured to pull the olm-utils-v3 image from the private container registry. If this task is not complete, see Pulling the olm-utils-v3 image from the private container registry.
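
As a quick check of the client workstation prerequisite, the following sketch tests whether the required command-line interfaces are on the PATH. The function name require_clis is hypothetical. The check does not verify CLI versions or cluster access.

```shell
# Hypothetical helper: confirm that each named command is available on the PATH.
require_clis() {
  status=0
  for cli in "$@"; do
    if command -v "$cli" >/dev/null 2>&1; then
      echo "found: $cli"
    else
      echo "missing: $cli" >&2
      status=1
    fi
  done
  return "$status"
}
```

For this task, you would call require_clis cpd-cli oc and resolve any missing entries before you start the upgrade.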

Procedure

Complete the following tasks to upgrade IBM Knowledge Catalog:

  1. Specifying your IBM Knowledge Catalog edition
  2. Specifying installation options
  3. Reverting temporary patches
  4. Upgrading the service
  5. Validating the upgrade
  6. What to do next

Specifying your IBM Knowledge Catalog edition

You must specify which edition you want to upgrade.

Set the IKC_TYPE environment variable to the edition of IBM Knowledge Catalog that you want to upgrade:

IBM Knowledge Catalog
export IKC_TYPE=wkc

Specifying installation options

When you upgrade IBM Knowledge Catalog, the options that you specified when you installed IBM Knowledge Catalog are used.

Specify the following options in the install-options.yml file in the work directory only if you want to modify the behavior of IBM Knowledge Catalog.

################################################################################
# IBM Knowledge Catalog parameters
################################################################################
custom_spec:
  wkc:
#    enableDataQuality: False
#    enableSemanticAutomation: False
#    enableKnowledgeGraph: False
#    useFDB: False
Property Description
enableDataQuality
Specify whether to enable data quality features in projects.
Important: If you enable this feature, DataStage, specifically DataStage Enterprise, is automatically installed.

If you did not purchase a DataStage license, use of DataStage Enterprise is limited to creating, managing, and running data quality rules. For examples of accepted use, see Enabling optional features after installation or upgrade for IBM Knowledge Catalog.

Editions the setting applies to
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
Default value
False
Valid values
False
Do not enable the data quality feature.
True
Enable the data quality feature.
enableSemanticAutomation
Specify whether to enable gen AI based enrichment features in projects.
Editions the setting applies to
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
Default value
False
Valid values
False
Do not enable the gen AI based enrichment features.
True
Enable the gen AI based enrichment features.
Important: If you enable this feature, the inference foundation models component (watsonx_ai_ifm) is automatically installed.

This option requires at least one GPU. For information about supported GPUs, see the Hardware requirements.

enableKnowledgeGraph
Specify whether to enable the knowledge graph feature. The knowledge graph provides the following capabilities:
  • Relationship explorer and business term relationship search
  • Lineage
Important: The preceding features are not available in all environments. For more information, see Which knowledge graph features are available in my environment?
Editions the setting applies to
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
Default value
False
Valid values
False
Do not enable the knowledge graph feature.
True
Enable the knowledge graph feature.

If you set enableKnowledgeGraph: True, review the useFDB property and the section Which knowledge graph features are available in my environment?

useFDB
If you set enableKnowledgeGraph: True, specify which database to use to store the data that is generated by the knowledge graph.

The database depends on which features you want to use and whether you are installing or upgrading IBM Knowledge Catalog.

Default value
False
Valid values
False
Do not use FoundationDB. Use Neo4j.

5.1.1 and later If FoundationDB is already installed, you cannot set useFDB to false when you upgrade IBM Knowledge Catalog.

True
Use FoundationDB.

5.1.1 and later If FoundationDB is already installed, useFDB is automatically set to true when you upgrade IBM Knowledge Catalog.

Which knowledge graph features are available in my environment?
Use the following tables to determine which knowledge graph features will be available in your environment when you set enableKnowledgeGraph: True:
New installations

Lineage: Not available if you do not install a lineage service.
Relationship explorer: Available.
Required settings: Manually set:
enableKnowledgeGraph: True
useFDB: True

Lineage: Available if you install MANTA Automated Data Lineage.
Relationship explorer: Available.
Required settings: Manually set:
enableKnowledgeGraph: True
useFDB: True

Lineage: Available if you install IBM Manta Data Lineage.
Relationship explorer:
  • 5.1.0 Not available when IBM Manta Data Lineage is installed.
  • 5.1.1 and later Available.
Required settings: Manually set:
enableKnowledgeGraph: True

Upgrades

Lineage: Not available. No lineage service is installed.
Relationship explorer: Available.
Required settings:
  • 5.1.0 Manually set:
    enableKnowledgeGraph: True *
    useFDB: True
  • 5.1.1 and later Manually set:
    enableKnowledgeGraph: True *

Lineage: MANTA Automated Data Lineage (existing)
Relationship explorer: Available.
Required settings:
  • 5.1.0 Manually set:
    enableKnowledgeGraph: True *
    useFDB: True
  • 5.1.1 and later Manually set:
    enableKnowledgeGraph: True *

Lineage: MANTA Automated Data Lineage (adding during upgrade)
Relationship explorer: Available.
Required settings:
  • 5.1.0 Manually set:
    enableKnowledgeGraph: True
    useFDB: True
  • 5.1.1 and later Manually set:
    enableKnowledgeGraph: True

Lineage: IBM Manta Data Lineage (replacing MANTA Automated Data Lineage after upgrade)
Relationship explorer:
  • 5.1.0 Not available.
  • 5.1.1 and later Available.
Required settings: Settings are changed by patching the custom resource after upgrade. For more information, see Migrating from MANTA Automated Data Lineage to IBM Manta Data Lineage.

Lineage: IBM Manta Data Lineage (adding during upgrade)
Relationship explorer:
  • 5.1.0 Not available.
  • 5.1.1 and later Available.
Required settings:
  • 5.1.0 Manually set:
    enableKnowledgeGraph: True
    useFDB: False
  • 5.1.1 and later Manually set:
    enableKnowledgeGraph: True

* You can omit this option if knowledge graph is already enabled.
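
To illustrate how the settings from the preceding tables end up in the install-options.yml file, the following sketch writes a file for one scenario: an upgrade to 5.1.0 with the knowledge graph enabled and either no lineage service or MANTA Automated Data Lineage. The function name write_ikc_options and the directory argument are hypothetical. The sketch refuses to overwrite an existing file because your file might already contain options for other services.

```shell
# Hypothetical helper: create install-options.yml in the directory given as $1.
write_ikc_options() {
  target="$1/install-options.yml"
  if [ -e "$target" ]; then
    echo "Refusing to overwrite existing $target" >&2
    return 1
  fi
  # Same structure as the documented sample, with two options uncommented.
  cat > "$target" <<'EOF'
custom_spec:
  wkc:
    enableKnowledgeGraph: True
    useFDB: True
EOF
  echo "Wrote $target"
}
```

Pass the path of your cpd-cli work directory as the argument. For other scenarios, adjust the option values according to the tables.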

Reverting temporary patches

If you applied any patches to your current installation of IBM Knowledge Catalog, check the patch instructions for cleanup steps and complete them before you start the upgrade. The cleanup steps will be similar to the ones in this example: Installing the patch for version 5.0.3 in the IBM Cloud Pak for Data 5.0 documentation.

Upgrading the service

Important: The Operator Lifecycle Manager (OLM) objects for IBM Knowledge Catalog were updated when you upgraded the IBM Software Hub platform. The cpd-cli manage apply-olm command updates all of the OLM objects in the operators project at the same time.

To upgrade IBM Knowledge Catalog:

  1. Log the cpd-cli in to the Red Hat® OpenShift Container Platform cluster:
    ${CPDM_OC_LOGIN}
    Remember: CPDM_OC_LOGIN is an alias for the cpd-cli manage login-to-ocp command.
  2. Update the custom resource for IBM Knowledge Catalog.

    Run the appropriate command to update the custom resource.

    Default installation (without installation options)
    cpd-cli manage apply-cr \
    --components=${IKC_TYPE} \
    --release=${VERSION} \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --license_acceptance=true \
    --upgrade=true
    Custom installation (with installation options)
    cpd-cli manage apply-cr \
    --components=${IKC_TYPE} \
    --release=${VERSION} \
    --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
    --param-file=/tmp/work/install-options.yml \
    --license_acceptance=true \
    --upgrade=true
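
If you want to review the exact command after variable substitution before you run it, the following sketch prints the apply-cr command instead of executing it. The function name print_apply_cr is hypothetical. It uses only the flags shown in the preceding commands and stops with an error if a required variable is not set.

```shell
# Hypothetical helper: print the apply-cr upgrade command for review.
# $1 (optional): path to install-options.yml as seen by cpd-cli,
#                for example /tmp/work/install-options.yml
print_apply_cr() {
  param_file="${1:-}"
  printf '%s' "cpd-cli manage apply-cr --components=${IKC_TYPE:?IKC_TYPE is not set} --release=${VERSION:?VERSION is not set} --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS:?PROJECT_CPD_INST_OPERANDS is not set}"
  if [ -n "$param_file" ]; then
    printf ' %s' "--param-file=${param_file}"
  fi
  printf ' %s\n' "--license_acceptance=true --upgrade=true"
}
```

Call print_apply_cr with no argument for the default installation, or with the path of the options file for a custom installation.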

Validating the upgrade

IBM Knowledge Catalog is upgraded when the apply-cr command returns:
[SUCCESS]... The apply-cr command ran successfully

If you want to confirm that the custom resource status is Completed, you can run the cpd-cli manage get-cr-status command:

cpd-cli manage get-cr-status \
--cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} \
--components=${IKC_TYPE}
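
If you prefer to wait for the status rather than check it once, the following sketch polls a command until its output contains the word Completed. The function name wait_for_completed is hypothetical, and the sketch assumes that the status command prints the literal text Completed when the custom resource is ready. The number of attempts and the delay are arbitrary values that you choose.

```shell
# Hypothetical helper: rerun a command until its output contains "Completed".
# $1: maximum number of checks, $2: seconds to wait between checks, rest: the command to run
wait_for_completed() {
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" 2>/dev/null | grep -q 'Completed'; then
      echo "Status is Completed."
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "Status did not reach Completed after $attempts checks." >&2
  return 1
}
```

For example, wait_for_completed 30 60 cpd-cli manage get-cr-status --cpd_instance_ns=${PROJECT_CPD_INST_OPERANDS} --components=${IKC_TYPE} would check up to 30 times, one minute apart.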

Upgrading IBM Knowledge Catalog with Knowledge Graph or MANTA Automated Data Lineage enabled

5.1.1 If you are upgrading IBM Knowledge Catalog to Version 5.1.1 or 5.1.0 and you have Knowledge Graph enabled, MANTA Automated Data Lineage deployed, or both, you must set the useFDB flag to true in the install-options.yml file.

If you perform the upgrade while the useFDB flag is set to false, you will encounter issues with the upgrade and with the backup and restore processes.

Before you upgrade, check whether the useFDB flag is set to true or false by running the following command:
oc get wkc wkc-cr -o jsonpath='{.spec.useFDB}'
If useFDB is set to true, you can continue with the upgrade.
If you already upgraded IBM Knowledge Catalog and the useFDB flag was not set to true, complete the following workaround:
  1. Patch the IBM Knowledge Catalog custom resource (CR) for your edition to set the useFDB flag to true.
  2. Delete the IBM Neo4J CR by using the delete-cr command. Otherwise, issues will occur with the backup and restore process.
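
The check and the first workaround step can be sketched as follows. The function name use_fdb_is_true is hypothetical. The commented oc patch command is an assumption about how the flag could be set with a merge patch on the wkc-cr resource named in the preceding command. Confirm the resource name and value format for your edition before you run it. The delete-cr step is not shown because the required component name is not given here.

```shell
# Hypothetical helper: interpret the value returned for .spec.useFDB.
# An empty value (field not set) is treated as false, which matches the documented default.
use_fdb_is_true() {
  case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
    true) return 0 ;;
    *) return 1 ;;
  esac
}

# Example wiring (requires a logged-in oc session, so it is left commented out):
# current="$(oc get wkc wkc-cr -n "${PROJECT_CPD_INST_OPERANDS}" -o jsonpath='{.spec.useFDB}')"
# if ! use_fdb_is_true "$current"; then
#   # Assumption: a merge patch on spec.useFDB is an acceptable way to set the flag.
#   oc patch wkc wkc-cr -n "${PROJECT_CPD_INST_OPERANDS}" --type merge -p '{"spec":{"useFDB":true}}'
# fi
```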

What to do next

Complete the following tasks before users can access the service:
  • Manually upgrade the instance for Analytics Engine powered by Apache Spark. The Analytics Engine powered by Apache Spark service is upgraded automatically, but the instance is not.
  • Run the global search bulk sync utility:
  • Migrate existing profiling results as described in Migrating profiling results.
  • If you upgraded from 4.8.0, 4.8.1, 4.8.2, or 4.8.3, you can optionally reorganize the database tables that hold the output of data quality checks by running the following commands:
    oc exec -n ${PROJECT_CPD_INST_OPERANDS} -it c-db2oltp-wkc-db2u-0 -- bash -c "export PATH=/mnt/blumeta0/home/db2inst1/sqllib/bin:\$PATH; db2 connect to lineage; db2 'REORG TABLE DATAQUALITY.\"dq_issue\"'; db2 'REORG TABLE DATAQUALITY.\"dq_referenced_asset\"'; db2 'REORG TABLE DATAQUALITY.\"dq_score\"'; db2 terminate;"
  • New features can cause an increased load on Db2. Consider scaling up Db2 as described in Scaling up Db2 for IBM Knowledge Catalog.
  • If you enabled gen AI based enrichment during the upgrade (enableSemanticAutomation: True), you must also set batch sizes for processing large tables. For details, see Gen AI based enrichment capabilities.

After you complete these tasks, the service is ready to use. For more details, see Applying data governance.