Installing OpenSearch and migrating Elasticsearch data

If your Cloud Pak for Business Automation deployment includes Elasticsearch, Elasticsearch is replaced by OpenSearch when you upgrade the Cloud Pak for Business Automation operators. As part of your upgrade preparation, if you use Business Automation Insights, assess how long it takes to migrate the data that you have in Elasticsearch.

Before you begin

A migration script is provided that supports different data migration strategies. These strategies help you to minimize the downtime of your production system. Select an appropriate migration strategy before you begin the full upgrade process.
Important: It is recommended to first perform the upgrade in a test environment with similar Business Automation Insights document volumes to understand the performance implications of your data loads.

About this task

This task migrates documents from the internal Elasticsearch instance to a new OpenSearch instance.

Important:
  • You can skip these steps if you do not have Business Automation Insights or Business Automation Workflow components installed.
  • If you do not want to keep any of your Business Automation Insights data and Business Automation Workflow saved searches, you can skip these steps.
  • If you have a sizeable volume of Business Automation Insights data, it is highly recommended that you fully review the procedures and test them in a test environment with similar hardware and network performance.

You can use this testing to better understand your expected migration speeds and to inform your premigration decisions.

Procedure

  1. Download cert-kubernetes to a Linux®-based machine (RHEL) or a macOS machine by using Git.

    All the installation and upgrade artifacts are contained in the cert-kubernetes repository. It is recommended to use the latest interim fix of the release, but if you need a previous version, you can find all versions in the Cloud Pak for Business Automation download document.

    Tip: Use the move right arrow in the 24.0.0 release to find all the available interim fixes.

    To download the cert-kubernetes repository:

    1. Open the Cloud Pak for Business Automation download document, find the card for the latest 24.0.0 interim fix, click Cert Kubernetes, and then select and copy the displayed command.
    2. Run the copied git clone command to download the files.
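      The exact clone command is provided in the download document. As a minimal sketch of what it might look like, assuming the public icp4a/cert-kubernetes repository and a 24.0.0 branch name (verify both against the copied command):
      git clone -b 24.0.0 https://github.com/icp4a/cert-kubernetes.git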
  2. Log in to the target cluster from a client.
    oc login https://<CLUSTERIP>:<port> -u <ADMINISTRATOR>
  3. Change the directory to the scripts folder under cert-kubernetes/scripts.
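    For example, from the parent directory where you cloned the repository:
    cd cert-kubernetes/scripts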
  4. Run the cp4a-deployment.sh script from cert-kubernetes in the upgradeOperator mode to install OpenSearch.
    export CP4BA_NS=<cp4ba>
    ./cp4a-deployment.sh -m upgradeOperator -n $CP4BA_NS

    Where <cp4ba> is the target namespace.

    1. If the script detects Elasticsearch and Business Automation Insights, the script asks whether you completed the migration from Elasticsearch to OpenSearch. Select No.
      Note: If the script does not detect Elasticsearch usage, then ensure that you reviewed the note at the beginning of this procedure. You can skip these steps if OpenSearch migration doesn't apply to your environment.
    2. If the script detects Elasticsearch but no Business Automation Insights component, the script asks whether you plan to use Process Federation Server after the upgrade and whether you want to migrate from Elasticsearch to OpenSearch (Yes/No). Select No.
    3. Select Yes to the question on whether to install OpenSearch.
    4. The script detects which namespace Cloud Pak foundational services is installed in.

      The script displays that Cloud Pak foundational services is migrated from cluster-scoped to namespace-scoped and asks whether you want to install OpenSearch in the global catalog namespace or a private catalog namespace. Select Yes to the question on whether you want to install OpenSearch by using a private catalog.

    5. Select either small, medium, or large as the deployment profile for OpenSearch.
      After you select the deployment profile, the script takes the following actions:
      • Creates the ibm-cs-opensearch-catalog CatalogSource.
      • Creates the ibm-elasticsearch-operator Subscription.
      • Creates the opensearch-tls-issuer Issuer.
      • Creates the OpenSearch cluster pod.
      • Creates the OpenSearch Route and provides URL access to the OpenSearch service.
        Attention: The script provides a list of Next Actions as a set of steps in the output. However, review this documentation for the required steps.
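      After the script completes these actions, you can optionally confirm that the OpenSearch operator and cluster pods are running and that the Route was created. A minimal check, assuming that OpenSearch is installed in the Cloud Pak for Business Automation namespace ($CP4BA_NS, set in the next step if not already exported):
      oc get pods -n $CP4BA_NS | grep -i opensearch
      oc get route -n $CP4BA_NS | grep -i opensearch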
  5. The script ends with the Next Actions that you need to perform manually. To initiate the data migration, set the environment variables by running all the commands that the cp4a-deployment.sh script output shows during the OpenSearch installation.
    The following variables provide an example of the output. The values for your cluster are included in the output.
    Note: Make sure that you include the port number (443) for the Elasticsearch URL when you are using the commands from the script output.
    export ELASTICSEARCH_URL=https://elasticsearch.abc.com:443
    export ELASTIC_USERNAME="elasticsearch-admin"
    export ELASTIC_PASSWORD=$(kubectl get secret iaf-system-elasticsearch-es-default-user -n production --no-headers --ignore-not-found -o jsonpath={.data.password} | base64 -d)
    export OPENSEARCH_URL=https://opensearch.abc.com
    export OPENSEARCH_USERNAME="elastic"
    export OPENSEARCH_PASSWORD=$(kubectl get secret opensearch-ibm-elasticsearch-cred-secret -n production --no-headers --ignore-not-found -o jsonpath={.data.elastic} | base64 -d)

    Where abc.com is the cluster ID.
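    Optionally, before you continue, you can verify that both endpoints are reachable with the exported credentials. A minimal connectivity check that reuses the variables set above (each command returns basic cluster information):
    curl -X GET -u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD} --insecure "${ELASTICSEARCH_URL}"
    curl -X GET -u ${OPENSEARCH_USERNAME}:${OPENSEARCH_PASSWORD} --insecure "${OPENSEARCH_URL}"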

  6. Run the following curl commands to evaluate how many documents you want to migrate.
    All indexes
    curl -X GET -u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD} --insecure "${ELASTICSEARCH_URL}/_cat/indices?v&s=docs.count:desc,index"
    Indexes available for premigration procedures
    curl -X GET -u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD} --insecure "${ELASTICSEARCH_URL}/_cat/indices?v&s=docs.count:desc,index" | grep -v -e "store" -e "pfs" -e "active"
    Note: If you can determine the proper Elasticsearch variable values from step 5, you can run these curl commands before you start the upgrade.
  7. If you are using Business Automation Insights and have over 100,000 documents in the indexes available for premigration, then it is recommended to review the premigration procedures. For more information, see Premigrating Business Automation Insights data to OpenSearch.

    Continue with these steps after you complete the premigration of Business Automation Insights index data, or if you choose not to do the premigration.
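    To check your totals against the 100,000-document guideline, you can sum the document counts of the indexes that qualify for premigration. A hedged sketch that reuses the filter from step 6:
    curl -s -X GET -u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD} --insecure "${ELASTICSEARCH_URL}/_cat/indices?h=index,docs.count" | grep -v -e "store" -e "pfs" -e "active" | awk '{sum += $2} END {print sum}'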

  8. If any of your Content Platform Engine Object Stores are enabled to emit Business Automation Insights events, disable the event emitter. For example, Object Stores that were created by using the CR with the setting oc_cpe_obj_store_enable_content_event_emitter: true or Object Stores that have the subscription ContentEventEmitterSubscription.
    1. Log in to the Administration Console for Content Platform Engine.
    2. Go to Object Stores > object store name > Events, Actions, Processes > Subscriptions.
    3. Click ContentEventEmitterSubscription or the name of the existing subscription that is used by the Content event emitter.
    4. Click the Properties tab.
    5. Under Property Name, for the row Is Enabled, click the Property Value drop-down and select False.
    6. Click Save.
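    If you are not sure which Object Stores have the emitter enabled, you can check the custom resource for the setting. A hedged check, assuming that your ICP4ACluster custom resource is named icp4adeploy (adjust the name to match your deployment):
    oc get ICP4ACluster icp4adeploy -n $CP4BA_NS -o yaml | grep -B 2 oc_cpe_obj_store_enable_content_event_emitter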
  9. Run the curl command that the script shows to create Flink savepoints and to stop the Flink jobs.
    The following command is an example.
    curl -X POST -k -u management-admin:ajvDDkRCcCtcOLid "https://cpd-olm-test.apps.cp4ba-svl-shared-olm.cp.fyre.ibm.com/bai-management/api/v1/processing/jobs/savepoints?cancel-job=true" -o /tmp/tmp_bai.json && [ "$(cat /tmp/tmp_bai.json)" != "[]" ] && /usr/bin/cp -rf /tmp/tmp_bai.json /home/cert-kubernetes/scripts/cp4ba-upgrade/project/olm-test/custom_resource/bai.json || rm -rf /tmp/tmp_bai.json
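    You can inspect the JSON file that the command copies to the custom_resource directory to confirm that savepoint paths were recorded for the Flink jobs (the command writes the file only when the savepoint response is not empty). For example, with the path from the sample command above (use the path from your own script output):
    cat /home/cert-kubernetes/scripts/cp4ba-upgrade/project/olm-test/custom_resource/bai.json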
  10. Run the commands that the script shows to scale down the Business Performance Center (BPC) to prevent the generation of dirty data during the migration.

    The following commands are an example.

    oc scale --replicas=0 deployment.apps/ibm-insights-engine-operator -n $CP4BA_NS
    oc scale --replicas=0 deployment.apps/icp4adeploy-insights-engine-cockpit -n $CP4BA_NS
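    To confirm that both deployments are scaled down before you start the migration, you can check that their replica counts are 0. For example, with the deployment names from the sample commands above:
    oc get deployment.apps/ibm-insights-engine-operator deployment.apps/icp4adeploy-insights-engine-cockpit -n $CP4BA_NS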
  11. Migrate the Elasticsearch documents depending on your premigration choices.
    • If you followed the premigration procedures.

      Run the following commands to migrate the remaining data.

      echo $PREMIGRATION_END_DATE
      ./es-os-migration-script.sh -startdate=$PREMIGRATION_END_DATE -exclude_regex=active,icp4ba-bai-store,icp4ba-pfs@
      ./es-os-migration-script.sh -include_regex=active,icp4ba-bai-store,ibmpfssavedsearches
    • If you skipped the premigration procedures.
      1. Change the directory to the migration folder under cert-kubernetes/scripts.
        cd cpfs/migration
      2. Run the following command to migrate Business Automation Workflow saved searches only.
        ./es-os-migration-script.sh -include_regex=ibmpfssavedsearches
      3. Run the following command to migrate Business Automation Insights data.
        ./es-os-migration-script.sh -exclude_regex=icp4ba-pfs@*
        Note: No icp4ba-pfs@* indexes are intended to be migrated except for the saved searches.
  12. Run the following command to verify that all Business Automation Insights and saved searches indexes are migrated.
    curl -X GET -u ${OPENSEARCH_USERNAME}:${OPENSEARCH_PASSWORD} --insecure "${OPENSEARCH_URL}/_cat/indices?v&s=docs.count:desc,index"
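    To spot-check an individual index, you can compare its document count on both clusters. A minimal sketch, where <index-name> is one of your migrated indexes:
    curl -X GET -u ${ELASTIC_USERNAME}:${ELASTIC_PASSWORD} --insecure "${ELASTICSEARCH_URL}/<index-name>/_count"
    curl -X GET -u ${OPENSEARCH_USERNAME}:${OPENSEARCH_PASSWORD} --insecure "${OPENSEARCH_URL}/<index-name>/_count"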

What to do next

The migration process can take minutes, days, or weeks, depending on your Elasticsearch data volume. After you complete the migration, go to Upgrading the CP4BA operators from 23.0.2.