Training with local data

To learn about cloud native analytics, you can install a local data set. Learn how to install and load your local data, train the system, and see the results.

Before you begin

Before you complete these steps, ensure that the following prerequisites are met:
  • The ea-events-tooling container is installed by the operator. It is not started as a pod; it contains scripts that install data on the system, which you can run with the kubectl run command.
  • Find the values of image and image_tag for the ea-events-tooling container, from the output of the following command:
    kubectl get noi <release_name> -o yaml | grep ea-events-tooling
    Where <release_name> is the custom resource release name of your cloud deployment. For example, in the following output, image is ea-events-tooling, and image_tag is 2.0.14-20200120143838GMT.
    kubectl get noi <release_name> -o yaml | grep ea-events-tooling
        --env=CONTAINER_IMAGE=image-registry.openshift-image-registry.svc:5000/default/ea-events-tooling:2.0.14-20200120143838GMT \
        --image=image-registry.openshift-image-registry.svc:5000/default/ea-events-tooling:2.0.14-20200120143838GMT \
    Hybrid deployment: For a hybrid deployment, run the following command:
    kubectl get noihybrid <release_name> -o yaml | grep ea-events-tooling
    Where <release_name> is the custom resource release name of your hybrid deployment.
    IBM® Netcool® for AIOps deployment: For an online or offline deployment (airgap) of Netcool Operations Insight® with IBM Cloud Pak for AIOps, find the values of image and image_tag from the noi-operator CSV file. Run the following command:
    oc get csv <noi-operator> -o yaml | grep olm.relatedImage.NOI_ea-events-tooling: | awk -F ': ' '{print $2}'
    Where <noi-operator> is the noi-operator CSV file name.
  • Determine the HTTP username and password from the secret that has systemauth in the name, by running the following command:
    kubectl get secret release_name-systemauth-secret -o yaml
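The username and password values in the secret data are base64-encoded. A minimal sketch of decoding them, using illustrative encoded strings rather than values from a real cluster:

```shell
# The values below are illustrative placeholders, not real credentials;
# in practice they come from the data section of the systemauth secret
# returned by the kubectl get secret command above.
ENCODED_USERNAME="aWNwYWRtaW4="   # base64 for "icpadmin" (example only)
ENCODED_PASSWORD="cGFzc3cwcmQ="   # base64 for "passw0rd" (example only)

# base64 --decode reverses the encoding that Kubernetes applies to
# secret data fields.
USERNAME=$(echo "$ENCODED_USERNAME" | base64 --decode)
PASSWORD=$(echo "$ENCODED_PASSWORD" | base64 --decode)
echo "$USERNAME"
echo "$PASSWORD"
```

Step 1 of the procedure shows the same decode performed in one line with kubectl and a jsonpath template.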
  • If you created your own Docker registry secret, patch your service account:
    kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "noi-registry-secret"}]}'
    Where noi-registry-secret is the name of the secret for accessing the Docker repository.
    Note: As an alternative to patching the default service account with image pull secrets, you can add the following option to each kubectl command that you issue:
    --overrides='{ "apiVersion": "v1", "spec": { "imagePullSecrets":
          [{"name": "evtmanager-registry-secret"}] } }'

About this task

You can use scripts in the ea-events-tooling container to install local data on the system. To complete this task, run the filetoingestionservice.sh, getTimeRange.sh, runTraining.sh, createPolicy.sh, and filetonoi.sh scripts. The scripts load local data to the ingestion service, train the system for seasonality and temporal events, create live seasonal policies and suggest temporal policies, and load the data into cloud native analytics.

Procedure

  1. Send local data to the ingestion service. Run the filetoingestionservice.sh script:
    export RELEASE=release_name
    export HTTP_PASSWORD=$(kubectl get secret $RELEASE-systemauth-secret -o jsonpath --template '{.data.password}' | base64 --decode)
    export HTTP_USERNAME=$(kubectl get secret $RELEASE-systemauth-secret -o jsonpath --template '{.data.username}' | base64 --decode)
    
    cat mydata.json.gz | kubectl run ingesthttp -i --restart=Never --image=image:image_tag --env=INPUT_FILE_NAME=stdin --env=LICENSE=accept --env=HTTP_USERNAME=$HTTP_USERNAME --env=HTTP_PASSWORD=$HTTP_PASSWORD filetoingestionservice.sh $RELEASE
    Where:
    • release_name is the custom resource release name of your deployment.
    • mydata.json.gz is the path to your local compressed data file.
    • image is the location of the ea-events-tooling container. The image value can be found from the kubectl get noi command, as described earlier.
    • image_tag is the image version tag, as described earlier.
    • You can override the username and password by using HTTP_USERNAME and HTTP_PASSWORD.
    Note: If you specify the --env=INPUT_FILE_NAME=stdin parameter, you can send your local data to the scripts by using the -i option with the kubectl run command. This option connects your local standard input (here, the output of cat) to the stdin of the target pod.
  2. For a geo-redundant deployment, run the following command to limit the event ingestion rate from your local file.
    cat <MY_FILE> | kubectl run ingesthttp -i --restart=Never --env=EVENT_RATE_ENABLED=true \
      --env=EVENT_RATE_PSEC=10 --image=image:image_tag --env=INPUT_FILE_NAME=stdin \
      --env=LICENSE=accept --env=HTTP_USERNAME=<HTTP_USERNAME> \
      --env=HTTP_PASSWORD=<HTTP_PASSWORD> \
      filetoingestionservice.sh <release_name>
    
  3. Use the getTimeRange.sh script to calculate the training time range. If no time range is specified, the trainer trains against all rows for the tenant ID. To train against only the time range that your data covers, find the start and end timestamps of the data by running the following command:
    cat mydata.json.gz | kubectl run ingesnoi3 -i --restart=Never --env=LICENSE=accept --image-pull-policy=Always --image=image:image_tag getTimeRange.sh stdin
    Output similar to the following example is displayed:
    {"minfirstoccurence":{"epoc":1540962968226,"formatted":"2018-10-31T05:16:08.226Z"},"maxlastoccurrence":{"epoc":1548669553896,"formatted":"2019-01-28T09:59:13.896Z"}}
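The epoch values in this output are what step 4 accepts as start and end times. A sketch of extracting them for reuse, assuming python3 is available on your workstation (jq works equally well); the JSON is the sample output shown above:

```shell
# Sample output from getTimeRange.sh, copied from the step above.
RANGE='{"minfirstoccurence":{"epoc":1540962968226,"formatted":"2018-10-31T05:16:08.226Z"},"maxlastoccurrence":{"epoc":1548669553896,"formatted":"2019-01-28T09:59:13.896Z"}}'

# Extract the epoch millisecond values for use as the -s and -e
# arguments of runTraining.sh in the next step.
START=$(echo "$RANGE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["minfirstoccurence"]["epoc"])')
END=$(echo "$RANGE" | python3 -c 'import sys, json; print(json.load(sys.stdin)["maxlastoccurrence"]["epoc"])')
echo "start=$START end=$END"
```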
  4. Train the system with the new data. Run the runTraining.sh script:
    kubectl run trainer -it --command=true --restart=Never --env=LICENSE=accept --image=image:image_tag runTraining.sh -- -r release_name [-t tenantid] [-a algorithm] [-s start-time] [-e end-time] [-d auto-deploy]
    Where:
    • release_name is the custom resource release name of your deployment.
    • image is the location of the ea-events-tooling container. The image value can be found from the kubectl get noi command, as described earlier.
    • image_tag is the image version tag, as described earlier.
    • Optional: algorithm is either related-events or seasonal-events. If not specified, the default is related-events.
    • Optional: tenantid is the tenant ID associated with the data that is ingested, as specified by the global.common.eventanalytics.tenantId parameter in the values.yaml file that is associated with the operator.
    • Optional: start-time and end-time are the start and end times to train against. These values are provided in the command output from step 3. You can specify the start or end time, neither, or both. If neither are specified, the current time is used as the end time and the start time is 93 days before the end time. You can either specify the start and end times with an integer Epoch time format in milliseconds, or with the default date string formatting for the system. Run the ./runTraining.sh -h command to determine the default date formatting.
    • Optional: auto-deploy. Set to true to deploy policies immediately, or to false to review policies before deployment.
      Note: If Review first mode is enabled by setting the -d parameter to false, Seasonality policies cannot be activated from the Suggested policies tab. Only Temporal Groupings policies can be activated from the Suggested policies tab.
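Putting steps 3 and 4 together, the following is a sketch of a training run pinned to the time range of the ingested data. The release name, image, and timestamps are example values only; substitute your own:

```shell
# Example values only: substitute your release name, the image location
# and tag found earlier, and the epoch times reported by getTimeRange.sh.
RELEASE=evtmanager
IMAGE=image:image_tag
START=1540962968226
END=1548669553896

# Compose a seasonal-events training run over the exact time range of
# the ingested data, with policies held for review (-d false).
CMD="kubectl run trainer -it --command=true --restart=Never \
  --env=LICENSE=accept --image=$IMAGE runTraining.sh -- \
  -r $RELEASE -a seasonal-events -s $START -e $END -d false"
echo "$CMD"
```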
  5. Create a policy. A policy can be created through the UI, or you can specify a policy by running the createPolicy.sh script:
    export ADMIN_PASSWORD=$(kubectl get secret release_name-systemauth-secret -o jsonpath --template '{.data.password}'  | base64 --decode)
    kubectl run createpolicy  --restart=Never --image=image:image_tag --env=LICENSE=accept --env=ADMIN_PASSWORD=${ADMIN_PASSWORD} createPolicy.sh -- -r release_name
    Where:
    • release_name is the custom resource release name of your deployment.
    • image is the location of the ea-events-tooling container. The image value can be found from the kubectl get noi command, as described earlier.
    • image_tag is the image version tag, as described earlier.
    Note: This step creates a policy that maps the node to resource/name by default. If you want to map to resource/hostname or resource/ipaddress instead, specify the --env=CONFIGURATION_PROPERTIES=resource/hostname|ipaddress parameter.
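For example, the following sketch composes a createPolicy.sh invocation that maps the node to resource/hostname instead of the default resource/name. The release name and image are placeholders:

```shell
# Placeholder values: substitute your release name and the image
# location and tag found earlier.
RELEASE=evtmanager
IMAGE=image:image_tag

# The CONFIGURATION_PROPERTIES override switches the mapping from the
# default resource/name to resource/hostname. ADMIN_PASSWORD is left
# as a literal reference to be expanded at run time.
CMD="kubectl run createpolicy --restart=Never --image=$IMAGE \
  --env=LICENSE=accept --env=ADMIN_PASSWORD=\${ADMIN_PASSWORD} \
  --env=CONFIGURATION_PROPERTIES=resource/hostname \
  createPolicy.sh -- -r $RELEASE"
echo "$CMD"
```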
  6. Send local data to the ObjectServer with or without using Secure Sockets Layer (SSL). Choose one of the following methods.
    Send local data to the ObjectServer with SSL:

    For a hybrid deployment, you can use Secure Sockets Layer (SSL) to connect to the on-premises ObjectServer. The following script correctly mounts the SSL certificates.
    #!/bin/bash
    #set -x
    
    #
    # Start of section of variables that may be overridden
    #
    
    # TO DO, READ THE FOLLOWING FROM ARGS IN THIS SCRIPT...
    typeset TOOLING_IMAGE=${HYBRID_FILE_TO_OMNIBUS_TOOLING_IMAGE:-hyc-hdm-staging-docker-virtual.artifactory.swg-devops.com/ea/ea-events-tooling:latest}
    typeset NOI_RELEASE=${HYBRID_FILE_TO_OMNIBUS_NOI_RELEASE:-netcool}
    typeset ABSOLUTE_FILE_PATH=${HYBRID_FILE_TO_OMNIBUS_ABSOLUTE_FILE_PATH:-./testdata.json.gz}
    typeset OMNIBUS_USERNAME=${OMNIBUS_USERNAME:-root}
    typeset OMNIBUS_PASSWORD_SECRET_KEY_VALUE=${OMNIBUS_PASSWORD_SECRET_KEY_VALUE:-OMNIBUS_ROOT_PASSWORD}
    typeset CA_SECRET_NAME=${HYBRID_FILE_TO_OMNIBUS_CA_SECRET_NAME:-${NOI_RELEASE}-omni-certificate-secret}
    typeset OBJECTSERVER_PORT=${HYBRID_FILE_TO_OMNIBUS_OBJECTSERVER_PORT:-7111}
    typeset OBJECTSERVER_HOST=${HYBRID_FILE_TO_OMNIBUS_OBJECTSERVER_HOST:-hybrid1.xyz.com}
    
    # THESE ARE VARIABLES USED IN DETERMINING THE CA FILE NAME AND THE CONTAINER MOUNT DIRECTORY
    typeset MOUNT_SECRET_FILE_NAME=${HYBRID_FILE_TO_OMNIBUS_MOUNT_SECRET_FILE_NAME:-rootca}
    typeset MOUNT_SECRET_DIRECTORY=${HYBRID_FILE_TO_OMNIBUS_MOUNT_SECRET_DIRECTORY:-/ca}
    
    # THESE ARE VARIABLES THAT DECLARE WHICH KEY OF THE SECRET TO USE
    typeset CA_SECRET_CERTIFICATE_PROPERTY_KEY=${HYBRID_FILE_TO_OMNIBUS_CA_SECRET_CERTIFICATE_PROPERTY_KEY:-ROOTCA}
    
    # INTERNAL VOLUME CONFIGS
    typeset VOLUME_NAME=${HYBRID_FILE_TO_OMNIBUS_VOLUME_NAME:-ca}
    
    # IMAGE AND SERVICE ACCOUNT
    typeset DOCKER_REGISTRY_SECRET=${HYBRID_FILE_TO_OMNIBUS_DOCKER_REGISTRY_SECRET:-noi-registry-secret}
    typeset SERVICE_ACCOUNT_NAME=${HYBRID_FILE_TO_OMNIBUS_SERVICE_ACCOUNT_NAME:-noi-service-account}
    
    #
    # End of environment variables that may be overridden.
    #
    
    # THESE ARE VARIABLES USED IN GENERATING THE POD NAME
    typeset FORMATTED_DATE=$(date '+%Y%m%d%H%M%S')
    typeset POD_NAME=hybrid-file-to-omnibus-${FORMATTED_DATE}
    
    cat "$ABSOLUTE_FILE_PATH" | kubectl run ${POD_NAME} \
      -i \
      "$@" \
      --restart=Never \
      --image=$TOOLING_IMAGE \
      --overrides='
      { 
        "apiVersion": "v1",
        "kind": "Pod",
        "spec": {
          "imagePullSecrets": [
            {
              "name": "'${DOCKER_REGISTRY_SECRET}'"
            }
          ],
          "serviceAccount": "'${SERVICE_ACCOUNT_NAME}'",
          "serviceAccountName": "'${SERVICE_ACCOUNT_NAME}'",
          "volumes": [{
            "name":"'${VOLUME_NAME}'",
            "secret":{
              "defaultMode": 420,
              "items": [{
                "key": "'${CA_SECRET_CERTIFICATE_PROPERTY_KEY}'",
                "path": "'${MOUNT_SECRET_FILE_NAME}'" 
              }],
              "secretName": "'${CA_SECRET_NAME}'"
            }
          }],
          "containers": [
            { 
              "name": "'${POD_NAME}'",
              "image": "'${TOOLING_IMAGE}'",
              "volumeMounts": [
                {
                  "mountPath": "'${MOUNT_SECRET_DIRECTORY}'",
                  "name": "'${VOLUME_NAME}'",
                  "readOnly": true
                }
              ],
              "args": [ 
               "filetonoi.sh", "'${NOI_RELEASE}'","'${OBJECTSERVER_HOST}'","'${OBJECTSERVER_PORT}'"
              ], 
              "env": [
                {"name": "LICENSE", "value": "accept"},
                {"name": "INPUT_FILE_NAME", "value": "stdin"},
                {"name": "INPUT_REPORTERDATA", "value": "true"},
                {"name": "EVENT_REPLAY_ENABLED", "value": "false"},
                {"name": "LOG4J_OPTS", "value": " -Dlog4j.configuration=file:/app/etc/log4j.properties -Deatools.loglevel=INFO "},
                {"name": "JDBC_USERNAME", "value": "'${OMNIBUS_USERNAME}'"},
                {"name": "JDBC_PASSWORD", "valueFrom": 
                  {
                    "secretKeyRef": {
                      "name": "'${NOI_RELEASE}'-omni-secret",
                      "key": "'${OMNIBUS_PASSWORD_SECRET_KEY_VALUE}'",
                      "optional": true
                    }
                  }
                }
              ], 
              "stdin": true,
              "stdinOnce": true,
              "securityContext": {
                "runAsUser": 1000 
              } 
            }
          ]
        }   
      }'
    
    1. If you do not know the OMNIbus password, find it with the following command:
      export OMNIBUS_ROOT_PASSWORD=$(kubectl get secret omni-secret  -o jsonpath --template '{.data.OMNIBUS_ROOT_PASSWORD}' | base64 --decode)
      Where omni-secret is the name of the OMNIbus secret as specified by global.omnisecretname in your installation parameters, usually release_name-omni-secret, where release_name is the custom resource release name of your deployment.
    2. Ensure that the secret is set up correctly, as in the following example:
      oc create secret generic my-hybrid-omni-certificate-secret \
        --from-literal=PASSWORD=omnibusauto --from-file=ROOTCA="/home/abc/tmp/ncoms-ca-cert.arm" \
        --from-literal=INTERMEDIATECA="" --namespace noi163
      Where the /home/abc/tmp/ncoms-ca-cert.arm file is the SSL root CA file that is copied from the on-premises ObjectServer.
      Note: The my-hybrid-omni-certificate-secret secret might already exist. If it does, check its contents first and update it as needed.
    3. Before you invoke the script, override some of the default environment variables, as in the following example:
      export HYBRID_FILE_TO_OMNIBUS_TOOLING_IMAGE=<tooling_image>
      export HYBRID_FILE_TO_OMNIBUS_ABSOLUTE_FILE_PATH=/home/abc/tmp/testdata.json.gz
      export HYBRID_FILE_TO_OMNIBUS_NOI_RELEASE=my-hybrid
      export HYBRID_FILE_TO_OMNIBUS_OBJECTSERVER_HOST=wibble1.abc.ibm.com
      export HYBRID_FILE_TO_OMNIBUS_OBJECTSERVER_PORT=7111
      Where:
      • <tooling_image> can be determined by running the following command:
        kubectl get noihybrid release_name -o yaml | grep ea-events-tooling
      • The /home/abc/tmp/testdata.json.gz file must contain the events to be sent to the on-premises ObjectServer over the SSL connection.
      Note: If the OMNIBUS_USERNAME variable is not set in the environment from which the script is invoked, it defaults to root within the script. To override it, run the export OMNIBUS_USERNAME=anotheruser command before you invoke the script; the value anotheruser then replaces the default within the script.
    4. After the environment variables are exported, run the wrapper script:
      ./file-to-omnibus-ssl.sh
    Send local data to the ObjectServer without SSL:

    To send local data to the ObjectServer with the filetonoi.sh script, complete the following steps.
    1. If you do not know the OMNIbus password, find it with the following command:
      export OMNIBUS_ROOT_PASSWORD=$(kubectl get secret omni-secret -o jsonpath --template '{.data.OMNIBUS_ROOT_PASSWORD}' | base64 --decode)
      Where omni-secret is the name of the OMNIbus secret as specified by global.omnisecretname in your installation parameters, usually release_name-omni-secret, where release_name is the custom resource release name of your deployment.
    2. Create the ingesnoi pod.
      Hybrid deployment: Use this command if you have a hybrid deployment (on-premises Netcool Operations Insight with cloud native analytics on a container platform):
      kubectl run ingesnoi -i --restart=Never --image=image:image_tag --env=INPUT_FILE_NAME=stdin --env=LICENSE=accept --env=EVENT_RATE_ENABLED=false --env=EVENT_FILTER_ENABLED=true --env=EVENT_REPLAY_REGULATE=true --env=EVENT_REPLAY_SPEED=60 --env=EVENT_REPLAY_ENACT_DELETIONS=false --env=JDBC_PASSWORD=<ospassword> --env=RUNNING_IN_CONTAINER=true filetonoi.sh -- <release_name> <oshostname> <osport>
      Cloud deployment: Use this command if you have a full Operations Management deployment on a container platform:
      kubectl run ingesnoi -i --restart=Never --image=image:image_tag --env=INPUT_FILE_NAME=stdin --env=LICENSE=accept --env=EVENT_RATE_ENABLED=false --env=EVENT_FILTER_ENABLED=true --env=EVENT_REPLAY_REGULATE=true --env=EVENT_REPLAY_SPEED=60 --env=EVENT_REPLAY_ENACT_DELETIONS=false --env=JDBC_PASSWORD=<ospassword> --env=RUNNING_IN_CONTAINER=true filetonoi.sh -- <release_name>
      Where:
      • image is the location of the ea-events-tooling container. The image value can be found from the kubectl get noi command, as described earlier.
      • image_tag is the image version tag, as described earlier.
      • ospassword is the on-premises ObjectServer password.
      • oshostname is the on-premises ObjectServer host (hybrid deployments only).
      • osport is the on-premises ObjectServer port (hybrid deployments only).
      You can specify a user ID and password by using JDBC_USERNAME and JDBC_PASSWORD. These parameters correspond to the user ID and password of the ObjectServer.
      Note: You can view the available overrides and their default values by using the --help command option.
    3. Copy the file to be replayed into the ingesnoi pod.
      kubectl cp mydata.json.gz ingesnoi:/tmp
    4. Exec into the ingesnoi pod, and then replay the file into OMNIbus.
      kubectl exec -it ingesnoi -- /bin/bash
      cd bin; cat /tmp/mydata.json.gz | ./filetonoi.sh <release_name>
      Where <release_name> is the release name of your deployment.
    If you are unable to replay some events, include the following parameters:
    --env=INPUT_REPORTERDATA=false
    --env=EVENT_REPLAY_TEMPORALITY_PRIMARY_TIMING_FIELD=LastOccurrence
    --env=EVENT_REPLAY_PREEMPTIVE_DELETIONS=false
    --env=EVENT_REPLAY_SKIP_DELETION=false
    --env=INPUT_TAG_TOOL_GENERATED_EVENTS=false
    If you encounter more data integrity issues, include the following parameters:
    --env=EVENT_REPLAY_STRICT_DELETIONS=false
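As a sketch, the following composes the cloud-deployment command from step 2 with all of the troubleshooting parameters above appended. The image, password, and release name remain placeholders:

```shell
# Placeholder value: substitute the image location and tag found earlier.
IMAGE=image:image_tag

# The extra --env flags below are the replay and data-integrity
# overrides described above, appended to the standard filetonoi.sh
# invocation. <ospassword> and <release_name> stay as placeholders.
CMD="kubectl run ingesnoi -i --restart=Never --image=$IMAGE \
  --env=INPUT_FILE_NAME=stdin --env=LICENSE=accept \
  --env=JDBC_PASSWORD=<ospassword> --env=RUNNING_IN_CONTAINER=true \
  --env=INPUT_REPORTERDATA=false \
  --env=EVENT_REPLAY_TEMPORALITY_PRIMARY_TIMING_FIELD=LastOccurrence \
  --env=EVENT_REPLAY_PREEMPTIVE_DELETIONS=false \
  --env=EVENT_REPLAY_SKIP_DELETION=false \
  --env=INPUT_TAG_TOOL_GENERATED_EVENTS=false \
  --env=EVENT_REPLAY_STRICT_DELETIONS=false \
  filetonoi.sh -- <release_name>"
echo "$CMD"
```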
  7. View the data.
    1. Connect to WebGUI.
    2. Select Incident > Events > Event Viewer. Select the All Events filter. The list of all events is displayed.
    3. Select the Example_IBM_CloudAnalytics view to see how the events from the local data are grouped.
    Note: You might need to override the registry secret by adding the following option to all of the kubectl commands that you issue:
    --overrides='{ "apiVersion": "v1", "spec": { "imagePullSecrets":
          [{"name": "evtmanager-registry-secret"}] } }'