Extracting historical data from a reporter database for cloud native analytics training

You can load a historical data set to train the cloud native analytics system. Learn how to extract historical data from a reporter database, such as Oracle.

Before you begin

Note: For a production system, it is recommended that you train the system with live event data. The procedure for loading historical data to train the system applies to proof of concept (PoC) or test systems only.
Before you complete these steps, complete the following prerequisite items:
  • The ea-events-tooling container is installed by the operator. It is not started as a pod, but it contains scripts that install data on the system, which you can run with the kubectl run command.
  • Find the values of image and image_tag for the ea-events-tooling container, from the output of the following command:
    kubectl get noi <release_name> -o yaml | grep ea-events-tooling
    Where <release_name> is the custom resource release name of your cloud deployment. For example, in the following output, image is ea-events-tooling, and image_tag is 2.0.14-20200120143838GMT. A sketch that captures these values in shell variables is shown after this list.
    kubectl get noi <release_name> -o yaml | grep ea-events-tooling
        --env=CONTAINER_IMAGE=image-registry.openshift-image-registry.svc:5000/default/ea-events-tooling:2.0.14-20200120143838GMT \
        --image=image-registry.openshift-image-registry.svc:5000/default/ea-events-tooling:2.0.14-20200120143838GMT \
    Hybrid deployment: For a hybrid deployment, run the following command:
    kubectl get noihybrid <release_name> -o yaml | grep ea-events-tooling
    Where <release_name> is the custom resource release name of your hybrid deployment.
    IBM® Netcool® for AIOps deployment: For an online or offline deployment (airgap) of Netcool Operations Insight® with IBM Cloud Pak for AIOps, find the values of image and image_tag from the noi-operator CSV file. Run the following command:
    oc get csv <noi-operator> -o yaml | grep olm.relatedImage.NOI_ea-events-tooling: | awk -F ': ' '{print $2}'
    Where <noi-operator> is the noi-operator CSV file name.
  • If you created your own Docker registry secret, then patch your service account.
    kubectl patch serviceaccount default -p '{"imagePullSecrets": [{"name": "noi-registry-secret"}]}'
    Where noi-registry-secret is the name of the secret for accessing the Docker repository.
    Note: As an alternative to patching the default service account with image pull secrets, you can add the following option to each kubectl run command that you issue:
    --overrides='{ "apiVersion": "v1", "spec": { "imagePullSecrets":
          [{"name": "noi-registry-secret"}] } }'

About this task

Complete the following steps to extract data from an Oracle database into a local file.

Procedure

  1. Open a Bash shell on the server on which your Netcool Operations Insight cluster is running.
  2. Change the project to your Netcool Operations Insight namespace, by running the following command:
    oc project <NOI namespace name>
    Where <NOI namespace name> is the name of the namespace in which Netcool Operations Insight is deployed.
  3. Set the following environment variables:
    export RELEASE=<noi_release_name>
    export HTTP_PASSWORD=$(oc get secret $RELEASE-systemauth-secret -o jsonpath --template '{.data.password}' | base64 --decode)
    export HTTP_USERNAME=$(oc get secret $RELEASE-systemauth-secret -o jsonpath --template '{.data.username}' | base64 --decode)
    Where <noi_release_name> is the release name of your Netcool Operations Insight deployment.
  4. Open a terminal window and start the ingesthttp pod, by running the following command:
    oc run ingesthttp -i --tty --image=<values:global:image:repository>/ea-events-tooling:<image_tag> --env=LICENSE=accept --restart=Never --command bash
    Where
    • <values:global:image:repository> is the location of the ea-events-tooling container, as determined in the prerequisite steps.
    • <image_tag> is the image version tag, as determined in the prerequisite steps.
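    For example, with the image location and tag from the example output in the prerequisites, the command might look like the following; substitute the repository and tag that are reported for your own deployment:
    oc run ingesthttp -i --tty --image=image-registry.openshift-image-registry.svc:5000/default/ea-events-tooling:2.0.14-20200120143838GMT --env=LICENSE=accept --restart=Never --command bash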
  5. Log in to the ingesthttp pod and run the following commands:
    oc exec -ti ingesthttp -- /bin/bash
    
    cd /app/bin
    
    ./generateConfigFromTemplate.sh ../etc/jdbctofile_template.yaml ../etc/currentConfig.yaml
    This command generates a YAML file, in this example currentConfig.yaml.
  6. Edit the currentConfig.yaml file that is generated in step 5, by running the following command:
    vim ../etc/currentConfig.yaml
  7. In the currentConfig.yaml file, update the following section:
     jdbc:
        username: <USERNAME>
        password: <PASSWORD>
        jdbcClassName: <JDBC_CLASS_NAME>
        url: <JDBC_CONNECTION_URL>
        table: <TABLE_NAME>
        filter: <FILTER_VALUE>
        orderby: <ORDERBY_FIELD>
    Where
    • <USERNAME> is the user ID of your database.
    • <PASSWORD> is the password to log in to your database.
    • <JDBC_CLASS_NAME>
      • For an Oracle database, the class name is oracle.jdbc.OracleDriver.
    • <JDBC_CONNECTION_URL>
      • For an Oracle database, the URL is jdbc:oracle:thin:@//<Oracle hostname>:<Oracle port>/<Oracle service name or Oracle SID>.
    • <TABLE_NAME> is the JDBC table name.
    • <FILTER_VALUE> is (type <> 2 and type <> 21) by default.
    • <ORDERBY_FIELD> value is firstoccurrence by default.
    Note: Do not extract more than 12 months of data. Either specify an appropriate date filter for the SQL extraction or truncate the generated JSON output file (reporter_status.json.gz).
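    For example, a completed jdbc section for an Oracle reporter database might look like the following sketch. The hostname, port, service name, credentials, and table name are placeholders only, and the added date clause in the filter (shown with Oracle TO_DATE syntax) is just one way to limit the extraction to a recent time window; check the defaults in your generated currentConfig.yaml file and adapt the filter to how your reporter schema stores timestamps:
     jdbc:
        username: reporter_user
        password: reporter_password
        jdbcClassName: oracle.jdbc.OracleDriver
        url: jdbc:oracle:thin:@//oracle.example.com:1521/REPORTERDB
        table: REPORTER_STATUS
        filter: (type <> 2 and type <> 21) and firstoccurrence >= TO_DATE('2020-01-01','YYYY-MM-DD')
        orderby: firstoccurrence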
    Note: Steps 8, 9, and 10 apply only when you extract data from an Oracle database.
  8. Open another terminal window and get the full path of the Oracle JDBC driver from the Impact pod, by running the following command:
    oc exec $(oc get pods|grep nciserver-0|awk '{print $1}') -- bash -c "ls -l /opt/IBM/tivoli/impact/lib3p/ojdbc*"
  9. Extract the JAR file from the Impact pod to the local directory, as in the following example:
    oc cp $(oc get pods|grep nciserver-0|awk '{print $1}'):/opt/IBM/tivoli/impact/lib3p/ojdbc6-11.2.0.4.jar ojdbc6-11.2.0.4.jar
    In this example, the name of the Oracle JDBC driver is ojdbc6-11.2.0.4.jar.
  10. Copy the extracted JAR file into the running ingesthttp pod, as in the following example:
    oc cp ojdbc6-11.2.0.4.jar ingesthttp:/app/lib
  11. From the ingesthttp pod, run the following command:
    ./runclass.sh com.ibm.itom.ea.events.tooling.JdbcToFile -c file:/app/etc/currentConfig.yaml
    This generates an output file (for example, /app/bin/reporter_status.json.gz) that includes the data for ingestion and training.
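    Before you exit the pod, you can optionally confirm that the output file was created, for example:
    ls -lh /app/bin/reporter_status.json.gz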
  12. Exit from the running pod.
  13. Copy the file that was generated by the command in step 11 to the local file system, as in the following example:
    oc cp ingesthttp:/app/bin/reporter_status.json.gz ./reporter_status.json.gz
    Note: Depending on your reporter database configuration, it might be necessary to transpose fields during a subsequent replay of events. To transpose fields during the replay of events, add the following variables to the ingesnoi pod, which runs the filetonoi.sh script:
     --env=OUTPUT_EVENT_FIELDS_TRANSPOSE_FIELDS="{TargetField:SourceField}"
    If, for example, the reporter database contains the field ORIGINALSEVERITY instead of Severity, map the fields as follows:
    --env=OUTPUT_EVENT_FIELDS_TRANSPOSE_FIELDS="{ Severity : ORIGINALSEVERITY }"
    For more information about working with the ingesnoi pod, see Training with local data.
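    Optional: After you copy the file, you can spot-check it on the local file system before ingestion. The following is a minimal sketch, which assumes that the archive contains one JSON event per line; adjust the commands if your output format differs:
    # Count the extracted records (assuming one JSON event per line)
    gunzip -c reporter_status.json.gz | wc -l
    # Spot-check the first record
    gunzip -c reporter_status.json.gz | head -n 1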

What to do next

You can now use the JSON output file (reporter_status.json.gz) that you created to run your data through the system as described in Training with local data.