Troubleshooting the IBM Business Automation Machine Learning Server


Draft comment:
This topic only applies to BAW, and is located in the BAW repository. Last updated on 2025-03-13 12:15

Use the information to identify and resolve problems that can occur while you are using the Machine Learning Server.

Checking the status of Machine Learning Server

To check the status of the server, run the following script:

./bin/ba-ml-server-status

If the containers for the configured services and the NGINX container are running properly, the results show State = Up. The NGINX container exposes its port to the host, and the containers for the configured services expose their ports only to the NGINX container.
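To spot a container that is not running, you can filter the status output for rows that do not report Up. The following is a minimal sketch only. The helper name list_not_up is hypothetical, and the sketch assumes that the status script prints one container per row with a header row that contains the word State, which might not match the actual output on your system.

```shell
# Hypothetical helper: reads status output on stdin and prints every
# container row that does not contain the word "Up" as a separate field.
# Assumed layout: one header row containing "State", an optional dashed
# separator row, then one row per container.
list_not_up() {
  awk '
    /State/ && !seen_header { seen_header = 1; next }   # skip the header row
    /^-+$/ || NF == 0 { next }                          # skip separators and blank rows
    {
      up = 0
      for (i = 1; i <= NF; i++) if ($i == "Up") up = 1
      if (!up) print                                    # report rows that are not Up
    }
  '
}

# Usage (prints nothing when every container is Up):
#   ./bin/ba-ml-server-status | list_not_up
```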

Viewing Machine Learning Server logs

There are logs for Machine Learning Server and for the configured services. These logs are in the form:

Service-issuing-log | message

Each log message contains a unique request ID corresponding to an individual API call that you can use to trace a specific request from end to end.

To view the Machine Learning Server logs, type the following command:

./bin/ba-ml-server-logs
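Because each message carries a request ID, you can narrow the log output to a single API call. The following sketch assumes only the "Service-issuing-log | message" layout described above. The helper name trace_request is hypothetical, and the position of the request ID inside the message is not assumed, so the filter matches the ID anywhere on the line.

```shell
# Hypothetical helper: reads log lines on stdin and prints those that
# contain the given request ID. An optional second argument restricts
# the output to one issuing service (the text before the first "|").
trace_request() {
  awk -F'|' -v id="$1" -v svc="${2:-}" '
    index($0, id) == 0 { next }          # keep only lines with the request ID
    {
      name = $1
      sub(/[ \t]+$/, "", name)           # trim padding after the service name
      if (svc == "" || name == svc) print
    }
  '
}

# Usage:
#   ./bin/ba-ml-server-logs | trace_request <request_id>
#   ./bin/ba-ml-server-logs | trace_request <request_id> <service_name>
```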

Running Intelligent Task Prioritization model training

If you get an error while running the Intelligent Task Prioritization model training, it might be because there are not enough instances in IBM® Business Automation Insights. Make sure that you have at least 30 completed tasks per user, for each type of task. You might have encountered the following errors:

The data retrieved from BAI server has size 0. Please make sure you have enough data from BAI
nextbesttask.custom_errors.InsufficientTrainingData: InsufficientTrainingData: the data retrieved from BAI server has size 0. Please make sure you have enough data from BAI
2023-11-03 14:09:24,012 - nbt - INFO - init - Please retrain the model later by accessing /nbt/train when there is sufficient data. Or wait for the regular training session which will happen automatically periodically
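To confirm that a failed training run was caused by missing data, you can count the log lines that mention the InsufficientTrainingData error. This is a small sketch. The helper name count_insufficient_data is hypothetical, and it reads log text on stdin so that it works with whichever log source you use.

```shell
# Hypothetical helper: prints the number of log lines on stdin that
# mention the InsufficientTrainingData error. Prints 0 when none do.
count_insufficient_data() {
  # grep -c prints the count but exits non-zero on zero matches,
  # so "|| true" keeps the helper from failing in that case.
  grep -c -F 'InsufficientTrainingData' || true
}

# Usage:
#   ./bin/ba-ml-server-logs | count_insufficient_data
```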

By default, model training is automatically started every Sunday at 3:00 AM UTC and also when the Machine Learning Server is restarted. For more information about scheduling retraining, see IBM Business Automation Workflow Runtime and Workstream Services parameters.

To manually retrain your data models, complete the following procedure:

Get the credential information from the secret with the suffix ibm-mls-itp-admin-secret.
Get the Intelligent Task Prioritization server certificate file location from the Intelligent Task Prioritization pod. The Intelligent Task Prioritization server pod name has the prefix <custom_resource_name>-mls-itp. For example, the volume mount in the pod specification might look similar to:
```
spec:
  containers:
    volumeMounts:
      ... ...
      - name: certificate-file
        mountPath: /nextbesttask/nextbesttask/certs/local-server.crt
        subPath: local-server.crt
      ... ...
```
Get the Intelligent Task Prioritization server service name, which is in the format <custom_resource_name>-mls-itp-service. Make sure that it shows up in the list of created services.
Open a terminal in your Intelligent Task Prioritization container, either from your OpenShift® Container Platform console or by using the oc or kubectl command.
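The lookups in the preceding steps can be sketched with the command line as follows. This is an illustration under stated assumptions, not a definitive procedure. The helper names and the CLI variable are hypothetical, and you supply the namespace and custom resource name. The key names inside the secret are not stated here, so the sketch only locates the secret.

```shell
# Use oc by default. Set CLI=kubectl to use kubectl instead.
CLI=${CLI:-oc}

# Hypothetical helpers. $1 is the namespace, $2 the custom resource name.

# Finds the secret whose name ends with ibm-mls-itp-admin-secret.
find_itp_secret() {
  "$CLI" get secrets -n "$1" -o name | grep 'ibm-mls-itp-admin-secret$'
}

# Finds pods whose name starts with <custom_resource_name>-mls-itp.
find_itp_pod() {
  "$CLI" get pods -n "$1" -o name | grep -F "/$2-mls-itp"
}

# Confirms that <custom_resource_name>-mls-itp-service exists.
find_itp_service() {
  "$CLI" get services -n "$1" -o name | grep -F -x "service/$2-mls-itp-service"
}

# Usage:
#   find_itp_secret <namespace>
#   find_itp_pod <namespace> <custom_resource_name>
#   find_itp_service <namespace> <custom_resource_name>
# To open a shell in the pod afterward:
#   oc rsh -n <namespace> <pod_name>
#   kubectl exec -it -n <namespace> <pod_name> -- sh
```

Each helper returns a non-zero exit status when nothing matches, which makes a missing secret, pod, or service easy to detect in a script.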

Using the information retrieved in the previous steps, complete your model training by running the following command:

curl "https://<Intelligent_Task_Prioritization_service_name>:8000/train?dataset=bai" -u <admin_username>:<admin_password> --cacert <server_certificate_filepath>

For example, the resulting command might look similar to:

curl "https://cpfc-pg-wfs-fvt-bai-2302-mls-itp-service.auto-6vlw1.svc.cluster.local:8000/train?dataset=bai" -u ITP-kcAMgSjz:password --cacert /nextbesttask/nextbesttask/certs/local-server.crt
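If you retrain often, a small wrapper can assemble the request for you. The following is a sketch only, and the helper names itp_train_url and retrain_itp are hypothetical. It quotes the URL so that the shell does not interpret the ? character. It also passes only the user name to -u, in which case curl prompts for the password instead of exposing it on the command line.

```shell
# Hypothetical helper: prints the training URL for a given service host.
itp_train_url() {
  printf 'https://%s:8000/train?dataset=bai\n' "$1"
}

# Hypothetical helper: sends the training request.
# $1 = service host, $2 = admin user name, $3 = server certificate path.
retrain_itp() {
  # --fail makes curl return a non-zero exit status on HTTP errors.
  curl --fail --cacert "$3" -u "$2" "$(itp_train_url "$1")"
}

# Usage, from inside the Intelligent Task Prioritization container:
#   retrain_itp <service_name> <admin_username> <server_certificate_filepath>
```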

Troubleshooting IBM Business Automation Insights

Because the Machine Learning Server services rely on Business Automation Insights for data, you can resolve some problems by following the instructions in Troubleshooting event emitters on Kubernetes.

Troubleshooting Business Automation Workflow

To enable logging for Business Automation Workflow, see the instructions in Enabling tracing for BPM event emitter and Machine Learning Server.