Known issues and limitations for Watson Machine Learning

The following known issues and limitations apply to Watson Machine Learning.

Known issues for AutoAI

Importing an AutoAI notebook from a catalog can result in runtime error

Known issues for Watson Machine Learning

Restoring Watson Machine Learning to a different namespace requires pod restarts
After upgrading Watson Machine Learning some of the existing deployments might not be visible in UI
Upgrading Watson Machine Learning might fail because of runtime errors
Upgrading Watson Machine Learning might fail because of orphaned objects
Decision Optimization deployment job fails with error: "Add deployment failed with deployment not finished within time"
Configuring runtime definition for a specific GPU node fails
A certification path error appears when users inference online deployments by using Java
Deep Learning experiment fails to save trained models with "Asset not found" error

Limitations for AutoAI experiments

AutoAI file gets pushed to Git repository in default Git projects
Maximum number of feature columns in AutoAI experiments

Limitations for Watson Machine Learning

Deep Learning experiments with storage volumes in a Git enterprise project are not supported
Deep Learning jobs are not supported on IBM Power (ppc64le) or Z (s390x) platforms
Deploying a model on an s390x cluster might require retraining
Limits on size of model deployments
Automatic mounting of storage volumes is not supported by online and batch deployments
Batch deployment jobs that use large inline payload might get stuck in starting or running state
Setting environment variables in a conda yaml file does not work for deployments
Jobs for batch deployments that use package extensions might fail

Known Issues

Known issues for AutoAI

Importing an AutoAI notebook from a catalog can result in runtime error

Applies to: 5.4

If you save an AutoAI notebook to an IBM Knowledge Catalog Standard, and then you import it into a project and run it, you might get this error: Library not compatible or missing.

This error results from a mismatch between the runtime environment saved in the catalog and the runtime environment required to run the notebook in the project. To resolve, update the runtime environment to the latest supported version. For example, if the imported notebook uses Runtime 23.1 in the catalog version, update to Runtime 24.1 and run the notebook job again.

Tip: When you update your runtime environment, check that you have adequate computing resources. The recommended configuration is at least 2 vCPU and 8 GB RAM for an experiment notebook, and at least 4 vCPU and 16 GB RAM for a pipeline notebook.

Known issues for Watson Machine Learning

Restoring Watson Machine Learning to a different namespace requires pod restarts

Applies to: 5.4

Case 1: offline restore:

After an offline restore of Watson Machine Learning to a different namespace, you must manually restart specific pods to ensure proper functionality.

Find the runtime-manager pod in the new namespace:

oc get pod -n <new-namespace> | grep runtime-manager

Example output:

runtime-manager-api-656bc6f499-29lzm                              1/1     Running     0             9h

Delete the pod to trigger a restart. The deployment controller will automatically create a new pod:
```
oc delete pod <runtime-manager-pod-name> -n <new-namespace>
```

Case 2: online restore:

If you're performing an online restore to a different namespace, you must:

Follow all the steps that are listed in Case 1: offline restore
Restart deployments that are labeled with both icpdsupport/addOnId=wml and icpdsupport/module=wml-scoring

To restart deployments that are labeled with both icpdsupport/addOnId=wml and icpdsupport/module=wml-scoring:

Run this command to list all the deployments that are labeled with both icpdsupport/addOnId=wml and icpdsupport/module=wml-scoring:

oc get deploy -n <new-namespace> -l "icpdsupport/addOnId=wml,icpdsupport/module=wml-scoring"

Example output:

NAME                                                          READY   UP-TO-DATE   AVAILABLE   AGE
wml-dep-kb-rt251-a4f556b2-3bb1-4f27-97fe-511ddc2bde44         1/1     1            1           40h
wml-dep-mllib-29d53827-0489-4120-87e3-9de8ee5bfc56            1/1     1            1           40h
wml-dep-mllib-a93b4264-1f30-4f60-868b-667503824e48            1/1     1            1           40h
wml-dep-od-spss-4c50f145-7df9-431d-b758-8fc1407427f7          1/1     1            1           40h
wml-dep-od-spss-f9ed805a-5aa3-4734-8127-b4ce823b31e8          1/1     1            1           40h
wml-dep-py-rt251-e4f0ec7d-ff46-4845-b60f-a8dd914cc0ec         1/1     1            1           40h
wml-dep-py-rt251-multi-8c3d143a-54a9-45f7-9928-5c8aab9715b6   1/1     1            1           40h
wml-dep-ts-rt251-6034cbcd-b876-4465-b87f-36088046fcb3         1/1     1            1           40h

Delete all the corresponding pods to trigger a restart. The deployment controller automatically creates new pods. In the example case, all pod names start with wml-dep. If your configuration is different, you must adjust the command accordingly:
```
oc get pods -n <new-namespace> --no-headers | grep "^wml-dep-" | awk '{print $1}' | xargs -r oc delete pod -n <new-namespace>
```
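The filtering step in the preceding command can also be sketched in Python, which makes it easy to review the list of pods before anything is deleted. This is a minimal sketch, not part of the product: the function name `select_pods_by_prefix` and the sample pod names are illustrative assumptions. It parses only text that you captured from `oc get pods --no-headers` and does not contact the cluster.

```python
def select_pods_by_prefix(pods_output: str, prefix: str = "wml-dep-") -> list[str]:
    """Return pod names (first column) that start with the given prefix."""
    names = []
    for line in pods_output.splitlines():
        columns = line.split()
        # Skip blank lines; the pod name is always the first column.
        if columns and columns[0].startswith(prefix):
            names.append(columns[0])
    return names


# Illustrative sample of `oc get pods --no-headers` output (made-up pod names).
sample_output = """\
wml-dep-py-rt251-aaaa-1111   1/1   Running   0   40h
runtime-manager-api-bbbb     1/1   Running   0   9h
wml-dep-mllib-cccc-2222      1/1   Running   0   40h
"""

pods_to_restart = select_pods_by_prefix(sample_output)
print(pods_to_restart)
```

If your scoring pods use a different naming pattern, pass a different `prefix` value.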

Additional consideration for original namespace:

If you intend to continue using the original namespace after performing an online restore to a different namespace, you must also restart deployments that are labeled with both icpdsupport/addOnId=wml and icpdsupport/module=wml-scoring in the original namespace. Deleting the pods will trigger the deployment controller to automatically create new pods.

After upgrading Watson Machine Learning some of the existing deployments might not be visible in UI

Applies to: 5.4

After upgrading Watson Machine Learning, some of the existing deployments might not be visible in the UI. This issue appears if a deployed custom foundation model is associated with a custom hardware specification that has a colon (:) or a comma (,) in its name.

Workaround:

Delete the problematic deployments and the custom hardware specification that they're associated with. Then create a new custom hardware specification (without a colon or a comma in its name) and create a new deployment that is associated with the new custom hardware specification.

List all the deployments and get:
- The ID of the problematic deployment
- The ID of its deployment space or project
- The ID of the custom hardware specification that the deployment is associated with
  Note: Make sure that the name of the custom hardware specification contains a colon or a comma.
```
curl -ivk -X GET -H "Authorization: Bearer $TOKEN" "<CPD-URL>/ml/v4/deployments?version=2020-10-10"
```

Delete the problematic deployment:

curl -ivk -X DELETE -H "Authorization: Bearer $TOKEN" "<CPD-URL>/ml/v4/deployments/<DEPLOYMENT-ID>?version=2020-10-10&space_id=<SPACE-ID>"

Delete the problematic custom hardware specification:

curl -ik -X DELETE "<CPD-URL>/v2/hardware_specifications/<CUSTOM-HARDWARE-SPECIFICATION-ID>" \
-H "Authorization: Bearer $TOKEN"

Create a new custom hardware specification (without a colon or a comma in its name) and then create a new deployment that is associated with the new custom hardware specification.
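The name check that this workaround relies on can be sketched as follows. This is a minimal illustration only: the dictionaries with `id` and `name` keys are a simplified, hypothetical shape and not the response format of the hardware specifications API, so adapt the field access to the data that you actually retrieve.

```python
def has_unsupported_name(name: str) -> bool:
    """Return True if the name contains a colon or a comma."""
    return ":" in name or "," in name


def find_problematic_specs(specs: list[dict]) -> list[dict]:
    """Keep only the specifications whose names trigger the issue."""
    return [spec for spec in specs if has_unsupported_name(spec["name"])]


# Hypothetical sample data for illustration.
sample_specs = [
    {"id": "spec-1", "name": "gpu:large"},
    {"id": "spec-2", "name": "gpu-large"},
    {"id": "spec-3", "name": "cpu,mem"},
]

problematic_ids = [spec["id"] for spec in find_problematic_specs(sample_specs)]
print(problematic_ids)
```

The same `has_unsupported_name` check can be applied to a new name before you create the replacement hardware specification.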

Upgrading Watson Machine Learning might fail because of runtime errors

Applies to: 5.4

When you upgrade Watson Machine Learning from version 5.0.1 or 5.1.x to version 5.3.0 or later on the s390x architecture, the upgrade might fail because a runtime definition variable is undefined in the upgrade playbook.

The following is an example of the error that can be seen in the Watson Machine Learning custom resource (CR):

Message: AnsibleUndefinedVariable: 'onnxruntime_opset_19_server_json' is undefined
The playbook has failed. See earlier output for exact error

Confirm the failure in the Watson Machine Learning CR

Run the following command to see the status of the wml-cr:

oc describe wmlbase wml-cr -n <cpd_instance_ns>

The following is an example output:

   Message:               AnsibleUndefinedVariable: 'onnxruntime_opset_19_multi_server_json' is undefined
The playbook has failed. See earlier output for exact error
    Reason:                Failed
    Status:                True
    Type:                  Failure
    Last Transition Time:  2025-06-04T06:12:13Z
    Message:               Running reconciliation
    Reason:                Running
    Status:                True
    Type:                  Running
  Progress:                5%
  Progress Message:        Finished Pre-Configuration
  Reconcile History:
    The last reconciliation was completed successfully.
  Versions:
    Reconciled:  5.1.2
  Wml Status:    InProgress
Events:          <none>

Workaround:

To fix this issue, follow these steps:

Note: This workaround patch ensures the playbook includes all runtime JSON definitions needed during upgrade.

Enable maintenance mode on the Watson Machine Learning CR:

oc patch wmlbase wml-cr --type merge --patch '{"spec": {"ignoreForMaintenance": true}}' -n <cpd_instance_ns>

Identify the Watson Machine Learning operator pod:

oc get pods -n <cpd-operator-ns> | grep ibm-cpd-wml-operator

Take note of the playbook upgrade file:
- If you are upgrading from version 5.0.x, the upgrade playbook file will be called upgrade50-s390x.yml.
- If you are upgrading from version 5.1.x, the upgrade playbook file will be called upgrade51-s390x.yml.
Copy and edit the playbook upgrade file by entering the pod:
```
oc rsh -n <cpd-operator-ns> <wml-operator-pod-name>
```
While inside the pod, change directories to get to the playbook upgrade files:
```
cd /opt/ansible/5.3.0/roles/wml-base/tasks
```

Copy the playbook upgrade file to a location outside the pod:

oc cp -n <cpd-operator-ns> <wml-operator-pod-name>:/opt/ansible/5.3.0/roles/wml-base/tasks/upgrade50-s390x.yml /tmp/upgrade50-s390x.yml

Keep a backup of the playbook upgrade file:

mv upgrade50-s390x.yml upgrade50-s390x.yml.org

Edit the upgrade playbook file and update the list of JSON files under this section:

- name: Load runtime definitions for {{ kind }}
  include_role:
    name: "{{ version }}{{ directory_prefix }}{{roles_dir}}common"
    tasks_from: load_runtime_definition.yml
  loop:

Add the following JSON files to the list:

- { file_name: "files/onnxruntime_opset_19-server.json", var_name: "onnxruntime_opset_19_server_json" }
- { file_name: "files/onnxruntime_opset_19-multi-server.json", var_name: "onnxruntime_opset_19_multi_server_json" }

Here is an example of the complete code block with the updated json list (showing lines 230-231 added):

- name: Load runtime definitions for {{ kind }}
  include_role:
    name: "{{ version }}{{ directory_prefix }}{{roles_dir}}common"
    tasks_from: load_runtime_definition.yml
  loop:
    - { file_name: "files/auto_ai.kb-server.json", var_name: "auto_ai_kb_server_json" }
    ...
    - { file_name: "files/onnxruntime_opset_19-server.json", var_name: "onnxruntime_opset_19_server_json" }
    - { file_name: "files/onnxruntime_opset_19-multi-server.json", var_name: "onnxruntime_opset_19_multi_server_json" }
    ...
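Before you push the edited file back into the pod, you might want to confirm that both entries are present. The following minimal sketch checks the playbook text for the two `var_name` values that are named in the error and in the list above. The function name `missing_runtime_definitions` is an illustrative assumption, and the check is a plain text search, not a YAML validation.

```python
REQUIRED_VAR_NAMES = (
    "onnxruntime_opset_19_server_json",
    "onnxruntime_opset_19_multi_server_json",
)


def missing_runtime_definitions(playbook_text: str) -> list[str]:
    """Return the required var_name values that do not appear in the playbook."""
    return [
        var_name
        for var_name in REQUIRED_VAR_NAMES
        if f'var_name: "{var_name}"' not in playbook_text
    ]


# Illustrative excerpt that contains only one of the two required entries.
sample_playbook = (
    '- { file_name: "files/onnxruntime_opset_19-server.json", '
    'var_name: "onnxruntime_opset_19_server_json" }\n'
)

print(missing_runtime_definitions(sample_playbook))
```

To check your local copy, read the file (for example, /tmp/upgrade50-s390x.yml) into a string and pass it to the function. An empty list means that both entries were found.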

Push the updated file back into the pod:

oc cp /tmp/upgrade50-s390x.yml <wml-operator-pod-name>:/opt/ansible/5.3.0/roles/wml-base/tasks/upgrade50-s390x_new.yml -n <cpd-operator-ns>

oc rsh -n <cpd-operator-ns> <wml-operator-pod-name>

cd /opt/ansible/5.3.0/roles/wml-base/tasks

Rename the new modified file to the original:

mv upgrade50-s390x_new.yml upgrade50-s390x.yml
chmod 777 upgrade50-s390x.yml
exit

Turn off maintenance mode:

oc patch wmlbase wml-cr --type merge --patch '{"spec": {"ignoreForMaintenance": false}}' -n <cpd_instance_ns>

Upgrading Watson Machine Learning might fail because of orphaned objects

Applies to: 5.4

Upgrading Watson Machine Learning might fail because of orphaned Watson Machine Learning deployment objects.

If this issue is detected early on during the upgrade process, the pre-upgrade job will fail.

Confirm upgrade failure is due to orphaned objects

If you want to confirm that the upgrade failure is due to the orphaned objects, run the following steps:

Fetch the pod to confirm the job run has failed:

oc get pods -n <namespace> | grep wml-pre-upgrade-check

Check the pod output for indications of orphaned deployments:

oc logs <pod-name> | grep "Orphaned deployments found! Exiting with failure."

If orphaned objects are detected, you will receive a detailed report outlining the orphaned objects found in the Watson Machine Learning environment. For example:

2025/05/03 11:36:27,837|INFO|check_dep_orphans.py:250: Generating report.....
2025/05/03 11:36:27,841|INFO|check_dep_orphans.py:252: Below are the WML deployments orphaned objects found.
2025/05/03 11:36:27,843|INFO|check_dep_orphans.py:255: ------------------------------------------------
2025/05/03 11:36:27,844|INFO|check_dep_orphans.py:256:    space_id: 350d356e-fdf7-42da-9f6d-71a72ee77221
2025/05/03 11:36:27,845|INFO|check_dep_orphans.py:257: ------------------------------------------------
2025/05/03 11:36:27,846|INFO|check_dep_orphans.py:260: Deployments:
2025/05/03 11:36:27,849|INFO|check_dep_orphans.py:264:  e854d101-1c55-46c1-bb89-3589c27395ac {'name': 'base', 'sw_spec_id': '121c6a74-4ed4-5828-8b81-56ef47f3bc2f', 'type': 'base', 'model_id': 'fcb4c4d0-3587-4069-b58c-6fcec002384d'}
2025/05/03 11:36:27,850|INFO|check_dep_orphans.py:266: Models:
2025/05/03 11:36:27,851|INFO|check_dep_orphans.py:271:  f5742786-4546-43c5-853f-13758d04ee0a
2025/05/03 11:36:27,854|INFO|check_dep_orphans.py:273: Derived software specification:
2025/05/03 11:36:27,855|INFO|check_dep_orphans.py:278:  7274f418-19c9-4d9c-b13f-ba9b1411fc79 {'name': 'derived1_cw_spec-upgrade', 'base_sw_spec_id': '121c6a74-4ed4-5828-8b81-56ef47f3bc2f'}
2025/05/03 11:36:27,856|INFO|check_dep_orphans.py:280: Missing software specification:
2025/05/03 11:36:27,858|INFO|check_dep_orphans.py:285:  121c6a74-4ed4-5828-8b81-56ef47f3bc2f
27,896|ERROR|check_dep_orphans.py:448: Orphaned deployments found! Exiting with failure.
Error: /opt/ibm/scripts/check_dep_orphans.pyc rc=1

Review the list of Watson Machine Learning deployment objects in the log report.
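For long reports, a small script can collect the orphaned object IDs by section. The following is a minimal sketch that assumes the log format matches the example report above (a prefix that ends with the script name and line number, section headings that end with a colon, and object IDs at the start of each entry). The format is inferred from that single example and might differ in your environment.

```python
import re

# Log prefix such as "2025/05/03 11:36:27,849|INFO|check_dep_orphans.py:264: "
PREFIX = re.compile(r"^.*\|(?:INFO|ERROR)\|[\w.]+:\d+: ?")
UUID = re.compile(r"^[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}")
SECTIONS = (
    "Deployments",
    "Models",
    "Derived software specification",
    "Missing software specification",
)


def parse_orphan_report(log_text: str) -> dict[str, list[str]]:
    """Group the IDs found in the report by the section they appear under."""
    report = {name: [] for name in SECTIONS}
    current = None
    for line in log_text.splitlines():
        message = PREFIX.sub("", line).strip()
        if message.endswith(":") and message[:-1] in SECTIONS:
            current = message[:-1]
            continue
        match = UUID.match(message)
        if current and match:
            report[current].append(match.group(0))
    return report


# Abridged lines from the example report.
sample_log = """\
2025/05/03 11:36:27,846|INFO|check_dep_orphans.py:260: Deployments:
2025/05/03 11:36:27,849|INFO|check_dep_orphans.py:264:  e854d101-1c55-46c1-bb89-3589c27395ac {'name': 'base'}
2025/05/03 11:36:27,850|INFO|check_dep_orphans.py:266: Models:
2025/05/03 11:36:27,851|INFO|check_dep_orphans.py:271:  f5742786-4546-43c5-853f-13758d04ee0a
2025/05/03 11:36:27,856|INFO|check_dep_orphans.py:280: Missing software specification:
2025/05/03 11:36:27,858|INFO|check_dep_orphans.py:285:  121c6a74-4ed4-5828-8b81-56ef47f3bc2f
"""

report = parse_orphan_report(sample_log)
for section, ids in report.items():
    print(section, ids)
```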

Workaround to clear orphaned objects

To clear the orphaned objects, run the following steps:

You can delete orphaned objects directly by patching the Watson Machine Learning custom resource (CR) with the delete_orphan field:
```
oc patch wmlbase wml-cr -n <namespace> --type=merge -p '{"spec":{"delete_orphan":true}}'
```
After running the patch command, the Watson Machine Learning operator will reconcile and the job will re-run.

When the job run is complete, run this command to confirm that the orphaned objects were deleted:

oc logs <pod-name> | grep "Orphaned deployments deleted successfully. Exiting with success."

For example:

[root@api.xxxxxx.cp.fyre.ibm.com ~]# oc logs wml-pre-upgrade-check-kqpqs | grep "Orphaned deployments deleted successfully. Exiting with success."
2025/05/03 11:49:33,713|INFO|check_dep_orphans.py:445: Orphaned deployments deleted successfully. Exiting with success.

Decision Optimization deployment job fails with error: "Add deployment failed with deployment not finished within time"

Applies to: 5.4

If your Decision Optimization deployment job fails with the following error, complete the steps to extend the timeout window.

"status": {
     "completed_at": "2022-09-02T02:35:31.711Z",
     "failure": {
         "trace": "0c4c4308935a3c4f2d9987b22139c61c",
         "errors": [{
              "code": "add_deployment_failed_in_runtime",
              "message": "Add deployment failed with deployment not finished within time"
         }]
     },
     "state": "failed"
   }

To update the deployment timeout in the deployment manager:

Patch the wmlbase wml-cr custom resource to set ignoreForMaintenance: true. This setting puts the WML operator into maintenance mode, which stops automatic reconciliation. Otherwise, automatic reconciliation undoes any configmap changes that you apply.
```
oc patch wmlbase wml-cr --type merge --patch '{"spec": {"ignoreForMaintenance": true}}' -n <namespace>
```

Capture the contents of the wmlruntimemanager configmap in a YAML file.

oc get cm wmlruntimemanager -n <namespace> -o yaml > wmlruntimemanager.yaml

Create a backup of the wmlruntimemanager YAML file.

cp wmlruntimemanager.yaml wmlruntimemanager.yaml.bkp

Open the wmlruntimemanager.yaml.
In the runtimeManager.conf section, search for the service property.
Increase the number of retries in the retry_count field to extend the timeout window:
```
service {

        jobs {

            do {
                check_deployment_status {
                    retry_count = 420   // Increase the number of retries to extend the timeout window
                    retry_delay = 1000
                }
            }
        }
}
```
Where:
- Field retry_count is the number of retries
- Field retry_delay is the delay between each retry in milliseconds
In the example, the timeout is configured as 7 minutes (retry_count * retry_delay = 420 * 1000 ms = 420,000 ms = 7 minutes). If you want to increase the timeout further, you can increase the number of retries in the retry_count field.
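The timeout arithmetic can be sketched in a few lines, which is useful when you need to work backward from a target timeout to a retry_count value. The function names are illustrative assumptions. The calculation itself follows the formula that is described above (retry_count multiplied by retry_delay in milliseconds).

```python
import math


def timeout_seconds(retry_count: int, retry_delay_ms: int) -> float:
    """Total timeout window in seconds for the given retry settings."""
    return retry_count * retry_delay_ms / 1000


def retries_for_timeout(target_minutes: float, retry_delay_ms: int = 1000) -> int:
    """Smallest retry_count that gives at least the target timeout."""
    return math.ceil(target_minutes * 60 * 1000 / retry_delay_ms)


print(timeout_seconds(420, 1000) / 60)  # 7.0 minutes, as in the example
print(retries_for_timeout(15))          # 900 retries for a 15-minute window
```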

Apply the deployment manager configmap changes:

oc delete -f wmlruntimemanager.yaml
oc create -f wmlruntimemanager.yaml

Restart the deployment manager pods:

oc get pods -n <namespace> | grep wml-deployment-manager

oc delete pod <podname> -n <namespace>

Wait for the deployment manager pod to come up:

oc get pods -n <namespace> | grep wml-deployment-manager

Note: If you plan to upgrade the Cloud Pak for Data cluster, you must bring the WML operator out of maintenance mode by setting the field ignoreForMaintenance to false in wml-cr.

Configuring runtime definition for a specific GPU node fails

Applies to: 5.4

When you configure the runtime definition to use a specific GPU node with the nodeaffinity property, the runtime definition fails.

As a workaround, you must enable the MIG configuration for all GPU nodes if MIG is enabled for even a single GPU node. You must also use the Single profile type for all the GPU nodes. Mixed profiling is not supported. To learn more about single and mixed profiling strategies, see NVIDIA documentation.

A certification path error appears when users inference online deployments by using Java

Applies to: 5.4

A certification path error might appear when you run inference on online deployments by using Java. For example:

The URL is not valid.
PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Workaround:

Open a web browser, paste the URL of the prediction endpoint into the address bar and then press Enter.
Access security information from the address bar. In most browsers, you must click on the padlock icon. Then open certificate details.
Export the certificate (for example, as cpd-cert.crt).
Open your system's terminal or command-line tool and then set these environment variables:
- $FILE_WITH_CERTS: the name of the file that contains the exported certificate (for example, cpd-cert.crt)
- $CERT_ALIAS: create an alias for the new certificate

Import the certificates by using keytool. See example code for Linux:

keytool -import -trustcacerts -alias $CERT_ALIAS -file $FILE_WITH_CERTS -keystore $JAVA_HOME/lib/security/cacerts -storepass changeit

Verify that the certificate is properly added. See example Linux code:
```
keytool -list -v -keystore $JAVA_HOME/lib/security/cacerts
```
Compile and execute your code.

Deep Learning experiment fails to save trained models with "Asset not found" error

Applies to: 5.4

When you complete a Deep Learning experiment and attempt to save the trained model using the Save model action from the training job row, the UI throws an "Asset not found" error and the model is not saved. The trained model exists in the backend after successful training, but cannot be saved through the UI.

Cause: This issue occurs because the request.json file is not created in the expected location (/assets/{experimentId}/resources/wml_model/request.json) during the training process. When the UI attempts to save the model, it cannot locate this required file, which results in a 404 error from the asset-files-api.

The issue affects both PyTorch and TensorFlow models that were trained with Runtime 25.1 (pytorch-onnx_rt25.1-py3.12 and tensorflow-onnx_rt25.1-py3.12).

Workaround:

To perform inference with the trained model, complete the following steps:

Get the experiment ID and training ID:
1. Go to the Experiment Overview tab and locate the path to the directory that contains experiment results. For example:
```
"location": {
  "path": "/projects/2bebafa2-788e-4f1d-889f-82690d31c5a9/assets/experiment/019e229b-b3fa-73f4-a4f7-f560abf9d8bb/trainings"
}
```
2. Click on the Training Runs tab and then click the training link.
3. Click the Overview tab and find the training ID. For example: ffdb902a-0425-4de1-aa7c-1df395cb183e
Ask your project administrator to download the model file. Refer to these download steps:
1. Find the asset-files-api-xxxx pod in the Zen service controller namespace (for example, cpd-instance).
2. Download the zip file that contains the trained model. For example:
```
oc cp asset-files-api-xxxx:/mnt/asset_file_api/projects/2bebafa2-788e-4f1d-889f-82690d31c5a9/assets/experiment/019e229b-b3fa-73f4-a4f7-f560abf9d8bb/trainings/ffdb902a-0425-4de1-aa7c-1df395cb183e/data/model/ffdb902a-0425-4de1-aa7c-1df395cb183e.zip .
```
  You can also retrieve the file directly from your storage server that is used for service deployment.
Import the model into a project or a space.
If you imported the model to a project, promote the model to a deployment space.
Create an online or batch deployment by using the model in the deployment space.
Score the deployment.
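The location of the model zip file in the download step is built from the project ID, experiment ID, and training ID. The following minimal sketch assembles that path. The directory layout is inferred only from the example command above, and the function name `trained_model_zip_path` is an illustrative assumption, so verify the path in your environment before you rely on it.

```python
from pathlib import PurePosixPath


def trained_model_zip_path(
    project_id: str,
    experiment_id: str,
    training_id: str,
    mount_root: str = "/mnt/asset_file_api",
) -> str:
    """Build the path of the trained model zip file inside the asset-files-api pod."""
    path = (
        PurePosixPath(mount_root)
        / "projects" / project_id
        / "assets" / "experiment" / experiment_id
        / "trainings" / training_id
        / "data" / "model" / f"{training_id}.zip"
    )
    return str(path)


# IDs taken from the example in the steps above.
example_path = trained_model_zip_path(
    "2bebafa2-788e-4f1d-889f-82690d31c5a9",
    "019e229b-b3fa-73f4-a4f7-f560abf9d8bb",
    "ffdb902a-0425-4de1-aa7c-1df395cb183e",
)
print(example_path)
```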

Limitations

Limitations for AutoAI experiments

AutoAI file gets pushed to Git repository in default Git projects: After you create an AutoAI experiment in a default Git project, you create a commit and see a file that includes your experiment name in the list of files that can be committed. There are no consequences to including this file in your commit. The AutoAI experiment will not appear in the asset list for any other user who pulls the file into their local clone using Git. Additionally, other users won't be prevented from creating an AutoAI experiment with the same name.
Maximum number of feature columns in AutoAI experiments: The maximum number of feature columns for a classification or regression experiment is 5000.
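To check a data set against the feature column limit before you create an experiment, you can count the columns in the file header. The following minimal sketch assumes that the data is a CSV file with a header row and that every column other than the prediction column counts as a feature column. That counting rule is an assumption for illustration, and the function name and sample data are hypothetical.

```python
import csv
import io

MAX_FEATURE_COLUMNS = 5000  # limit stated above for classification and regression


def count_feature_columns(csv_text: str, prediction_column: str) -> int:
    """Count the header columns, excluding the prediction column."""
    header = next(csv.reader(io.StringIO(csv_text)))
    if prediction_column not in header:
        raise ValueError(f"Column {prediction_column!r} not found in header")
    return len(header) - 1


# Hypothetical sample data.
sample_csv = "age,income,churn\n42,50000,no\n"

feature_count = count_feature_columns(sample_csv, "churn")
print(feature_count, feature_count <= MAX_FEATURE_COLUMNS)
```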

Limitations for Watson Machine Learning

Deep Learning experiments with storage volumes in a Git enterprise project are not supported

If you create a Git project with assets in storage volumes, then create a Deep Learning experiment, running the experiment fails. This use case is not currently supported.

Deep Learning jobs are not supported on IBM Power (ppc64le) or Z (s390x) platforms

If you submit a Deep Learning training job on IBM® Power® (ppc64le) or Z (s390x) platform, the job fails with an InvalidImageName error. This is the expected behavior as Deep Learning jobs are not supported on IBM Power (ppc64le) or Z (s390x) platforms.

Deploying a model on an s390x cluster might require retraining

If you trained an AI model on a platform other than s390x, such as x86/ppc, and then try to deploy the model on the s390x platform, the deployment might fail and report an endianness issue: Argument shape does not agree with the input data. This happens if the model was trained with a version of PyTorch older than 2.1.2 (runtimes older than 24.1). To resolve the problem, use one of these options:

- Retrain the model on the x86/ppc platform by using a runtime that contains a newer version of PyTorch, and then deploy the model on the s390x platform
- Retrain the AI model on the s390x platform and then deploy the model on the s390x platform

Limits on size of model deployments

Limits on the size of models you deploy with Watson Machine Learning depend on factors such as the model framework and type. In some instances, when you exceed a threshold, you will be notified with an error when you try to store a model in the Watson Machine Learning repository, for example: OverflowError: string longer than 2147483647 bytes. In other cases, the failure might be indicated by a more general error message, such as The service is experiencing some downstream errors, please re-try the request or There's no available attachment for the targeted asset. Any of these results indicate that you have exceeded the allowable size limits for that type of deployment.

Automatic mounting of storage volumes is not supported by online and batch deployments

You cannot use automatic mounts for storage volumes with Watson Machine Learning online and batch deployments. Watson Machine Learning does not support this feature for Python-based runtimes, including R script, SPSS Modeler, Spark, and Decision Optimization. You can use automatic mounts for storage volumes only with Watson Machine Learning Shiny app deployments and notebook runtimes.

As a workaround, you can use the download method from the Data assets library, which is a part of the ibm-watson-machine-learning Python client.

Batch deployment jobs that use large inline payload might get stuck in starting or running state

If you provide a large asynchronous payload for your inline batch deployment, the runtime manager process can run out of heap memory.

In the following example, 92 MB of payload was passed inline to the batch deployment, which caused the heap to run out of memory.

Uncaught error from thread [scoring-runtime-manager-akka.scoring-jobs-dispatcher-35] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled for ActorSystem[scoring-runtime-manager]
java.lang.OutOfMemoryError: Java heap space
   at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
   at java.base/java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:172)
   at java.base/java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:538)
   at java.base/java.lang.StringBuilder.append(StringBuilder.java:174)
   ...

This could result in concurrent jobs getting stuck in starting or running state. The starting state can only be cleared once the deployment is deleted and a new deployment is created. The running state can be cleared without deleting the deployment.

As a workaround, use data references instead of inline data for large payloads that are provided to batch deployments.
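One way to apply this workaround is to estimate the serialized size of a payload before you submit it inline. The following minimal sketch measures the JSON-encoded size and compares it with a threshold. The threshold is a value that you choose. The documentation above states only that a 92 MB inline payload caused a failure and does not define a safe limit, so the 1024-byte value here is purely illustrative, as are the function names.

```python
import json


def inline_payload_size_bytes(payload: dict) -> int:
    """Size of the payload when it is serialized as UTF-8 encoded JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def should_use_data_reference(payload: dict, threshold_bytes: int) -> bool:
    """Return True if the payload is larger than the chosen threshold."""
    return inline_payload_size_bytes(payload) > threshold_bytes


small_payload = {"input_data": [{"values": [[0]]}]}
large_payload = {"input_data": [{"values": [[0] * 10000]}]}

# The 1024-byte threshold is illustrative only.
print(should_use_data_reference(small_payload, threshold_bytes=1024))
print(should_use_data_reference(large_payload, threshold_bytes=1024))
```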

Setting environment variables in a conda yaml file does not work for deployments

Setting environment variables in a conda yaml file does not work for deployments. This means that you cannot override existing environment variables, for example LD_LIBRARY_PATH, when deploying assets in Watson Machine Learning.

Example:

variables:
  my_var: my_value

In this code, my_value will not be effective in a deployment's environment.

Workaround for online deployments:

If you're using a Python function, consider setting default parameters.
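As a minimal sketch of this approach, the following deployable function takes the value as a default parameter of the outer function instead of reading it from an environment variable. The function and parameter names are illustrative, and the payload and response shapes follow the input_data and predictions structure that is commonly used for Python function deployments. Treat them as assumptions and adapt them to your function.

```python
def my_deployable_function(my_var="my_value"):
    """Outer function: default parameters replace the environment variable."""

    def score(payload):
        # Return the configured value once for every input row.
        rows = payload["input_data"][0]["values"]
        return {
            "predictions": [
                {"fields": ["my_var"], "values": [[my_var] for _ in rows]}
            ]
        }

    return score


# Local check before you store and deploy the function.
score = my_deployable_function()
result = score({"input_data": [{"values": [[0], [1]]}]})
print(result)
```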

Workaround for batch jobs:

For Python functions and Python scripts, if you're running batch jobs, use scoring.environment_variables in the job's payload.

Example code that creates a batch deployment job for a Python function by using the ibm-watsonx-ai SDK:

scoring_payload = {
    'input_data': [{
        'values': [[0]]
    }],
    "environment_variables" : {
        "my_var": "my_value",
    }
}
client.deployments.create_job(deployment_id, scoring_payload)

Jobs for batch deployments that use package extensions might fail

Jobs for batch deployments that use package extensions might fail with this error: WMLClientError: The product version <version> is not supported yet. This happens when a package extension downgrades ibm-watsonx-ai.