Configuring GPU features and models

GPU features let watsonx Assistant run complex foundation models with accelerated processing, real-time inference, and large-scale data handling, which can improve the efficiency and accuracy of AI models. Using the GPU features is optional.

Permissions you need for these tasks: You must be an administrator of the Red Hat® OpenShift® project to manage the cluster.

Complete the following tasks to configure the GPU features and supported models:

Enabling or disabling IFM
Supported foundation models for GPU features
Enabling a specific model
Disabling the Out of the Box model
Disabling the specialized model
Adjusting Shards for Out of the Box model

Enabling or disabling IFM

To enable the inference foundation models (IFM) component, use the following command:

oc patch wa wa --type=merge -p="{\"configOverrides\": {\"enabled_components\": {\"store\": {\"ifm\": true}}, \"watsonx_enabled\": true }}"

To disable the IFM, use the following command:

oc patch wa wa --type=merge -p="{\"configOverrides\": {\"enabled_components\": {\"store\": {\"ifm\": false}}, \"watsonx_enabled\": false }}"
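
The two commands above differ only in the boolean value, which appears twice in each patch. As a minimal sketch, the patch can be generated from a single argument so that both flags always agree. The helper name `ifm_patch` is hypothetical and is not part of the `oc` CLI.

```shell
# ifm_patch true|false
# Prints the merge patch that enables (true) or disables (false) the IFM.
# Hypothetical helper: it only builds the JSON string and does not call oc.
ifm_patch() {
  case "${1:-}" in
    true|false) ;;
    *) echo "usage: ifm_patch true|false" >&2; return 1 ;;
  esac
  printf '{"configOverrides": {"enabled_components": {"store": {"ifm": %s}}, "watsonx_enabled": %s }}' "$1" "$1"
}

# Usage (requires cluster access):
#   oc patch wa wa --type=merge -p "$(ifm_patch true)"
```

Rejecting any value other than `true` or `false` keeps a typo from producing invalid JSON in the patch.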

Supported foundation models for GPU features

Note: Supported foundation models are available only when the IFM is enabled.

GPU features support the following foundation models during installation:

Supported Out of the Box model

openai/gpt-oss-120b

Deprecated models

The following models are currently supported but are scheduled to be withdrawn in a future release:

ibm/ibm-granite-8b-unified-api-model-v2
ibm/granite-3-8b-instruct
meta-llama/llama-3-1-70b-instruct
meta-llama/llama-3-3-70b-instruct

For more information, see GPU requirements for models.

Enabling a specific model

Note: If the available GPU memory is not sufficient to install a new model, disable any unused models to free up memory. For more information, see Disabling the Out of the Box model or Disabling the specialized model.

To enable the Out of the Box model without modifying replicas or shards:

oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"ootb":{"gpt-oss-120b":{}}}}}}'

Disabling the Out of the Box model

To disable the Out of the Box model, complete the following steps:

Set the MODEL_NAME environment variable to the name of the model that you want to remove.
```
export MODEL_NAME="<model-name>"
```

Remove the entry from the watsonx Assistant custom resource.

oc patch wa wa --type json --patch "[{ \"op\": \"remove\", \"path\": \"/configOverrides/ifm/model_config/ootb/${MODEL_NAME}\" }]"

Remove the entry from the watsonx.ai™ IFM custom resource.

oc get watsonxaiifm watsonxaiifm-cr -o json | jq ".spec.install_model_list -= [\"${MODEL_NAME}\"]" | oc apply -f -

Remove the InferenceService resource for the model.
```
oc delete isvc ${MODEL_NAME}
```
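
Every step above depends on MODEL_NAME. If the variable is empty, the patch path ends at `/ootb/` and `oc delete isvc` receives no name. If it contains a `/`, JSON Pointer treats the slash as a path separator. The following sketch validates the value and builds the JSON patch before anything is sent to the cluster. The helper names `check_model_name` and `ootb_remove_patch` are hypothetical and are not part of the `oc` CLI.

```shell
# check_model_name NAME
# Fails when NAME is empty or contains "/", which JSON Pointer would read
# as a path separator. Hypothetical helper.
check_model_name() {
  case "${1:-}" in
    "")  echo "MODEL_NAME is empty" >&2; return 1 ;;
    */*) echo "MODEL_NAME must not contain '/'" >&2; return 1 ;;
  esac
}

# ootb_remove_patch NAME
# Prints the JSON patch that removes one entry from the ootb section.
ootb_remove_patch() {
  check_model_name "${1:-}" || return 1
  printf '[{ "op": "remove", "path": "/configOverrides/ifm/model_config/ootb/%s" }]' "$1"
}

# Usage (requires cluster access):
#   patch="$(ootb_remove_patch "$MODEL_NAME")" && oc patch wa wa --type json --patch "$patch"
```

Building the patch with `printf` and a single-quoted format string also avoids nesting double quotes inside a double-quoted shell argument.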

Disabling the specialized model

To disable the specialized model, complete the following steps:

Clean up the configuration overrides in the watsonx Assistant custom resource.

oc patch wa wa --type json --patch '[{ "op": "remove", "path": "/configOverrides/ifm/model_config/syom" }]'

Remove the syom configuration from the watsonassistantstore resource.


oc patch watsonassistantstore wa --type json --patch '[{ "op": "remove", "path": "/configOverrides/ifm/model_config/syom" }]'

Delete the ConfigMap that is associated with ibm-granite-8b-unified-api-model-v2.
```
oc delete configmap ibm-granite-8b-unified-api-model-v2
```
Wait approximately 5 to 10 minutes for the changes to reconcile.
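
The same JSON patch is applied to two resources before the ConfigMap is deleted. As a rough sketch, the three commands can be printed in order for review before they are run. The function name `syom_cleanup_commands` and the variable `SYOM_PATCH` are hypothetical; the commands themselves are the ones shown in the steps above.

```shell
# JSON patch shared by both custom resources (copied from the steps above).
SYOM_PATCH='[{ "op": "remove", "path": "/configOverrides/ifm/model_config/syom" }]'

# syom_cleanup_commands
# Prints the cleanup commands in the documented order without running them.
# Hypothetical helper for a dry run.
syom_cleanup_commands() {
  for syom_resource in wa watsonassistantstore; do
    printf "oc patch %s wa --type json --patch '%s'\n" "$syom_resource" "$SYOM_PATCH"
  done
  printf 'oc delete configmap %s\n' ibm-granite-8b-unified-api-model-v2
}

# Usage: review the output first, then pipe it to a shell to execute it:
#   syom_cleanup_commands
#   syom_cleanup_commands | sh
```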

Adjusting Shards for Out of the Box model

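To enable the Out of the Box model and adjust its shards, use the following command. Replace <model-name> with the model key (for example, gpt-oss-120b) and <shard-value> with the number of shards: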

oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"ootb":{"<model-name>":{"shards": <shard-value>}}}}}}'
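
To avoid editing the placeholders by hand, the patch can be generated from arguments. The sketch below assumes that the shard value is a positive integer, which the source does not state explicitly. The helper name `ootb_model_patch` is hypothetical, and the shard count `2` in the usage line is illustrative only.

```shell
# ootb_model_patch MODEL [SHARDS]
# Prints the merge patch that enables MODEL in the ootb section.
# Without SHARDS, the model entry is empty ({}), so shards are not modified.
# Assumption: SHARDS must be a positive integer. Hypothetical helper.
ootb_model_patch() {
  ootb_model="${1:-}"
  ootb_shards="${2:-}"
  if [ -z "$ootb_model" ]; then
    echo "usage: ootb_model_patch MODEL [SHARDS]" >&2
    return 1
  fi
  if [ -z "$ootb_shards" ]; then
    printf '{"configOverrides":{"ifm":{"model_config":{"ootb":{"%s":{}}}}}}' "$ootb_model"
    return 0
  fi
  case "$ootb_shards" in
    *[!0-9]*|0*) echo "SHARDS must be a positive integer" >&2; return 1 ;;
  esac
  printf '{"configOverrides":{"ifm":{"model_config":{"ootb":{"%s":{"shards": %s}}}}}}' "$ootb_model" "$ootb_shards"
}

# Usage (requires cluster access):
#   oc patch wa wa --type='merge' -p "$(ootb_model_patch gpt-oss-120b 2)"
```

Check the GPU requirements for the model before you choose a shard value, because the helper validates only the format of the number.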