Configuring GPU features and models
Use GPU features to improve the performance of watsonx Assistant through accelerated processing, support for complex models, real-time inference, and large-scale data handling, which improves the efficiency and accuracy of AI models. Using the GPU features is optional.
- Permissions you need for these tasks:
- You must be an administrator of the Red Hat® OpenShift® project to manage the cluster.
Enabling or disabling GPU features
To enable the GPU features, use the following command:
oc patch wa wa --type=merge -p="{\"configOverrides\": {\"enabled_components\": {\"store\": {\"ifm\": true}}, \"watsonx_enabled\": true }}"
To disable the GPU features, use the following command:
oc patch wa wa --type=merge -p="{\"configOverrides\": {\"enabled_components\": {\"store\": {\"ifm\": false}}, \"watsonx_enabled\": false }}"
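The two commands above differ only in the boolean value, so a small wrapper can keep them in sync. The following is a minimal sketch rather than a documented procedure: the function names `gpu_features_payload` and `set_gpu_features` are hypothetical, and the sketch assumes that `oc` is already logged in to the project that contains the `wa` custom resource.

```shell
# Sketch only: the helper names below are hypothetical and are not part of watsonx Assistant.

# Print the merge-patch payload for the requested state ("true" or "false").
gpu_features_payload() {
  local state="$1"
  case "$state" in
    true|false) ;;
    *) echo "usage: gpu_features_payload true|false" >&2; return 1 ;;
  esac
  printf '{"configOverrides": {"enabled_components": {"store": {"ifm": %s}}, "watsonx_enabled": %s }}\n' "$state" "$state"
}

# Apply the payload with the same oc patch invocation that this topic uses.
set_gpu_features() {
  local payload
  payload="$(gpu_features_payload "$1")" || return 1
  oc patch wa wa --type=merge -p="$payload"
}

# Example (requires a logged-in oc session):
#   set_gpu_features true
```

Generating both flags from one argument avoids a state where `ifm` and `watsonx_enabled` are set to different values by mistake.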
Supported foundation models for GPU features
5.1.2 and later
- Specialized model in watsonx Assistant
  - ibm-granite-8b-unified-api-model-v2
- Out of the Box models
  - granite-3-8b-instruct
  - llama-3-1-70b-instruct
| Model name | Requires additional model during installation | Conversational search: Query rewrite | Conversational search: Answer generation | Conversational skills: Custom actions Information gathering |
|---|---|---|---|---|
| granite-3-8b-instruct | Yes. One of the following models: | No | Yes | No |
| ibm-granite-8b-unified-api-model-v2 | Yes. One of the following models: | Yes | No | Yes |
| llama-3-1-70b-instruct | No | Yes | Yes | Yes |
System requirements
5.1.2 and later
The following table lists the recommended number of GPUs to configure on a single OpenShift worker node for the models that are provided with watsonx Assistant, at the default context window length.
Specialized model in watsonx Assistant
| Model name | Description | System requirements | Supported GPU |
|---|---|---|---|
| | Granite models are used for a wide range of generative and nongenerative tasks with appropriate prompt engineering. They employ a GPT-style decoder-only architecture, with more innovations from IBM Research and the open community. | | |
For details on Out of the Box models and their system
requirements, see Foundation models.
Enabling a specific model
5.1.2 and later
When you enable GPU features, the ibm-granite-8b-unified-api-model-v2 and granite-3-8b-instruct models are installed automatically. If you want to install the model of your choice, use the following commands.
To enable an Out of the Box model without modifying replicas or shards:
oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"ootb":{"<model-name>":{}}}}}}'
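As an illustration of filling in the `<model-name>` placeholder, the payload can be generated from a shell argument instead of being edited by hand. This is a sketch under stated assumptions: `ootb_model_payload` is a hypothetical helper name, and the character check assumes that model names contain only letters, digits, dots, and hyphens, as the names listed in this topic do.

```shell
# Sketch only: ootb_model_payload is a hypothetical helper, not a product command.

# Print the merge-patch payload that enables one Out of the Box model
# without setting replicas or shards.
ootb_model_payload() {
  local model="$1"
  # Assumption: valid names use only letters, digits, dots, and hyphens.
  case "$model" in
    ''|*[!A-Za-z0-9.-]*)
      echo "invalid model name: '$model'" >&2
      return 1
      ;;
  esac
  printf '{"configOverrides":{"ifm":{"model_config":{"ootb":{"%s":{}}}}}}\n' "$model"
}

# Example (requires a logged-in oc session):
#   oc patch wa wa --type='merge' -p="$(ootb_model_payload llama-3-1-70b-instruct)"
```

Rejecting names with quotes or spaces before building the JSON keeps a typo from producing a malformed patch.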
To enable the Specialized model without modifying replicas or shards:
oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"syom":{"ibm-granite-8b-unified-api-model-v2":{}}}}}}'
If you want to change the model after installation, you must restart the store deployment to pick up the newly installed models.
oc rollout restart deployment wa-store
Disabling the Out of the Box model
5.1.2 and later
To disable the Out of the Box model, do the following steps:
- Set the name of the model that you want to remove in MODEL_NAME.
  export MODEL_NAME="<model-name>"
- Remove the entry from the watsonx Assistant custom resource.
  oc patch wa wa --type json --patch "[{ \"op\": \"remove\", \"path\": \"/configOverrides/ifm/model_config/ootb/$MODEL_NAME\" }]"
- Remove the entry from the watsonx.ai™ IFM custom resource.
  oc get watsonxaiifm watsonxaiifm-cr -o json | jq ".spec.install_model_list -= [\"${MODEL_NAME}\"]" | oc apply -f -
- Remove the InferenceService resource for the model.
  oc delete isvc "${MODEL_NAME}"
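The removal steps above can be collected into one function so that they always run in the same order for the same model name. The following is a minimal sketch, not a supported script: `disable_ootb_model` and the `DRY_RUN` variable are hypothetical names introduced here, and the commands are the ones from the steps above. With `DRY_RUN=1`, the function only prints what it would run.

```shell
# Sketch only: disable_ootb_model and DRY_RUN are hypothetical names.
disable_ootb_model() {
  local model="$1"
  if [ -z "$model" ]; then
    echo "usage: disable_ootb_model <model-name>" >&2
    return 1
  fi

  # JSON patch for the watsonx Assistant custom resource, with inner quotes escaped.
  local patch="[{ \"op\": \"remove\", \"path\": \"/configOverrides/ifm/model_config/ootb/${model}\" }]"
  # jq filter that drops the model from the install list of the IFM custom resource.
  local filter=".spec.install_model_list -= [\"${model}\"]"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the three commands instead of running them.
    printf '%s\n' \
      "oc patch wa wa --type json --patch '${patch}'" \
      "oc get watsonxaiifm watsonxaiifm-cr -o json | jq '${filter}' | oc apply -f -" \
      "oc delete isvc ${model}"
    return 0
  fi

  # Chain with && so that a failed step stops the sequence. A failure inside
  # the pipeline is detected only through the exit status of the final oc apply.
  oc patch wa wa --type json --patch "$patch" &&
    oc get watsonxaiifm watsonxaiifm-cr -o json | jq "$filter" | oc apply -f - &&
    oc delete isvc "$model"
}

# Example: preview the commands for one model.
#   DRY_RUN=1 disable_ootb_model granite-3-8b-instruct
```

Reviewing the dry-run output before running the function for real is a cheap way to confirm that the model name expands into the paths you expect.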
Disabling the specialized model
5.1.2 and later
To disable the specialized model, use the following command:
oc patch wa wa --type json --patch '[{ "op": "remove", "path": "/configOverrides/ifm/model_config/syom" }]'
Adjusting Replicas for a model
5.1.2 and later
You can start extra model replicas to handle increased load.
To enable and adjust replicas for the Specialized model, use the following command:
oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"syom":{"ibm-granite-8b-unified-api-model-v2":{"replicas": <replica-value>}}}}}}'
To enable and adjust replicas for the Out of the Box model, use the following command:
oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"ootb":{"<model-name>":{"replicas": <replica-value>}}}}}}'
Adjusting Shards for Out of the Box model
5.1.2 and later
To enable and adjust shards for the Out of the Box model, use the following command:
oc patch wa wa --type='merge' -p='{"configOverrides":{"ifm":{"model_config":{"ootb":{"<model-name>":{"shards": <shard-value>}}}}}}'
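The replica and shard commands in this topic share one payload shape and differ only in the model group (`ootb` or `syom`), the model name, and the field being set. The sketch below builds that payload from arguments. It is illustrative only: `model_scale_payload` is a hypothetical helper name, the requirement that the value be a positive integer is an assumption, and the sketch rejects shards for `syom` only because this topic documents shards for Out of the Box models alone.

```shell
# Sketch only: model_scale_payload is a hypothetical helper, not a product command.
# Usage: model_scale_payload <ootb|syom> <model-name> <replicas|shards> <value>
model_scale_payload() {
  local group="$1" model="$2" field="$3" value="$4"

  case "$group" in
    ootb|syom) ;;
    *) echo "group must be ootb or syom" >&2; return 1 ;;
  esac
  case "$field" in
    replicas|shards) ;;
    *) echo "field must be replicas or shards" >&2; return 1 ;;
  esac
  # This topic shows the shards setting only for Out of the Box models.
  if [ "$group" = "syom" ] && [ "$field" = "shards" ]; then
    echo "shards is shown only for ootb models" >&2
    return 1
  fi
  # Assumption: the value must be a positive integer without a leading zero.
  case "$value" in
    ''|*[!0-9]*|0*) echo "value must be a positive integer" >&2; return 1 ;;
  esac

  printf '{"configOverrides":{"ifm":{"model_config":{"%s":{"%s":{"%s": %s}}}}}}\n' \
    "$group" "$model" "$field" "$value"
}

# Example (requires a logged-in oc session):
#   oc patch wa wa --type='merge' -p="$(model_scale_payload ootb llama-3-1-70b-instruct shards 4)"
```

Checking the value before patching helps catch an unreplaced `<replica-value>` or `<shard-value>` placeholder, which would otherwise produce invalid JSON.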