Properties and parameters for custom foundation models
You can set and adjust the parameters of your custom foundation model to define its behavior.
Service: The required watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.
Deploying custom foundation models is available starting with Cloud Pak for Data 4.8.4.
Model parameters
You must enter the following details when you register your custom foundation model:
Field | Type | Required or optional | Description |
---|---|---|---|
model_id | String | Required | Specify the ID of the custom foundation model. |
location | Object | Required | Specify the location of the custom foundation model. See Location properties. |
tags | String | Optional | Provide additional metadata about the model. |
parameters | Object | Optional | Specify the parameters of the model. See Global parameters for custom foundation models. |
Location properties
You can use the following parameters to describe the location of your deployed custom foundation model:
Location | Type | Required or optional | Description |
---|---|---|---|
pvc_name | String | Required | Use this parameter to specify the Persistent Volume Claim (PVC) where your custom foundation model is stored. |
sub_path | String | Optional | Use this parameter to specify the subpath of the model within the PVC. |
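The required and optional fields in the two tables above can be checked before a model entry is submitted. The following is a minimal Python sketch, not part of the product: the function name `check_model_entry` is hypothetical, and treating an empty string as missing is an assumption rather than documented behavior.

```python
def check_model_entry(entry):
    """Return a list of problems found in one custom foundation model entry.

    Hypothetical helper: it only mirrors the required/optional columns of the
    tables above and does not call any watsonx.ai API.
    """
    problems = []

    # model_id is a required string (rejecting "" is an assumption).
    model_id = entry.get("model_id")
    if not isinstance(model_id, str) or not model_id:
        problems.append("model_id is required and must be a non-empty string")

    # location is a required object that must name the PVC.
    location = entry.get("location")
    if not isinstance(location, dict):
        problems.append("location is required and must be an object")
    else:
        pvc_name = location.get("pvc_name")
        if not isinstance(pvc_name, str) or not pvc_name:
            problems.append("location.pvc_name is required and must be a non-empty string")
        # sub_path is optional, but must be a string when it is present.
        sub_path = location.get("sub_path")
        if sub_path is not None and not isinstance(sub_path, str):
            problems.append("location.sub_path must be a string when present")

    return problems


entry = {
    "model_id": "example_model_13b",
    "location": {"pvc_name": "example_model_pvc_13b"},
}
print(check_model_entry(entry))
```

An empty list means that the entry has every required field; each string in a non-empty list names one field to fix.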
Global parameters for custom foundation models
You can use the following global parameters to deploy your custom foundation models:
Parameter | Type | Range of values | Default value | Description |
---|---|---|---|---|
dtype | String | float16, bfloat16 | float16 | Use this parameter to specify the data type for your model. |
max_batch_size | Number | max_batch_size >= 1 | 256 | Use this parameter to specify the maximum batch size for your model. |
max_concurrent_requests | Number | max_concurrent_requests >= 1 and max_concurrent_requests >= max_batch_size | 1024 | Use this parameter to specify the maximum number of concurrent requests that can be made to your model. |
max_new_tokens | Number | max_new_tokens >= 20 | 2048 | Use this parameter to specify the maximum number of tokens that can be generated by your model for an inference request. |
max_sequence_length | Number | max_sequence_length >= 20 and max_sequence_length >= max_new_tokens | 2048 | Use this parameter to specify the maximum sequence length for your model. |
For detailed parameter descriptions, see Properties for global parameters for custom foundation models.
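Several of the ranges in the table depend on another parameter, so a change to one value can invalidate another. As an illustration only, the following Python sketch applies the documented ranges and defaults to a set of overrides. The names `DEFAULTS` and `check_global_parameters` are hypothetical, and the sketch does not reflect how watsonx.ai itself performs validation.

```python
# Default values copied from the table above.
DEFAULTS = {
    "dtype": "float16",
    "max_batch_size": 256,
    "max_concurrent_requests": 1024,
    "max_new_tokens": 2048,
    "max_sequence_length": 2048,
}


def check_global_parameters(overrides=None):
    """Merge overrides onto the defaults and return a list of range violations."""
    p = {**DEFAULTS, **(overrides or {})}
    errors = []

    if p["dtype"] not in ("float16", "bfloat16"):
        errors.append("dtype must be float16 or bfloat16")
    if p["max_batch_size"] < 1:
        errors.append("max_batch_size must be >= 1")
    if p["max_concurrent_requests"] < 1:
        errors.append("max_concurrent_requests must be >= 1")
    # Cross-parameter rule: concurrency must cover at least one full batch.
    if p["max_concurrent_requests"] < p["max_batch_size"]:
        errors.append("max_concurrent_requests must be >= max_batch_size")
    if p["max_new_tokens"] < 20:
        errors.append("max_new_tokens must be >= 20")
    if p["max_sequence_length"] < 20:
        errors.append("max_sequence_length must be >= 20")
    # Cross-parameter rule: the sequence must have room for the generated tokens.
    if p["max_sequence_length"] < p["max_new_tokens"]:
        errors.append("max_sequence_length must be >= max_new_tokens")

    return errors


# Raising max_new_tokens alone breaks the max_sequence_length rule.
print(check_global_parameters({"max_new_tokens": 4096}))
```

In this sketch, raising `max_new_tokens` above the default `max_sequence_length` is reported as a violation unless `max_sequence_length` is raised as well.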
Properties for global parameters for custom foundation models
You can use the following properties for the global parameters for custom foundation models:
Property | Type | Required or optional | Description |
---|---|---|---|
name | String | Required | Use this property to specify the name of the parameter. |
default | String, number, boolean | Required | Use this property to specify the default value of the parameter. |
min | Number | Optional | Use this property to specify the minimum value of the parameter. The min value must be less than or equal to the entered value. |
max | Number | Optional | Use this property to specify the maximum value of the parameter. The max value must be greater than or equal to the entered value. |
options | String, number | Optional | Use this property to specify a list of allowed values for the parameter. The type of each option must match the type of the parameter value. The selected value must come from the options list. |
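The `min`, `max`, and `options` rules in the table can be read as a check on any value that is entered for a parameter. The following Python sketch shows one way to express that check. The function name `check_value` is hypothetical, and rejecting non-numeric values when bounds are present is an assumption made so that the comparison is well defined.

```python
def check_value(definition, value):
    """Return None if value satisfies the definition, otherwise a message.

    `definition` is a dict with the properties from the table above:
    name, default, and optionally min, max, and options.
    """
    name = definition["name"]

    # Bounds only make sense for numbers (assumption: bool is not a number here).
    has_bounds = "min" in definition or "max" in definition
    if has_bounds and (isinstance(value, bool) or not isinstance(value, (int, float))):
        return f"{name}: expected a number, got {value!r}"

    if "min" in definition and value < definition["min"]:
        return f"{name}: {value} is below min {definition['min']}"
    if "max" in definition and value > definition["max"]:
        return f"{name}: {value} is above max {definition['max']}"
    if "options" in definition and value not in definition["options"]:
        return f"{name}: {value!r} is not one of {definition['options']}"
    return None


batch = {"name": "max_batch_size", "default": 256, "min": 16, "max": 512}
dtype = {"name": "dtype", "default": "float16", "options": ["float16", "bfloat16"]}

print(check_value(batch, batch["default"]))
print(check_value(batch, 8))
print(check_value(dtype, "float32"))
```

Running the same check on the `default` value of each definition is a quick way to confirm that a definition is consistent with its own bounds.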
- If you set default model parameters at the model registration phase, you can then override them at the creation phase and during an update.
- If you don't set default parameters during the model registration phase, watsonx sets the default parameters at the creation phase. You can then override them during an update.
For example:
apiVersion: watsonxaiifm.cpd.ibm.com/v1beta1
kind: Watsonxaiifm
metadata:
  name: watsonxaiifm-cr
  ......
spec:
  ignoreForMaintenance: false
  .......
  custom_foundation_models:
  - model_id: example_model_70b
    location:
      pvc_name: example_model_pvc
    tags:
    - example_model
    - 70b
    parameters:
    - name: dtype
      default: float16
      options:
      - float16
      - bfloat16
    - name: max_batch_size
      default: 256
      min: 16
      max: 512
    - name: max_concurrent_requests
      default: 64
      min: 0
      max: 128
    - name: max_sequence_length
      default: 2048
      min: 256
      max: 8192
    - name: max_new_tokens
      default: 2048
      min: 512
      max: 4096
  - model_id: example_model_13b
    location:
      pvc_name: example_model_pvc_13b
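The override behavior described before the example (defaults set at registration, then overridden at creation and during an update) can be pictured as layers that are applied in order. The following Python sketch is an illustration under stated assumptions: the function name `effective_parameters` is hypothetical, and the assumption that a later phase always takes precedence over an earlier one is inferred from the description, not confirmed product behavior.

```python
# Platform defaults taken from the global parameters table in this topic.
PLATFORM_DEFAULTS = {
    "dtype": "float16",
    "max_batch_size": 256,
    "max_concurrent_requests": 1024,
    "max_new_tokens": 2048,
    "max_sequence_length": 2048,
}


def effective_parameters(registration=None, creation=None, update=None):
    """Layer parameter values: platform defaults, then each later phase.

    Assumption: registration < creation < update in precedence.
    A phase that sets nothing leaves the earlier values untouched.
    """
    result = dict(PLATFORM_DEFAULTS)
    for layer in (registration, creation, update):
        if layer:
            result.update(layer)
    return result


# Registration lowers max_new_tokens; a later update lowers it again.
params = effective_parameters(
    registration={"max_new_tokens": 1024},
    update={"max_new_tokens": 512},
)
print(params["max_new_tokens"], params["dtype"])
```

When no registration defaults are supplied, the sketch falls back to the platform defaults, which corresponds to the case where watsonx sets the defaults at the creation phase.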
Next steps
Creating a deployment for a custom foundation model