Properties and parameters for custom foundation models

You can set and adjust the parameters of your custom foundation model to define its behavior.

Service: The required watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.

Deploying custom foundation models is available starting with Cloud Pak for Data 4.8.4.

Model parameters

You must enter the following details when you register your custom foundation model:

| Field | Type | Required or optional | Description |
|---|---|---|---|
| model_id | String | Required | Specify the ID of the custom foundation model. |
| location | Object | Required | Specify the location of the custom foundation model. See Location properties. |
| tags | String | Optional | Provide additional metadata about the model. |
| parameters | Object | Optional | Specify the parameters of the model. See Global parameters for custom foundation models. |

Location properties

You can use the following parameters to describe the location of your deployed custom foundation model:

| Property | Type | Required or optional | Description |
|---|---|---|---|
| pvc_name | String | Required | Use this parameter to specify the Persistent Volume Claim (PVC) where your custom foundation model is stored. |
| sub_path | String | Optional | Use this parameter to specify the subpath of the model within the PVC. |
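
As a minimal sketch of how these two properties fit together, the following fragment shows a location block for one entry under custom_foundation_models. The model ID, PVC name, and subpath value are hypothetical placeholders, and the layout assumes the same custom resource structure as the registration example in this topic.

```yaml
# Fragment of the spec section of the custom resource.
# All values are hypothetical; replace them with your own.
custom_foundation_models:
- model_id: example_model        # hypothetical model ID
  location:
    pvc_name: example_model_pvc  # required: PVC that stores the model
    sub_path: models/example     # optional: hypothetical subpath within the PVC
```

If the model files sit at the root of the PVC, sub_path can be omitted because it is optional.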

Global parameters for custom foundation models

You can use the following global parameters to deploy your custom foundation models:

| Parameter | Type | Range of values | Default value | Description |
|---|---|---|---|---|
| dtype | String | float16, bfloat16 | float16 | Use this parameter to specify the data type for your model. |
| max_batch_size | Number | max_batch_size >= 1 | 256 | Use this parameter to specify the maximum batch size for your model. |
| max_concurrent_requests | Number | max_concurrent_requests >= 1 and max_concurrent_requests >= max_batch_size | 1024 | Use this parameter to specify the maximum number of concurrent requests that can be made to your model. |
| max_new_tokens | Number | max_new_tokens >= 20 | 2048 | Use this parameter to specify the maximum number of tokens that your model can generate for an inference request. |
| max_sequence_length | Number | max_sequence_length >= 20 and max_sequence_length >= max_new_tokens | 2048 | Use this parameter to specify the maximum sequence length for your model. |

Important: If a value that is set for the model is outside its specified range, the deployment fails and you cannot inference the model.

For detailed parameter descriptions, see Properties for global parameters for custom foundation models.
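
To illustrate the range rules, the following fragment is a sketch of a parameters list whose defaults satisfy every documented constraint, including the two cross-parameter rules. The specific numbers are illustrative choices, not recommendations, and the name and default properties follow the property format described in Properties for global parameters for custom foundation models.

```yaml
# Illustrative values only; each default stays within its documented range.
parameters:
- name: dtype
  default: bfloat16              # one of: float16, bfloat16
- name: max_batch_size
  default: 128                   # >= 1
- name: max_concurrent_requests
  default: 512                   # >= 1 and >= max_batch_size (128)
- name: max_new_tokens
  default: 1024                  # >= 20
- name: max_sequence_length
  default: 4096                  # >= 20 and >= max_new_tokens (1024)
```

The cross-parameter rules are the easiest ones to miss: raising max_batch_size or max_new_tokens can require raising max_concurrent_requests or max_sequence_length as well.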

Properties for global parameters for custom foundation models

You can use the following properties for the global parameters for custom foundation models:

| Property | Type | Required or optional | Description |
|---|---|---|---|
| name | String | Required | Use this property to specify the name of the parameter. |
| default | String, number, boolean | Required | Use this property to specify the default value of the parameter. |
| min | Number | Optional | Use this property to specify the minimum value of the parameter. The min value must be less than or equal to the entered value. |
| max | Number | Optional | Use this property to specify the maximum value of the parameter. The max value must be greater than or equal to the entered value. |
| options | String, number | Optional | Use this property to specify a list of allowed values for the parameter. The type of each option must match the type of the parameter value. The selected value must be one of the listed options. |

Note:
  • If you set default model parameters at the model registration phase, you can then override them at the creation phase and during an update.
  • If you don't set default parameters during the model registration phase, watsonx sets the default parameters at the creation phase. You can then override them during an update.

For example:

    apiVersion: watsonxaiifm.cpd.ibm.com/v1beta1
    kind: Watsonxaiifm
    metadata:
      name: watsonxaiifm-cr
      ......
    spec:
      ignoreForMaintenance: false
      .......
      custom_foundation_models:
      - model_id: example_model_70b
        location:
          pvc_name: example_model_pvc
        tags:
        - example_model
        - 70b
        parameters:
        - name: dtype
          default: float16
          options:
          - float16
          - bfloat16
        - name: max_batch_size
          default: 256
          min: 16
          max: 512
        - name: max_concurrent_requests
          default: 64
          min: 0
          max: 128
        - name: max_sequence_length
          default: 2048
          min: 256
          max: 8192
        - name: max_new_tokens
          default: 2048
          min: 512
          max: 4096
      - model_id: example_model_13b
        location:
          pvc_name: example_model_pvc_13b

Next steps

Creating a deployment for a custom foundation model
