Properties and parameters for custom foundation models

You can set and adjust the parameters of your custom foundation model to define its behavior.

Model parameters

You must enter the following details when you register your custom foundation model:

Field | Type | Required or optional | Description
model_id | String | Required | Specify the ID of the custom foundation model.
location | Object | Required | Specify the location of the custom foundation model. See Location properties.
tags | String | Optional | Provide additional metadata about the model.
parameters | Object | Optional | Specify the parameters of the model. See Global parameters for custom foundation models.
functions | String | Optional | Specify the functions of the model. For example: image_chat, audio_chat, embedding, or rerank. Verify the available functions in the model card first.
If the functions field is not specified, the default tasks depend on whether the model includes a chat template:
  • If the model does not include a chat template, the default task is text generation.
  • If the model includes a chat template, the default tasks are text generation and text chat.
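
The default-task rule can be sketched in Python as follows. This is only an illustration: the task identifiers text_generation and text_chat, the example values, and the choice to model functions as a list of strings are assumptions made for this sketch, not confirmed details of the registration schema. Only the field names come from the table above.

```python
from typing import List, Optional

# Hypothetical task identifiers; the exact strings used by the product may differ.
TEXT_GENERATION = "text_generation"
TEXT_CHAT = "text_chat"


def effective_functions(functions: Optional[List[str]], has_chat_template: bool) -> List[str]:
    """Return the declared functions, or the documented defaults when none are declared."""
    if functions:
        return list(functions)
    if has_chat_template:
        return [TEXT_GENERATION, TEXT_CHAT]
    return [TEXT_GENERATION]


# Example registration entry that uses the documented field names (values are made up).
registration = {
    "model_id": "example-model",
    "location": {"pvc_name": "example-pvc"},
    "tags": "example",
}

# The entry declares no functions, so the defaults for a chat-capable model apply.
print(effective_functions(registration.get("functions"), has_chat_template=True))
```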

Location properties

You can use the following parameters to describe the location of your deployed custom foundation model:

Location | Type | Required or optional | Description
pvc_name | String | Required | Use this parameter to specify the Persistent Volume Claim (PVC) where your custom foundation model is stored.
sub_path | String | Optional | Use this parameter to specify the subpath of the model within the PVC.
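
A minimal Python sketch of a location object follows. The helper name build_location and the example values are hypothetical; only the pvc_name and sub_path keys and their required or optional status come from the table above.

```python
from typing import Optional


def build_location(pvc_name: str, sub_path: Optional[str] = None) -> dict:
    """Build a location object: pvc_name is required, sub_path is optional."""
    if not pvc_name:
        raise ValueError("pvc_name is required")
    location = {"pvc_name": pvc_name}
    if sub_path:
        # Include sub_path only when the model is not at the root of the PVC.
        location["sub_path"] = sub_path
    return location


print(build_location("example-pvc", sub_path="models/example-model"))
```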

Global parameters for custom foundation models

Important:
  • Time series models do not take any parameters. Do not provide any global parameters when you are setting up or deploying a custom time series model.
  • Models that use a custom inference runtime image don't accept parameters at the deployment creation stage. You must set these parameters either when you create the runtime definition or during model registration.
  • You must set the value of each base model parameter within the range that is specified in the following table. Otherwise, your deployment might fail and inferencing will not be possible. If the default values for your model parameters result in an error, modify the model's registry in the watsonxaiifm CR.

You can use the following global parameters for your custom foundation models:

Table 1. Global parameters for all custom foundation models
Parameter | Type | Range of values | Default value | Description
max_num_seqs | Number | max_num_seqs >= 1 | 16 | Specifies the maximum number of sequences (requests) that are processed in parallel during inference. Higher values increase throughput but require more KV cache memory.
max_model_length | Number | max_model_length >= 20; max_model_length <= model_context_length; max_model_length x max_num_seqs <= available KV cache memory | 2048 | Specifies the maximum total number of tokens (input + output) per sequence. Must be within the model's context length and chosen based on the value of max_num_seqs. Both of these parameters affect KV cache memory usage.
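
The range conditions for the two parameters can be checked with a short Python sketch. It reads the range for max_model_length as three separate conditions (at least 20, at most the model's context length, and max_model_length x max_num_seqs within the KV cache capacity), which is an interpretation of the table. The function name and the kv_cache_token_budget argument (KV cache capacity expressed in tokens) are hypothetical; the actual capacity depends on your hardware and model.

```python
def check_global_parameters(max_num_seqs, max_model_length,
                            model_context_length, kv_cache_token_budget):
    """Return a list of violations; an empty list means the values fit."""
    problems = []
    if max_num_seqs < 1:
        problems.append("max_num_seqs must be >= 1")
    if max_model_length < 20:
        problems.append("max_model_length must be >= 20")
    if max_model_length > model_context_length:
        problems.append("max_model_length exceeds the model's context length")
    # Worst case: every parallel sequence grows to the maximum length.
    if max_num_seqs * max_model_length > kv_cache_token_budget:
        problems.append("max_num_seqs x max_model_length exceeds the KV cache budget")
    return problems


# With the documented defaults, the worst case is 16 x 2048 = 32768 tokens.
print(check_global_parameters(16, 2048, model_context_length=4096,
                              kv_cache_token_budget=32768))
```

Raising max_num_seqs without lowering max_model_length increases the worst-case token count proportionally, so the two values are best chosen together.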

The following optional parameters apply only to models that have a chat API and use the vLLM runtime engine.

Table 2. Global parameters that apply only to models that have a chat API
Parameter | Type | Range of values | Default value | Description
tool_call_parser | String | Name of the tool parser that matches the model | N/A | Enables automatic selection from a list of tools that are provided by the user at inference time. You can find the list of available parsers in the vLLM documentation.
chat_template | String | Name of the template file | N/A | Overrides the standard chat template that is provided with the model. For more information, see Setting up storage and uploading the model.

Starting with release 5.2.2, models that use the vLLM runtime engine have prefix caching enabled by default, which lowers token consumption and speeds up inferencing in repeated inference scenarios. If your use case is different, or if you experience issues such as high cache usage or out-of-memory (OOM) errors, add the enable_prefix_caching parameter to your model parameters and set its value to false.
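
As a rough Python sketch, turning off prefix caching amounts to adding one entry to the model parameters. The sketch assumes that the parameters are held as a list of entries with the name and default properties that are described in Table 3. That layout and the helper name disable_prefix_caching are assumptions, so check the payload format that your release expects.

```python
def disable_prefix_caching(parameters):
    """Return a copy of the parameter list with enable_prefix_caching set to false."""
    # Drop any existing entry so that the parameter appears only once.
    updated = [dict(p) for p in parameters if p.get("name") != "enable_prefix_caching"]
    updated.append({"name": "enable_prefix_caching", "default": False})
    return updated


model_parameters = [{"name": "max_num_seqs", "default": 16}]
print(disable_prefix_caching(model_parameters))
```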

Properties for global parameters for custom foundation models

You can use the following properties for the global parameters for custom foundation models:

Table 3. Properties for global parameters for custom foundation models
Property | Type | Required or optional | Description
name | String | Required | Use this property to specify the name of the parameter.
default | String, number, boolean | Required | Use this property to specify the default value of the parameter.
min | Number | Optional | Use this property to specify the minimum value of the parameter. The min value must be less than or equal to the entered value.
max | Number | Optional | Use this property to specify the maximum value of the parameter. The max value must be greater than or equal to the entered value.
options | String, number | Optional | Use this property to specify a list of allowed values for the parameter. The type of each option must match the type of the parameter value, and the selected value must be one of the listed options.
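
The rules in Table 3 can be illustrated with a small Python check. The function name check_parameter, the dictionary layout, and the example bounds (such as a max of 64) are hypothetical; the sketch only mirrors the min, max, and options rules from the table.

```python
def check_parameter(definition, value=None):
    """Check a value (or the default) against the definition's min, max, and options."""
    for required in ("name", "default"):
        if required not in definition:
            raise ValueError(f"missing required property: {required}")
    if value is None:
        value = definition["default"]
    problems = []
    # min and max apply to numeric values only; booleans are excluded on purpose.
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if is_number and "min" in definition and value < definition["min"]:
        problems.append(f"{value} is below min {definition['min']}")
    if is_number and "max" in definition and value > definition["max"]:
        problems.append(f"{value} is above max {definition['max']}")
    if "options" in definition:
        options = definition["options"]
        if any(type(option) is not type(value) for option in options):
            problems.append("options must have the same type as the value")
        elif value not in options:
            problems.append(f"{value!r} is not one of {options!r}")
    return problems


definition = {"name": "max_num_seqs", "default": 16, "min": 1, "max": 64}
print(check_parameter(definition))     # checks the default value
print(check_parameter(definition, 0))  # checks a value below min
```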
Important:
  • For models that use standard inference runtimes:
    • If you don't set default parameters during the model registration phase, the default parameters are set automatically at the deployment creation phase. You can then override them during an update.
    • If you set default model parameters at the model registration phase, you can then override them at the deployment creation phase and during an update.
    • Time series models do not take any parameters. Do not provide any parameters when you deploy a custom time series model. Any parameters that you provide have no effect.
  • Models that use a custom inference runtime image ignore parameters that are set at the deployment creation stage. You must set these parameters either when you create the runtime definition or during model registration. Also, the list of accepted parameters might differ from the list that is used by models with standard inference runtimes.
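
The override order described in these notes can be sketched in Python as a sequence of dictionary merges. The function name resolve_parameters and the flat dictionary layout are assumptions; the platform defaults are the default values from Table 1. The notes do not describe how updates behave for custom runtime images, so the sketch applies them unchanged, which is also an assumption. Time series models are outside the scope of the sketch because they take no parameters.

```python
# Default values from Table 1, applied when nothing is set at registration.
PLATFORM_DEFAULTS = {"max_num_seqs": 16, "max_model_length": 2048}


def resolve_parameters(registration=None, creation=None, update=None,
                       custom_runtime_image=False):
    """Merge parameter layers; later layers override earlier ones."""
    effective = dict(PLATFORM_DEFAULTS)
    effective.update(registration or {})
    if not custom_runtime_image:
        # Custom runtime images ignore values set at deployment creation.
        effective.update(creation or {})
    # Assumption: updates are applied as-is for both kinds of runtime.
    effective.update(update or {})
    return effective


print(resolve_parameters(registration={"max_num_seqs": 8},
                         creation={"max_num_seqs": 4}))
print(resolve_parameters(registration={"max_num_seqs": 8},
                         creation={"max_num_seqs": 4},
                         custom_runtime_image=True))
```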