Deploying custom foundation models
The "Bring Your Own Model" feature enables you to upload and deploy a custom foundation model for use with watsonx.ai inferencing capabilities.
The required watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.
Deploying custom foundation models is available starting with Cloud Pak for Data 4.8.4.
In addition to working with foundation models that are curated by IBM, you can now upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, create prompts that inference the custom models from the Prompt Lab.
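In addition to the Prompt Lab, a deployed custom model can be inferenced programmatically. The following is a minimal sketch that calls the deployment text-generation REST endpoint; the host, bearer token, deployment ID, and version date are placeholders, and details such as authentication and the endpoint path can vary by release, so check the API reference for your version.

```python
import requests

# All values below are placeholders -- substitute your cluster host, a bearer
# token obtained from the Cloud Pak for Data authorization API, and the ID of
# your custom foundation model deployment.
CPD_HOST = "https://cpd-host.example.com"
DEPLOYMENT_ID = "<deployment-id>"
TOKEN = "<bearer-token>"

response = requests.post(
    f"{CPD_HOST}/ml/v1/deployments/{DEPLOYMENT_ID}/text/generation",
    params={"version": "2024-03-14"},  # required API version date
    headers={"Authorization": f"Bearer {TOKEN}", "Content-Type": "application/json"},
    json={
        "input": "Summarize the benefits of deploying a custom foundation model:",
        "parameters": {"max_new_tokens": 200},
    },
    verify=False,  # only if your cluster uses a self-signed certificate
)
print(response.json()["results"][0]["generated_text"])
```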
Deploying a custom foundation model gives you the flexibility to implement the AI solutions that are right for your use case. The deployment process differs slightly depending on the source of your custom foundation model.
It is best to get the model directly from the model builder. One place to find new models is Hugging Face, a repository for open source foundation models used by many model builders.
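If you source a model from Hugging Face, the huggingface_hub library can download the complete file set for local inspection before you upload it. This is a minimal sketch; the model ID and target directory are illustrative only.

```python
from huggingface_hub import snapshot_download

# Download every file in the model repository to a local directory.
# "bigscience/bloom-560m" and the target path are illustrative; substitute
# the model you intend to deploy.
local_dir = snapshot_download(
    repo_id="bigscience/bloom-560m",
    local_dir="models/bloom-560m",
)
print(f"Model files downloaded to {local_dir}")
```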
Deployment overview
The process for deploying a foundation model and making it available for inferencing includes tasks that are performed by a Cloud Pak for Data administrator and tasks that are performed by a watsonx.ai user.
Admin tasks
These tasks must be completed by a Cloud Pak for Data administrator.
Watsonx.ai user tasks
These tasks can be completed by a watsonx.ai user, for example, a model ops engineer or a prompt engineer.
Requirements and usage notes for custom foundation models
Deployable custom models must meet these requirements:
- The file list for the model must contain a config.json file. See Planning to deploy a custom foundation model for steps on how to check for the file.
- The model must be compatible with the Text Generation Inference (TGI) standard and be built with a supported model architecture type. The model type is listed in the config.json file for the model (see the validation sketch after this list).
- The model must be in the safetensors format and include a tokenizer. If the model is otherwise compatible, a conversion utility provides these requirements as part of the process for preparing to upload the model.
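Before uploading, you can spot-check these requirements locally. The following sketch assumes the model files are already downloaded to a local directory; it prints the architecture from config.json and verifies that safetensors weights and tokenizer files are present. The file names checked are common Hugging Face conventions, not an exhaustive list.

```python
import json
from pathlib import Path

MODEL_DIR = Path("models/bloom-560m")  # hypothetical local model directory

# config.json must exist; its "model_type" field identifies the architecture,
# which must be one of the types supported by watsonx.ai.
config = json.loads((MODEL_DIR / "config.json").read_text())
print("Architecture (model_type):", config.get("model_type"))

# Weights must be in safetensors format and a tokenizer must be bundled.
print("safetensors weights found:", any(MODEL_DIR.glob("*.safetensors")))
print("tokenizer files found:", (MODEL_DIR / "tokenizer.json").exists()
      or (MODEL_DIR / "tokenizer_config.json").exists())
```

If the weights are in the older PyTorch .bin format, they can typically be re-saved as safetensors by loading the model with the transformers library and calling save_pretrained with safe_serialization=True.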
Note these restrictions for using custom foundation models after they are deployed and registered with watsonx.ai:
- You cannot tune a custom foundation model.
- You cannot use watsonx.governance to evaluate or track a prompt template for a custom foundation model.
Next steps
- Watch this video to see how to deploy a custom foundation model.
This video provides a visual method to learn the concepts and tasks in this documentation.
Learn more
Developing generative AI solutions with foundation models (watsonx.ai)
Parent topic: Deploying foundation model assets