Deploying custom foundation models
You can upload and deploy a custom foundation model for use with watsonx.ai inferencing capabilities.
In addition to working with foundation models that are curated by IBM, you can now upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, create prompts that inference the custom models from the Prompt Lab.
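After deployment, prompts can also be sent to the model programmatically. The following sketch builds an HTTP request against the watsonx.ai text-generation endpoint for a deployment; the host, deployment ID, token, and API version string are placeholders, and the exact path and parameters should be verified against the API reference for your installation.

```python
# Sketch: build an inference request for a deployed custom foundation
# model. Endpoint path follows the watsonx.ai deployments text-generation
# API; all concrete values below are placeholders, not documented defaults.
import json
import urllib.request


def generation_request(base_url: str, deployment_id: str, token: str,
                       prompt: str, version: str = "2024-03-14") -> urllib.request.Request:
    """Return a POST request that asks the deployment to generate text."""
    url = (f"{base_url}/ml/v1/deployments/{deployment_id}"
           f"/text/generation?version={version}")
    body = {"input": prompt, "parameters": {"max_new_tokens": 100}}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` (or any HTTP client) returns the generated text in the response body.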
Deploying a custom foundation model provides the flexibility for you to implement the AI solutions that are right for your use case. The deployment process differs slightly depending on the source of your custom foundation model.
It is best to get the model directly from the model builder. One place to find new models is Hugging Face, a repository for open source foundation models used by many model builders.
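For models hosted on Hugging Face, the download step can be scripted. The sketch below uses the `huggingface_hub` library's `snapshot_download` function; the repository ID and target directory are placeholders, and the helper that flattens the repo ID into a directory name is an illustrative convention, not part of any documented process.

```python
# Sketch: fetch a custom foundation model's files from Hugging Face so
# that an administrator can later upload them to cluster storage.
# Requires: pip install huggingface_hub (assumed, not from the docs).
from pathlib import Path


def local_model_dir(base: str, repo_id: str) -> Path:
    """Map a Hugging Face repo ID (org/name) to a flat local directory."""
    return Path(base) / repo_id.replace("/", "--")


def download_model(repo_id: str, base: str = "/tmp/models") -> Path:
    """Download every file in the model repository to a local directory."""
    from huggingface_hub import snapshot_download

    target = local_model_dir(base, repo_id)
    snapshot_download(repo_id=repo_id, local_dir=str(target))
    return target


if __name__ == "__main__":
    # Placeholder repo ID for illustration only.
    download_model("example-org/example-model")
```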
Process for deploying a custom foundation model
Before you can deploy your custom foundation model, a system administrator must prepare the model and upload it to persistent volume claim (PVC) storage. After the model is stored, the administrator must register it with watsonx.ai.
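The registration step can be pictured as a configuration entry that tells watsonx.ai where the uploaded model lives. The YAML below is an illustrative sketch only: the field names (`model_id`, `pvc_name`, `sub_path`) and values are assumptions, and the exact schema is defined in the IBM Software Hub documentation for your installation.

```yaml
# Illustrative registration entry for a custom foundation model.
# Field names and values are assumptions; consult the IBM Software Hub
# documentation for the exact schema.
custom_foundation_models:
  - model_id: example-model-13b        # identifier used when prompting
    location:
      pvc_name: example-model-pvc      # PVC to which the admin uploaded the model
      sub_path: example-model          # directory inside the PVC
    tags:
      - example-model
      - 13b
```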
For details about adding custom foundation models:
- watsonx.ai full service installation
  - See Deploying custom foundation models in the IBM Software Hub documentation.
- watsonx.ai lightweight engine installation
  - See Adding custom foundation models to watsonx.ai lightweight engine in the IBM Software Hub documentation.
Before you deploy your custom foundation model, consider the type of cluster that you are deploying to: the requirements differ depending on whether the cluster is GPU-enabled or MIG-enabled. After deployment, you can manage the deployment to update its details, scale the number of copies, and more. The deployed model can then be used for prompting.
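The deploy and scale steps above can be sketched as the REST payloads they produce. The payload shape below loosely follows the watsonx.ai deployments API, but the field names (`hardware_request`, `serving_name`, and so on), sizes, and IDs are assumptions for illustration and should be checked against the API reference for your release.

```python
# Sketch of illustrative payloads for deploying a registered custom
# foundation model and for scaling the number of copies. All field
# names and values are assumptions, not documented defaults.


def deployment_payload(name: str, model_asset_id: str, space_id: str,
                       num_nodes: int = 1) -> dict:
    """Build an online-deployment request body for a model asset."""
    return {
        "name": name,
        "space_id": space_id,
        "asset": {"id": model_asset_id},
        "online": {
            "parameters": {
                # URL-friendly name used to address the deployment
                "serving_name": name.lower().replace(" ", "-"),
            }
        },
        "hardware_request": {"num_nodes": num_nodes},
    }


def scale_patch(num_nodes: int) -> list:
    """JSON Patch document that changes the number of deployed copies."""
    return [{"op": "replace",
             "path": "/hardware_request/num_nodes",
             "value": num_nodes}]
```

Sending the first payload creates the deployment; sending the patch to the same deployment later adjusts how many copies serve traffic.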