Foundation model deployment methods in watsonx.ai
You can choose from a collection of third-party and IBM foundation models to inference in IBM watsonx.ai. Find foundation models that best suit the needs of your generative AI application and your budget.
You can host foundation models in watsonx.ai in various ways. Based on how they are hosted, foundation models are categorized as:
- Provided foundation models
- Custom foundation models
- Prompt-tuned foundation models
- Fine-tuned foundation models
Deployment methods comparison
To help you choose the deployment method that best fits your use case, review the comparison table.
| Deployment type | Deployment mechanism | Deprecation policy |
|---|---|---|
| Provided foundation models | Curated and provided by IBM | Deprecated according to published lifecycle. See Foundation model lifecycle. |
| Custom foundation models | Curated and deployed by you | Not deprecated |
| Prompt-tuned foundation models | Curated and provided by IBM; tuned by you | Deprecated when the underlying model is deprecated unless you add the underlying model as a custom foundation model |
| Fine-tuned foundation models | Tuned and deployed by you | Not deprecated |
Provided foundation models that are ready to use
You can choose from a collection of third-party and IBM foundation models that are curated and deployed by IBM in watsonx.ai. You can prompt these foundation models in the Prompt Lab or programmatically. All IBM foundation models in watsonx.ai are indemnified by IBM.
For information about the GPU requirements for the supported foundation models, see Foundation models in the IBM Software Hub documentation.
To start inferencing a provided foundation model, complete these steps:
1. Open the Prompt Lab.
2. From the Model field, select View all foundation models.
3. Click a foundation model tile, and then click Select model.
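You can also inference a provided foundation model programmatically through the watsonx.ai text generation REST endpoint. The sketch below only assembles the request body; the model ID is one example of a provided model, and the project ID is a placeholder you must supply from your own environment.

```python
def build_generation_payload(model_id: str, project_id: str, prompt: str,
                             max_new_tokens: int = 200) -> dict:
    """Assemble the JSON body for a POST to the /ml/v1/text/generation endpoint."""
    return {
        "model_id": model_id,
        "project_id": project_id,
        "input": prompt,
        "parameters": {
            "decoding_method": "greedy",       # deterministic decoding
            "max_new_tokens": max_new_tokens,  # cap on generated tokens
        },
    }

payload = build_generation_payload(
    model_id="ibm/granite-13b-instruct-v2",  # an example provided model
    project_id="<your-project-id>",          # placeholder
    prompt="Summarize the meeting notes in two sentences.",
)
```

Send this payload as JSON, with your bearer token, to the text generation endpoint of your watsonx.ai instance; see the watsonx.ai API reference for the full parameter schema.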
Custom foundation models
In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After the custom models are deployed and registered with watsonx.ai, you can create prompts that inference the custom models from the Prompt Lab or the watsonx.ai API.
To learn more about how to upload, register, and deploy a custom foundation model, see Deploying a custom foundation model.
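At a high level, deploying a custom foundation model means storing the model as an asset and then creating an online deployment that references it. The sketch below shows one plausible shape for such a deployment request; the field names, IDs, and the hardware specification name are illustrative assumptions, so confirm the exact schema in the watsonx.ai API reference before use.

```python
def build_custom_deployment_request(model_asset_id: str, space_id: str,
                                    name: str) -> dict:
    """Sketch of an online-deployment request body for a custom model asset."""
    return {
        "name": name,
        "space_id": space_id,               # deployment space that holds the asset
        "asset": {"id": model_asset_id},    # the uploaded custom model asset
        "online": {"parameters": {"serving_name": name}},
        # Placeholder hardware spec; the required GPU size depends on the model.
        "hardware_spec": {"name": "WX-S"},
    }

request = build_custom_deployment_request(
    model_asset_id="<model-asset-id>",  # placeholder
    space_id="<space-id>",              # placeholder
    name="my-custom-llm",
)
```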
Prompt-tuned foundation models
A subset of the provided foundation models can be customized for your needs by prompt tuning the model from the watsonx.ai API or Tuning Studio. A prompt-tuned foundation model relies on its underlying deployed foundation model, which can be deprecated according to the foundation model lifecycle.
You can customize a subset of the provided foundation models by prompt tuning them in watsonx.ai. For the list of tunable models, see Tuning Studio.
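Programmatically, prompt tuning is started by submitting a training request that names the base model and the tuning settings. The sketch below is a rough illustration of such a request body; the field names, task ID, and parameter values are assumptions for illustration only, so verify them against the watsonx.ai API reference.

```python
def build_prompt_tuning_request(name: str, base_model_id: str,
                                task_id: str, data_href: str) -> dict:
    """Sketch of a prompt-tuning training request body."""
    return {
        "name": name,
        "prompt_tuning": {
            "base_model": {"model_id": base_model_id},
            "task_id": task_id,            # e.g. a classification task
            "num_virtual_tokens": 100,     # size of the learned soft prompt
        },
        "training_data_references": [
            {"type": "connection_asset", "location": {"href": data_href}},
        ],
    }

tuning_request = build_prompt_tuning_request(
    name="my-prompt-tuning",
    base_model_id="<base-model-id>",       # placeholder
    task_id="classification",
    data_href="<training-data-href>",      # placeholder
)
```

Note that because the tuned artifact is only a soft prompt, the request references the base model rather than producing an independent copy of it, which is why a prompt-tuned model is deprecated along with its underlying model.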
Fine-tuned foundation models
A subset of the provided foundation models can be customized for your needs by fine tuning the model from the watsonx.ai API or Tuning Studio. When you fine tune a foundation model, you generate and deploy a new model.
You can customize the following foundation models by fine tuning them in watsonx.ai:
- allam-1-13b-instruct
- granite-3b-code-instruct
- granite-8b-code-instruct
- granite-20b-code-instruct
- llama-2-13b-chat
- llama-3-1-8b-instruct
For more information, see Tuning Studio.
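Like prompt tuning, a fine-tuning job can be submitted programmatically as a training request. The sketch below shows one plausible request body using a model from the list above; the field names and hyperparameter values are illustrative assumptions, not the definitive schema, so check the watsonx.ai API reference for the exact format.

```python
def build_fine_tuning_request(name: str, base_model_id: str,
                              data_path: str) -> dict:
    """Sketch of a fine-tuning training request body."""
    return {
        "name": name,
        "fine_tuning": {
            "base_model": {"model_id": base_model_id},
            "num_epochs": 3,          # illustrative hyperparameters
            "learning_rate": 1e-5,
        },
        "training_data_references": [
            {"type": "container", "location": {"path": data_path}},
        ],
    }

ft_request = build_fine_tuning_request(
    name="my-code-tuning",
    base_model_id="ibm/granite-3b-code-instruct",  # one of the tunable models
    data_path="tuning_data/train.jsonl",           # placeholder path
)
```

Because fine tuning produces and deploys a new model that you own, the result is not subject to the provided-model deprecation lifecycle, as noted in the comparison table above.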
Learn more
For the complete list of models you can work with in watsonx.ai, see Supported foundation models.