Foundation model deployment methods in watsonx.ai

You can choose from a collection of third-party and IBM foundation models to inference in IBM watsonx.ai. Find foundation models that best suit the needs of your generative AI application and your budget.

You can host foundation models in watsonx.ai in various ways.

If you want to deploy foundation models in your own data center, you can purchase watsonx.ai software. For more information, see Overview of IBM watsonx.ai and IBM watsonx.governance software.

Based on how foundation models are hosted in watsonx.ai, they are categorized as:

  • Provided foundation models that are ready to use
  • Deploy on demand foundation models
  • Custom foundation models
  • Fine-tuned foundation models

Deployment methods comparison

To help you choose the deployment method that best fits your use case, review the comparison table.

Table 1. Differences between foundation model deployment methods
| Deployment type | Available from | Deployment mechanism | Hosting environment | Billing method | Deprecation policy |
| --- | --- | --- | --- | --- | --- |
| Foundation models provided with watsonx.ai | Resource hub > Pay per token, Prompt Lab | Curated and deployed by IBM | Multitenant hardware | By tokens used | Deprecated according to the published lifecycle. See Foundation model lifecycle. |
| Deploy on demand foundation models | Resource hub > Pay by the hour, Prompt Lab | Curated and deployed by IBM at your request | Dedicated hardware | By hour deployed | Deprecated according to the published lifecycle. See Foundation model lifecycle. |
| Custom foundation models | Prompt Lab | Curated and deployed by you | Dedicated hardware | By hour deployed | Not deprecated |
| Fine-tuned foundation models | Tuning Studio | Tuned and deployed by you | Multitenant or dedicated hardware | Training is billed by CUH; inferencing is billed by tokens used | Deprecated when the underlying model is deprecated, unless you add the underlying model as a custom foundation model |

For details on how model pricing is calculated and monitored, see Billing details for generative AI assets.

Provided foundation models that are ready to use

IBM deploys a collection of third-party and IBM foundation models on multitenant hardware in watsonx.ai. You can prompt these foundation models in the Prompt Lab or programmatically. You pay based on the number of tokens that are used.

To start inferencing a provided foundation model, complete these steps:

  1. From the main menu, select Resource hub.
  2. Click View all in the Pay per token section.
  3. Click a foundation model tile, and then click Open in Prompt Lab.
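
Alternatively, you can prompt a provided foundation model programmatically. The following is a minimal sketch that uses the ibm-watsonx-ai Python SDK; the endpoint URL, API key, project ID, and model ID are placeholder values that you replace with your own, and the generation parameters are only examples.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials: substitute your own region URL and API key.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_API_KEY",
)

# Inference a provided (pay-per-token) foundation model by its model ID.
model = ModelInference(
    model_id="ibm/granite-3-8b-instruct",  # example model ID; use any provided model
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
    params={"decoding_method": "greedy", "max_new_tokens": 200},
)

print(model.generate_text(prompt="Summarize the benefits of multitenant hosting."))
```

Because the model is hosted on multitenant hardware, you are billed only for the tokens that the request consumes.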

Deploy on demand foundation models

A deploy on demand model is an instance of an IBM-curated foundation model that you deploy for the exclusive use of your organization. Only colleagues who are granted access to the deployment can inference the foundation model. A dedicated deployment means faster and more responsive interactions without rate limits.

To work with a deploy on demand foundation model, complete these steps:

  1. From the main menu, select Resource hub.
  2. Click View all in the Pay by the hour section.
  3. Click a foundation model tile, and then click Deploy.

For more information, see Deploying foundation models on-demand.
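
After the deployment finishes, you inference the model through its deployment ID rather than a shared model ID. The sketch below assumes the ibm-watsonx-ai Python SDK and that the deployment lives in a deployment space; the URL, API key, deployment ID, and space ID are placeholders.

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # placeholder region URL
    api_key="YOUR_API_KEY",
)

# Point at the dedicated deployment that you created from the Resource hub.
model = ModelInference(
    deployment_id="YOUR_DEPLOYMENT_ID",   # ID of your deploy on demand instance
    credentials=credentials,
    space_id="YOUR_DEPLOYMENT_SPACE_ID",  # deployment space that holds the instance
    params={"max_new_tokens": 200},
)

print(model.generate_text(prompt="Draft a short status update for the team."))
```

Because the hardware is dedicated, billing is by the hour that the deployment is active, not by tokens.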

Custom foundation models

In addition to working with foundation models that are curated by IBM, you can upload and deploy your own foundation models. After a custom model is registered and deployed in watsonx.ai, you can create prompts that inference it from the Prompt Lab or the watsonx.ai API.

The instance of the custom foundation model that you deploy is dedicated for your use. A dedicated deployment means faster and more responsive interactions. You pay for hosting the foundation model by the hour.

To learn more about how to upload, register, and deploy a custom foundation model, see Deploying a custom foundation model.
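
If you prefer to script the deployment step, the following sketch shows one way to create an online deployment for a custom model asset with the ibm-watsonx-ai Python SDK. It assumes that the model was already uploaded to a deployment space as a model asset; the asset ID, space ID, and hardware specification name are placeholders, and your model may require additional metadata properties.

```python
from ibm_watsonx_ai import APIClient, Credentials

credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",  # placeholder region URL
    api_key="YOUR_API_KEY",
)

# Connect to the deployment space where the custom model asset is registered.
client = APIClient(credentials, space_id="YOUR_DEPLOYMENT_SPACE_ID")

# Minimal deployment metadata; the hardware specification name is a placeholder,
# and additional properties may be required for your model.
metadata = {
    client.deployments.ConfigurationMetaNames.NAME: "my-custom-model-deployment",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "YOUR_HARDWARE_SPEC"},
}

# "YOUR_MODEL_ASSET_ID" is the ID of the custom foundation model asset that you uploaded.
deployment = client.deployments.create("YOUR_MODEL_ASSET_ID", meta_props=metadata)
print(client.deployments.get_id(deployment))
```

Once the deployment is running, you can select it in the Prompt Lab or inference it by deployment ID, as in the deploy on demand sketch earlier.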

Fine-tuned foundation models

Fine-tuned foundation models are a subset of the provided foundation models that you customize for your needs by fine tuning them from the watsonx.ai API or in Tuning Studio. When you fine tune a foundation model, you generate and deploy a new model.

For details about which foundation models you can fine tune, see Choosing a foundation model to tune.

Learn more

For the complete list of models you can work with in watsonx.ai, see Supported foundation models.