Models
Featured foundation models
Getting started
A collection of open source and IBM foundation models is deployed in IBM watsonx.ai, and you can interact with the deployed models programmatically. Click on a provider to learn more about its supported models. The tables in the following sections give details on each model, including model ID, token limits, and pricing.
Overview
A collection of open source and IBM foundation models is deployed in IBM watsonx.ai. You can prompt the deployed foundation models programmatically.
To understand how the model provider, instruction tuning, token limits, and other factors can affect which model to choose, see Choosing a foundation model.
Listing available models programmatically
You can use the List the available foundation models method to return all the foundation models, including Tech Preview models.
Replace {token} and {watsonx_ai_url} with values from your account.
```bash
curl -X GET \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  "{watsonx_ai_url}/ml/v1/foundation_model_specs?version=2024-05-31&tech_preview=true"
```
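If you only want to skim the model IDs, you can pipe the response through jq. This is a minimal sketch that assumes the response lists models in a top-level resources array with a model_id field on each entry; check the API reference for the exact response schema.

```bash
# Print only the model IDs from the listing response.
# Assumes a top-level "resources" array with "model_id" fields.
curl -s -X GET \
  -H "Authorization: Bearer {token}" \
  -H "Accept: application/json" \
  "{watsonx_ai_url}/ml/v1/foundation_model_specs?version=2024-05-31&tech_preview=true" \
  | jq -r '.resources[].model_id'
```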
After you find the model you want to use, copy the Model ID.
Supporting your use cases
To understand how models can address your use case, including information on model modalities, supported languages, tuning, and indemnification, see our product documentation on choosing a model.
Some IBM foundation models are also available from Hugging Face. License terms for IBM models that you access from Hugging Face are available from the Hugging Face website. For more information about contractual protections related to IBM indemnification for IBM foundation models that you access in watsonx.ai, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.
Supported API functionality by model
Model | Model ID | Text Generation | Chat Completion | Tool Calling | Vision |
---|---|---|---|---|---|
granite-3-3-8b-instruct | ibm/granite-3-3-8b-instruct | ✅ | ✅ | ✅ | |
granite-13b-instruct-v2 | ibm/granite-13b-instruct-v2 | ✅ | ✅ | ✅ | |
granite-8b-japanese (Deprecated) | ibm/granite-8b-japanese | ✅ | | | |
granite-3-8b-base | ibm/granite-3-8b-base | ✅ | | | |
granite-3-2b-instruct | ibm/granite-3-2b-instruct | ✅ | ✅ | ✅ | |
granite-3-8b-instruct | ibm/granite-3-8b-instruct | ✅ | ✅ | ✅ | |
granite-3-2-8b-instruct | ibm/granite-3-2-8b-instruct | ✅ | ✅ | ✅ | |
granite-guardian-3-2b | ibm/granite-guardian-3-2b | ✅ | ✅ | ✅ | |
granite-guardian-3-8b | ibm/granite-guardian-3-8b | ✅ | ✅ | ✅ | |
granite-3b-code-instruct (Deprecated) | ibm/granite-3b-code-instruct | ✅ | | | |
granite-8b-code-instruct | ibm/granite-8b-code-instruct | ✅ | | | |
granite-20b-code-instruct (Deprecated) | ibm/granite-20b-code-instruct | ✅ | | | |
granite-34b-code-instruct (Deprecated) | ibm/granite-34b-code-instruct | ✅ | | | |
granite-vision-3-2-2b | ibm/granite-vision-3-2-2b | ✅ | | | ✅ |
flan-t5-xl-3b | google/flan-t5-xl | ✅ | ✅ | | |
flan-t5-xxl-11b | google/flan-t5-xxl | ✅ | ✅ | | |
flan-ul2-20b | google/flan-ul2 | ✅ | ✅ | | |
llama-4-maverick-17b-128e-instruct-fp8 | meta-llama/llama-4-maverick-17b-128e-instruct-fp8 | ✅ | ✅ | ✅ | |
llama-4-scout-17b-16e-instruct (Beta) | meta-llama/llama-4-scout-17b-16e-instruct | ✅ | ✅ | ✅ | |
llama-3-3-70b-instruct | meta-llama/llama-3-3-70b-instruct | ✅ | ✅ | ✅ | |
llama-3-2-1b-instruct | meta-llama/llama-3-2-1b-instruct | ✅ | ✅ | | |
llama-3-2-3b-instruct | meta-llama/llama-3-2-3b-instruct | ✅ | ✅ | | |
llama-3-2-11b-vision-instruct | meta-llama/llama-3-2-11b-vision-instruct | ✅ | ✅ | ✅ | ✅ |
llama-3-2-90b-vision-instruct | meta-llama/llama-3-2-90b-vision-instruct | ✅ | ✅ | ✅ | ✅ |
llama-guard-3-11b-vision-instruct | meta-llama/llama-guard-3-11b-vision | ✅ | ✅ | ✅ | ✅ |
llama-3-1-8b-instruct (Deprecated) | meta-llama/llama-3-1-8b-instruct | ✅ | ✅ | | |
llama-3-1-70b-instruct (Deprecated) | meta-llama/llama-3-1-70b-instruct | ✅ | ✅ | | |
llama-2-13b-chat (Deprecated) | meta-llama/llama-2-13b-chat | ✅ | ✅ | | |
mistral-small-3-1-24b-instruct-2503 | mistralai/mistral-small-3-1-24b-instruct-2503 | ✅ | ✅ | | |
mistral-large | mistralai/mistral-large | ✅ | ✅ | ✅ | |
mistral-medium-2505 | mistralai/mistral-medium-2505 | ✅ | ✅ | ✅ | |
mistral-small-24b-instruct-2501 (Deprecated) | mistralai/mistral-small-24b-instruct-2501 | ✅ | ✅ | | |
mixtral-8x7b-instruct-v01 (Deprecated) | mistralai/mixtral-8x7b-instruct-v01 | ✅ | ✅ | ✅ | |
pixtral-12b | mistralai/pixtral-12b | ✅ | ✅ | ✅ | |
elyza-japanese-llama-2-7b-instruct | elyza/elyza-japanese-llama-2-7b-instruct | ✅ | ✅ | | |
jais-13b-chat | core42/jais-13b-chat | ✅ | ✅ | | |
allam-1-13b-instruct | sdaia/allam-1-13b-instruct | ✅ | ✅ | | |
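The Model ID column is the value you pass in the request body. Models with a check mark under Chat Completion can be called through the chat endpoint; models with only a Text Generation check mark use the text generation endpoint (/ml/v1/text/generation) instead. The following is a minimal sketch of a chat request to granite-3-3-8b-instruct; it assumes the /ml/v1/text/chat path from the watsonx.ai REST API, and {project_id} is a placeholder for your own project ID.

```bash
# Minimal chat completion request (sketch). Replace {token},
# {watsonx_ai_url}, and {project_id} with values from your account.
curl -X POST \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model_id": "ibm/granite-3-3-8b-instruct",
    "project_id": "{project_id}",
    "messages": [
      {"role": "user", "content": "Summarize what a foundation model is in one sentence."}
    ]
  }' \
  "{watsonx_ai_url}/ml/v1/text/chat?version=2024-05-31"
```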
IBM foundation models
IBM Granite
Model Name | Model ID | Max Tokens (input + output) | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
granite-3-3-8b-instruct | ibm/granite-3-3-8b-instruct | 131,072 | $0.0002 | $0.0002 |
granite-13b-instruct-v2 | ibm/granite-13b-instruct-v2 | 8,192 | $0.0006 | $0.0006 |
granite-8b-japanese (Deprecated) | ibm/granite-8b-japanese | 4,096 | $0.0006 | $0.0006
granite-3-8b-base | ibm/granite-3-8b-base | 4,096 | $0.0006 | $0.0006 |
granite-3-2b-instruct | ibm/granite-3-2b-instruct | 131,072 | $0.0001 | $0.0001 |
granite-3-8b-instruct | ibm/granite-3-8b-instruct | 131,072 | $0.0002 | $0.0002 |
granite-3-2-8b-instruct | ibm/granite-3-2-8b-instruct | 131,072 | $0.0002 | $0.0002 |
granite-guardian-3-2b | ibm/granite-guardian-3-2b | 131,072 | $0.0001 | $0.0001 |
granite-guardian-3-8b | ibm/granite-guardian-3-8b | 131,072 | $0.0002 | $0.0002 |
granite-3b-code-instruct (Deprecated) | ibm/granite-3b-code-instruct | 128,000 | $0.0006 | $0.0006
granite-8b-code-instruct | ibm/granite-8b-code-instruct | 128,000 | $0.0006 | $0.0006 |
granite-20b-code-instruct (Deprecated) | ibm/granite-20b-code-instruct | 8,192 | $0.0006 | $0.0006
granite-34b-code-instruct (Deprecated) | ibm/granite-34b-code-instruct | 8,192 | $0.0006 | $0.0006
granite-vision-3-2-2b | ibm/granite-vision-3-2-2b | 131,072 | $0.0001 | $0.0001 |
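As a worked example of the pricing: a call to granite-3-8b-instruct that consumes 2,000 input tokens and produces 500 output tokens costs (2,000 / 1,000) × $0.0002 + (500 / 1,000) × $0.0002 = $0.0004 + $0.0001 = $0.0005.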
Third Party Foundation Models
SDAIA ALLaM
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
allam-1-13b-instruct | sdaia/allam-1-13b-instruct | 4,096 | $0.0018 | $0.0018 |
Code Llama
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
codellama-34b-instruct | codellama/codellama-34b-instruct-hf | 16,384 | $0.0018 | $0.0018 |
Core 42
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
jais-13b-chat | core42/jais-13b-chat | 2,048 | $0.0018 | $0.0018 |
Elyza
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
elyza-japanese-llama-2-7b-instruct | elyza/elyza-japanese-llama-2-7b-instruct | 4,096 | $0.0018 | $0.0018 |
Google Flan
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
flan-t5-xl-3b | google/flan-t5-xl | 4,096 | $0.0006 | $0.0006 |
flan-t5-xxl-11b | google/flan-t5-xxl | 4,096 | $0.0018 | $0.0018 |
flan-ul2-20b | google/flan-ul2 | 4,096 | $0.0050 | $0.0050 |
Meta Llama
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
llama-3-2-1b-instruct | meta-llama/llama-3-2-1b-instruct | 131,072 | $0.0001 | $0.0001 |
llama-3-2-3b-instruct | meta-llama/llama-3-2-3b-instruct | 131,072 | $0.00015 | $0.00015 |
llama-3-2-11b-vision-instruct | meta-llama/llama-3-2-11b-vision-instruct | 131,072 | $0.00035 | $0.00035
llama-3-2-90b-vision-instruct | meta-llama/llama-3-2-90b-vision-instruct | 131,072 | $0.0020 | $0.0020
llama-guard-3-11b-vision-instruct | meta-llama/llama-guard-3-11b-vision | 131,072 | $0.00035 | $0.00035
llama3-llava-next-8b-hf (Deprecated) | meta-llama/llama3-llava-next-8b-hf | 8,192 | $0.0006 | $0.0006 |
llama-3-1-8b-instruct (Deprecated) | meta-llama/llama-3-1-8b-instruct | 131,072 | $0.0006 | $0.0006
llama-3-1-70b-instruct (Deprecated) | meta-llama/llama-3-1-70b-instruct | 131,072 | $0.0018 | $0.0018
llama-3-405b-instruct | meta-llama/llama-3-405b-instruct | 16,384 | $0.0050 | $0.0160 |
llama-3-8b-instruct | meta-llama/llama-3-8b-instruct | 8,192 | $0.0006 | $0.0006 |
llama-3-70b-instruct | meta-llama/llama-3-70b-instruct | 8,192 | $0.0018 | $0.0018 |
llama-2-13b-chat (Deprecated) | meta-llama/llama-2-13b-chat | 4,096 | $0.0006 | $0.0006 |
llama2-13b-dpo-v7 | mncai/llama2-13b-dpo-v7 | 4,096 | $0.0018 | $0.0018
Mistral
Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
---|---|---|---|---|
mistral-small-24b-instruct-2501 (Deprecated) | mistralai/mistral-small-24b-instruct-2501 | 32,768 | $0.00035 | $0.00035
mistral-large | mistralai/mistral-large | 32,768 | $0.0100 | $0.0100 |
mixtral-8x7b-instruct-v01 (Deprecated) | mistralai/mixtral-8x7b-instruct-v01 | 32,768 | $0.0006 | $0.0006
IBM Embedding Models
IBM Slate
Model Name | Model ID | Max Input Tokens | Dimensions | Price (per 1K tokens) |
---|---|---|---|---|
slate-125m-english-rtrvr-v2 | ibm/slate-125m-english-rtrvr-v2 | 512 | 768 | $0.0001 |
slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | 512 | 768 | $0.0001 |
slate-30m-english-rtrvr-v2 | ibm/slate-30m-english-rtrvr-v2 | 512 | 384 | $0.0001 |
slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | 512 | 384 | $0.0001 |
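Embedding models are called through their own endpoint and return one vector per input string. The following is a minimal sketch, assuming the /ml/v1/text/embeddings path and an inputs array of strings per the watsonx.ai REST API; {project_id} is again a placeholder for your project ID.

```bash
# Minimal embeddings request (sketch). Returns one vector per input
# string; the slate-125m models produce 768-dimensional vectors.
curl -X POST \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model_id": "ibm/slate-125m-english-rtrvr-v2",
    "project_id": "{project_id}",
    "inputs": ["A sentence to embed.", "Another sentence to embed."]
  }' \
  "{watsonx_ai_url}/ml/v1/text/embeddings?version=2024-05-31"
```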
Third Party Embedding Models
Sentence Transformers
Model Name | Model ID | Provider | Max Input Tokens | Dimensions | Price (per 1K tokens) |
---|---|---|---|---|---|
all-minilm-l12-v2 | sentence-transformers/all-minilm-l12-v2 | Sentence Transformers | 256 | 384 | $0.0001 |
Multilingual E5
Model Name | Model ID | Provider | Max Input Tokens | Dimensions | Price (per 1K tokens) |
---|---|---|---|---|---|
multilingual-e5-large | intfloat/multilingual-e5-large | Microsoft | 512 | 1,024 | $0.0001 |
IBM Time Series Models
IBM Granite Time Series
Model Name | Model ID | Max Data Points | Input Price (per 1K pts) | Output Price (per 1K pts) |
---|---|---|---|---|
granite-ttm-512-96-r2 | ibm/granite-ttm-512-96-r2 | 608 | $0.13 | $0.38
granite-ttm-1024-96-r2 | ibm/granite-ttm-1024-96-r2 | 1,120 | $0.13 | $0.38
granite-ttm-1536-96-r2 | ibm/granite-ttm-1536-96-r2 | 1,632 | $0.13 | $0.38
Note: The max data points limit for time series models applies per combination of channel and ID for multivariate forecasting, meaning a single request can process up to (# of channels) × (# of IDs) × (max data points). For example, forecasting 2 channels across 5 IDs with granite-ttm-512-96-r2 allows up to 2 × 5 × 608 = 6,080 data points.
Third Party Re-rank Models
MS Marco
Model Name | Model ID | Max Documents | Price (per 1K tokens) |
---|---|---|---|
ms-marco-MiniLM-L-12-v2 | cross-encoder/ms-marco-MiniLM-L-12-v2 | 50 | $0.000005 |
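Re-rank models score a set of candidate documents against a query rather than generating text. The following is a minimal sketch, assuming the /ml/v1/text/rerank path and an inputs array of {"text": ...} objects per the watsonx.ai REST API; {project_id} is a placeholder for your project ID.

```bash
# Minimal re-rank request (sketch). The response scores each input
# document's relevance to the query; note the 50-document limit above.
curl -X POST \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{
    "model_id": "cross-encoder/ms-marco-MiniLM-L-12-v2",
    "project_id": "{project_id}",
    "query": "What foundation models does watsonx.ai support?",
    "inputs": [
      {"text": "watsonx.ai deploys a collection of open source and IBM models."},
      {"text": "Time series models forecast future data points."}
    ]
  }' \
  "{watsonx_ai_url}/ml/v1/text/rerank?version=2024-05-31"
```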