watsonx Developer Hub

Models

Getting started

IBM watsonx.ai hosts a collection of open source and IBM foundation models that you can call programmatically. The tables in the following sections, grouped by provider, list each model that IBM provides along with its model ID, token limit, and pricing.

Overview

A collection of open source and IBM foundation models is deployed in IBM watsonx.ai. You can prompt the deployed foundation models programmatically.

To understand how the model provider, instruction tuning, token limits, and other factors can affect which model to choose, see Choosing a foundation model.

Listing available models programmatically

You can use the "List the available foundation models" method to return all of the foundation models, including Tech Preview models.

Replace {token} and {watsonx_ai_url} with values from your account.

curl -X GET \
  -H "Authorization: Bearer {token}" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  "{watsonx_ai_url}/ml/v1/foundation_model_specs?version=2024-05-31&tech_preview=true"

After you find the model you want to use, copy the Model ID.
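
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (`build_specs_url`, `extract_model_ids`, `list_model_ids`) are hypothetical, and the response shape (a `resources` list whose entries carry a `model_id` field) is an assumption that you should check against the actual response from your account.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_specs_url(watsonx_ai_url: str, version: str = "2024-05-31",
                    tech_preview: bool = True) -> str:
    """Build the same foundation_model_specs URL that the curl example uses."""
    query = urlencode({"version": version,
                       "tech_preview": str(tech_preview).lower()})
    return f"{watsonx_ai_url.rstrip('/')}/ml/v1/foundation_model_specs?{query}"


def extract_model_ids(payload: dict) -> list[str]:
    """Collect model IDs from a parsed response body.

    Assumption: models are listed under "resources", each with a "model_id".
    """
    return [spec["model_id"] for spec in payload.get("resources", [])
            if "model_id" in spec]


def list_model_ids(watsonx_ai_url: str, token: str) -> list[str]:
    """Call the endpoint and return the model IDs (needs network access)."""
    request = urllib.request.Request(
        build_specs_url(watsonx_ai_url),
        headers={"Authorization": f"Bearer {token}",
                 "Accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return extract_model_ids(json.load(response))
```

Calling `list_model_ids(watsonx_ai_url, token)` with the values from your account should return a list of strings such as `ibm/granite-3-3-8b-instruct`, which you can then copy into later requests.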

Supporting your use cases

To understand how models can address your use case, including information on model modalities, supported languages, tuning, and indemnification, see our product documentation on choosing a model.

Some IBM foundation models are also available from Hugging Face. License terms for IBM models that you access from Hugging Face are available from the Hugging Face website. For more information about contractual protections related to IBM indemnification for IBM foundation models that you access in watsonx.ai, see the IBM Client Relationship Agreement and IBM watsonx.ai service description.

Supported API functionality by model

| Model | Model ID | Text Generation | Chat Completion | Tool Calling | Vision |
|---|---|---|---|---|---|
| granite-3-3-8b-instruct | ibm/granite-3-3-8b-instruct | | | | |
| granite-13b-instruct-v2 | ibm/granite-13b-instruct-v2 | | | | |
| granite-8b-japanese (Deprecated) | ibm/granite-8b-japanese | | | | |
| granite-3-8b-base | ibm/granite-3-8b-base | | | | |
| granite-3-2b-instruct | ibm/granite-3-2b-instruct | | | | |
| granite-3-8b-instruct | ibm/granite-3-8b-instruct | | | | |
| granite-3-2-8b-instruct | ibm/granite-3-2-8b-instruct | | | | |
| granite-guardian-3-2b | ibm/granite-guardian-3-2b | | | | |
| granite-guardian-3-8b | ibm/granite-guardian-3-8b | | | | |
| granite-3b-code-instruct (Deprecated) | ibm/granite-3b-code-instruct | | | | |
| granite-8b-code-instruct | ibm/granite-8b-code-instruct | | | | |
| granite-20b-code-instruct (Deprecated) | ibm/granite-20b-code-instruct | | | | |
| granite-34b-code-instruct (Deprecated) | ibm/granite-34b-code-instruct | | | | |
| granite-vision-3-2-2b | ibm/granite-vision-3-2-2b | | | | |
| flan-t5-xl-3b | google/flan-t5-xl | | | | |
| flan-t5-xxl-11b | google/flan-t5-xxl | | | | |
| flan-ul2-20b | google/flan-ul2 | | | | |
| llama-4-maverick-17b-128e-instruct-fp8 | meta-llama/llama-4-maverick-17b-128e-instruct-fp8 | | | | |
| llama-4-scout-17b-16e-instruct (Beta) | meta-llama/llama-4-scout-17b-16e-instruct | | | | |
| llama-3-3-70b-instruct | meta-llama/llama-3-3-70b-instruct | | | | |
| llama-3-2-1b-instruct | meta-llama/llama-3-2-1b-instruct | | | | |
| llama-3-2-3b-instruct | meta-llama/llama-3-2-3b-instruct | | | | |
| llama-3-2-11b-vision-instruct | meta-llama/llama-3-2-11b-vision-instruct | | | | |
| llama-3-2-90b-vision-instruct | meta-llama/llama-3-2-90b-vision-instruct | | | | |
| llama-guard-3-11b-vision-instruct | meta-llama/llama-guard-3-11b-vision | | | | |
| llama-3-1-8b-instruct (Deprecated) | meta-llama/llama-3-1-8b-instruct | | | | |
| llama-3-1-70b-instruct (Deprecated) | meta-llama/llama-3-1-70b-instruct | | | | |
| llama-2-13b-chat (Deprecated) | meta-llama/llama-2-13b-chat | | | | |
| mistral-small-3-1-24b-instruct-2503 | mistralai/mistral-small-3-1-24b-instruct-2503 | | | | |
| mistral-large | mistralai/mistral-large | | | | |
| mistral-medium-2505 | mistralai/mistral-medium-2505 | | | | |
| mistral-small-24b-instruct-2501 (Deprecated) | mistralai/mistral-small-24b-instruct-2501 | | | | |
| mixtral-8x7b-instruct-v01 (Deprecated) | mistralai/mixtral-8x7b-instruct-v01 | | | | |
| pixtral-12b | mistralai/pixtral-12b | | | | |
| elyza-japanese-llama-2-7b-instruct | elyza/elyza-japanese-llama-2-7b-instruct | | | | |
| jais-13b-chat | core42/jais-13b-chat | | | | |
| allam-1-13b-instruct | sdaia/allam-1-13b-instruct | | | | |

IBM foundation models

IBM Granite

| Model Name | Model ID | Max Tokens (input + output) | Input Price (USD/1,000 tokens) | Output Price (USD/1,000 tokens) |
|---|---|---|---|---|
| granite-3-3-8b-instruct | ibm/granite-3-3-8b-instruct | 131,072 | $0.0002 | $0.0002 |
| granite-13b-instruct-v2 | ibm/granite-13b-instruct-v2 | 8,192 | $0.0006 | $0.0006 |
| granite-8b-japanese | ibm/granite-8b-japanese | 4,096 | $0.0006 | $0.0006 |
| granite-3-8b-base | ibm/granite-3-8b-base | 4,096 | $0.0006 | $0.0006 |
| granite-3-2b-instruct | ibm/granite-3-2b-instruct | 131,072 | $0.0001 | $0.0001 |
| granite-3-8b-instruct | ibm/granite-3-8b-instruct | 131,072 | $0.0002 | $0.0002 |
| granite-3-2-8b-instruct | ibm/granite-3-2-8b-instruct | 131,072 | $0.0002 | $0.0002 |
| granite-guardian-3-2b | ibm/granite-guardian-3-2b | 131,072 | $0.0001 | $0.0001 |
| granite-guardian-3-8b | ibm/granite-guardian-3-8b | 131,072 | $0.0002 | $0.0002 |
| granite-3b-code-instruct | ibm/granite-3b-code-instruct | 128,000 | $0.0006 | $0.0006 |
| granite-8b-code-instruct | ibm/granite-8b-code-instruct | 128,000 | $0.0006 | $0.0006 |
| granite-20b-code-instruct | ibm/granite-20b-code-instruct | 8,192 | $0.0006 | $0.0006 |
| granite-34b-code-instruct | ibm/granite-34b-code-instruct | 8,192 | $0.0006 | $0.0006 |
| granite-vision-3-2-2b | ibm/granite-vision-3-2-2b | 131,072 | $0.0001 | $0.0001 |
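
Because prices are quoted per 1,000 tokens and the limit covers input plus output, a rough cost and limit check can be sketched as follows. This is an illustration under stated assumptions: the two entries are copied from the IBM Granite table, the function names are hypothetical, and actual billing may round or meter tokens differently.

```python
from decimal import Decimal

# (input price, output price) in USD per 1,000 tokens, copied from the
# IBM Granite table. Only two models are listed here for brevity.
PRICES = {
    "ibm/granite-3-3-8b-instruct": (Decimal("0.0002"), Decimal("0.0002")),
    "ibm/granite-13b-instruct-v2": (Decimal("0.0006"), Decimal("0.0006")),
}

# Max tokens (input + output) for the same models, from the same table.
MAX_TOKENS = {
    "ibm/granite-3-3-8b-instruct": 131_072,
    "ibm/granite-13b-instruct-v2": 8_192,
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one call from the per-1,000-token prices."""
    input_price, output_price = PRICES[model_id]
    return (input_price * input_tokens + output_price * output_tokens) / 1000


def fits_limit(model_id: str, input_tokens: int, output_tokens: int) -> bool:
    """Check that input plus output stays within the model's token limit."""
    return input_tokens + output_tokens <= MAX_TOKENS[model_id]
```

For example, `estimate_cost("ibm/granite-3-3-8b-instruct", 10_000, 2_000)` evaluates to 0.0024 USD, and `fits_limit("ibm/granite-13b-instruct-v2", 8_000, 500)` is `False` because 8,500 tokens exceed the 8,192 limit. `Decimal` is used so that the small per-token prices do not pick up floating-point rounding noise.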

Third Party Foundation Models

SDAIA ALLaM

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| allam-1-13b-instruct | sdaia/allam-1-13b-instruct | 4,096 | $0.0018 | $0.0018 |

Code Llama

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| codellama-34b-instruct | codellama/codellama-34b-instruct-hf | 16,384 | $0.0018 | $0.0018 |

Core 42

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| jais-13b-chat | core42/jais-13b-chat | 2,048 | $0.0018 | $0.0018 |

Elyza

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| elyza-japanese-llama-2-7b-instruct | elyza/elyza-japanese-llama-2-7b-instruct | 4,096 | $0.0018 | $0.0018 |

Google Flan

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| flan-t5-xl-3b | google/flan-t5-xl | 4,096 | $0.0006 | $0.0006 |
| flan-t5-xxl-11b | google/flan-t5-xxl | 4,096 | $0.0018 | $0.0018 |
| flan-ul2-20b | google/flan-ul2 | 4,096 | $0.0050 | $0.0050 |

Meta Llama

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| llama-3-2-1b-instruct | meta-llama/llama-3-2-1b-instruct | 131,072 | $0.0001 | $0.0001 |
| llama-3-2-3b-instruct | meta-llama/llama-3-2-3b-instruct | 131,072 | $0.00015 | $0.00015 |
| llama-3-2-11B-vision-instruct | meta-llama/llama-3-2-11b-vision-instruct | 131,072 | $0.00035 | $0.00035 |
| llama-3-2-90B-vision-instruct | meta-llama/llama-3-2-90b-vision-instruct | 131,072 | $0.0020 | $0.0020 |
| llama-guard-3-11B-vision-instruct | meta-llama/llama-guard-3-11b-vision | 131,072 | $0.00035 | $0.00035 |
| llama3-llava-next-8b-hf (Deprecated) | meta-llama/llama3-llava-next-8b-hf | 8,192 | $0.0006 | $0.0006 |
| llama-3-1-8b-instruct | meta-llama/llama-3-1-8b-instruct | 131,072 | $0.0006 | $0.0006 |
| llama-3-1-70b-instruct | meta-llama/llama-3-1-70b-instruct | 131,072 | $0.0018 | $0.0018 |
| llama-3-405b-instruct | meta-llama/llama-3-405b-instruct | 16,384 | $0.0050 | $0.0160 |
| llama-3-8b-instruct | meta-llama/llama-3-8b-instruct | 8,192 | $0.0006 | $0.0006 |
| llama-3-70b-instruct | meta-llama/llama-3-70b-instruct | 8,192 | $0.0018 | $0.0018 |
| llama-2-13b-chat (Deprecated) | meta-llama/llama-2-13b-chat | 4,096 | $0.0006 | $0.0006 |
| llama2-13b-dpo-v7 | mnci/llama2-13b-dpo-v7 | 4,096 | $0.0018 | $0.0018 |

Mistral

| Model Name | Model ID | Max Tokens | Input Price (per 1K tokens) | Output Price (per 1K tokens) |
|---|---|---|---|---|
| mistral-small | mistralai/mistral-small-24b-instruct-2501 | 32,768 | $0.00035 | $0.00035 |
| mistral-large | mistralai/mistral-large | 32,768 | $0.0100 | $0.0100 |
| mixtral-8x7b-instruct-v01 | mistralai/mixtral-8x7b-instruct-v01 | 32,768 | $0.0006 | $0.0006 |

IBM Embedding Models

IBM Slate

| Model Name | Model ID | Max Input Tokens | Dimensions | Price (per 1K tokens) |
|---|---|---|---|---|
| slate-125m-english-rtrvr-v2 | ibm/slate-125m-english-rtrvr-v2 | 512 | 768 | $0.0001 |
| slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | 512 | 768 | $0.0001 |
| slate-30m-english-rtrvr-v2 | ibm/slate-30m-english-rtrvr-v2 | 512 | 384 | $0.0001 |
| slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | 512 | 384 | $0.0001 |

Third Party Embedding Models

Sentence Transformers

| Model Name | Model ID | Provider | Max Input Tokens | Dimensions | Price (per 1K tokens) |
|---|---|---|---|---|---|
| all-minilm-l12-v2 | sentence-transformers/all-minilm-l12-v2 | Sentence Transformers | 256 | 384 | $0.0001 |

Multilingual E5

| Model Name | Model ID | Provider | Max Input Tokens | Dimensions | Price (per 1K tokens) |
|---|---|---|---|---|---|
| multilingual-e5-large | intfloat/multilingual-e5-large | Microsoft | 512 | 1,024 | $0.0001 |

IBM Time Series Models

IBM Granite Time Series

| Model Name | Model ID | Max Data Points | Input Price (per 1K data points) | Output Price (per 1K data points) |
|---|---|---|---|---|
| granite-ttm-512-96-r2 | ibm/granite-ttm-512-96-r2 | 608 | $0.13 | $0.38 |
| granite-ttm-1024-96-r2 | ibm/granite-ttm-1024-96-r2 | 1120 | $0.13 | $0.38 |
| granite-ttm-1536-96-r2 | ibm/granite-ttm-1536-96-r2 | 1536 | $0.13 | $0.38 |

Note: For multivariate forecasting, the max data points limit for time series models applies to each combination of a channel and an ID. In total, the model can therefore process up to (number of channels) × (number of IDs) × (max data points) data points.
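
The arithmetic in the note can be sketched as follows. This assumes the reading given in the note (the limit applies separately to every channel and ID pair); the function name `total_capacity` is hypothetical, and the limits are copied from the IBM Granite Time Series table.

```python
# Max data points per (channel, ID) combination, from the time series table.
MAX_DATA_POINTS = {
    "ibm/granite-ttm-512-96-r2": 608,
    "ibm/granite-ttm-1024-96-r2": 1120,
    "ibm/granite-ttm-1536-96-r2": 1536,
}


def total_capacity(model_id: str, channels: int, ids: int) -> int:
    """Upper bound on data points in one request.

    Assumption: the per-model limit applies to each (channel, ID) pair,
    so the total is channels x IDs x max data points.
    """
    return channels * ids * MAX_DATA_POINTS[model_id]


# Example: 3 channels and 2 IDs with the smallest model.
capacity = total_capacity("ibm/granite-ttm-512-96-r2", channels=3, ids=2)
```

With 3 channels and 2 IDs there are 6 combinations, so the bound for granite-ttm-512-96-r2 is 6 × 608 = 3,648 data points.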

Third Party Re-rank Models

MS Marco

| Model Name | Model ID | Max Documents | Price (per 1K tokens) |
|---|---|---|---|
| ms-marco-MiniLM-L-12-v2 | cross-encoder/ms-marco-MiniLM-L-12-v2 | 50 | $0.000005 |