Foundation Models - IBM watsonx.ai

Choose the model you need

Select the IBM® Granite®, open-source or third-party model best suited for your business and deploy it on-premises or in the cloud.

IBM's POV on AI models

Choose the right foundation model

What’s new?

New model

Granite 3.3 is now available in the watsonx® foundation model library.

Mistral Medium 3 now available in watsonx.ai®

New model feature

Meta Llama 4 Maverick and Llama 4 Scout are now available in watsonx.ai®

New model feature

New Granite 3.3 models have speech-to-text capabilities and improved language model performance

New model feature

Foundation model library

Choose the model that best fits your specific use case, budget considerations, regional interests and risk profile.

View the embedding model library

IBM models

Tailored for business, the IBM Granite family of open, performant and trusted models delivers exceptional performance at a competitive price, without compromising safety.

View the IBM model library

Learn more about Granite

Meta Llama models

Llama models are open, efficient large language models designed for versatility and strong performance across a wide range of natural language tasks.

View the Meta model library

Learn more about our partnership

Mistral AI models

Mistral models are fast, performant, open-weight language models designed for modularity and optimized for text generation, reasoning and multilingual applications.

View the Mistral model library

Other third-party model providers

There are several foundation models from other providers available on watsonx.ai.

View the model library

IBM foundation models

See how Granite models were trained (PDF)

Learn more about Granite

Explore pricing options

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| granite-4-h-small (New, Featured model) | IBM | Supports question answering (Q&A), fill-in-the-middle, function calling, multilingual dialog, summarization, classification, generation, extraction, RAG and coding tasks. | 128k |
| granite-3-3-8b-instruct | IBM | Supports reasoning and planning, Q&A, fill-in-the-middle, summarization, classification, generation, extraction, RAG and coding tasks. | 128k |
| granite-3-2-8b-instruct | IBM | Supports reasoning and planning, Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k |
| granite-vision-3-2-2b | IBM | Supports image-to-text use cases for chart, graph and infographic analysis, and context Q&A. | 16,384 |
| granite-3-2b-instruct (v3.1) | IBM | Supports Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k |
| granite-3-8b-instruct (v3.1) | IBM | Supports Q&A, summarization, classification, generation, extraction, RAG and coding tasks. | 128k |
| granite-guardian-3-8b (v3.1) | IBM | Supports detection of HAP or PII, jailbreaking, bias, violence and other harmful content. | 128k |
| granite-guardian-3-2b (v3.1) (Deprecated) | IBM | Supports detection of HAP or PII, jailbreaking, bias, violence and other harmful content. | 128k |
| granite-13b-instruct (Deprecated) | IBM | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 |
| granite-8b-code-instruct | IBM | Task-specific model for code that generates, explains and translates code from a natural language prompt. | 128k |
| granite-timeseries-ttm-r2 | IBM | Supports time-series forecasting, anomaly detection in sequential data, and predictive maintenance. | 512, 1024, 1536 |

*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

Meta models

Learn more about our partnership

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| llama-4-maverick-17b-128e-instruct-fp8 | Meta | Multimodal reasoning, long-context processing (10M tokens), code generation and analysis, multilingual operations (200 languages supported), STEM and logical reasoning. | 128k |
| llama-3-3-70b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k |
| llama-3-2-90b-vision-instruct | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k |
| llama-3-2-11b-vision-instruct | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k |
| llama-guard-3-11b-vision | Meta | Supports image filtering, HAP or PII detection and harmful content filtering. | 128k |
| llama-3-2-1b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k |
| llama-3-2-3b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k |
| llama-3-405b-instruct | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k |
| llama-2-13b-chat | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 4096 |


Mistral models

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| mistral-medium-2505 (New) | Mistral AI | Supports coding, image captioning, image-to-text transcription, function calling, data extraction and processing, context Q&A and mathematical reasoning. | 128k |
| mistral-small-3-1-24b-instruct-2503 (New) | Mistral AI | Supports image captioning, image-to-text transcription, function calling, data extraction and processing, context Q&A and object identification. | 128k |
| pixtral-12b (Deprecated) | Mistral AI | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A and object identification. | 128k |
| mistral-large-2 (Deprecated) | Mistral AI | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in French, German, Italian, Spanish and English. | 128k* |


Other third-party foundation models

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| allam-1-13b-instruct | SDAIA | Supports Q&A, summarization, classification, generation, extraction, RAG and translation in Arabic. | 4096 |
| jais-13b-chat (Arabic) | core42 | Supports Q&A, summarization, classification, generation, extraction and translation in Arabic. | 2048 |
| flan-t5-xl-3b (Deprecated) | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt-tuning. | 4096 |
| flan-t5-xxl-11b (Deprecated) | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 |
| flan-ul2-20b (Deprecated) | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 |
| gpt-oss-120b | OpenAI | Supports private on-premises or edge deployment, reasoning workflows, tool use (e.g. search, code execution), customizable chain-of-thought, structured outputs and adjustable reasoning effort. | 128k |


Embedding model library

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| granite-embedding-107m-multilingual (New) | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |
| granite-embedding-278m-multilingual | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |
| slate-125m-english-rtrvr-v2 | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |
| slate-125m-english-rtrvr | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |
| slate-30m-english-rtrvr-v2 | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |
| slate-30m-english-rtrvr | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |


Third-party embedding models

| Model name | Provider | Use cases | Context length |
| --- | --- | --- | --- |
| all-mini-l6-v2 (New) | Microsoft | Retrieval augmented generation, semantic search and document comparison tasks. | 256 |
| multilingual-e5-large | Intel | Retrieval augmented generation, semantic search and document comparison tasks. | 512 |


Resources

How to choose the right AI foundation model

View the full Granite cookbook

Generative AI and ML for the enterprise

Hugging Face and IBM working together in open source

Client stories

What happens when you train a powerful AI model with your own unique data? Better customer experiences and faster value with AI. Explore these stories and see how.

Wimbledon

Wimbledon used watsonx.ai foundation models to train its AI to create tennis commentary.

Read the case study

The Recording Academy

The Recording Academy used AI Stories with IBM watsonx to generate and scale editorial content around GRAMMY nominees.

Read the announcement

The Masters

The Masters uses watsonx.ai to bring AI-powered hole insights combined with expert opinions to digital platforms.

Read the announcement

AddAI.Life

AddAI.Life uses watsonx.ai to access selected open-source large language models to build higher quality virtual assistants.

Read the case study

Take the next step

Start operationalizing and scaling generative AI and machine learning for business by exploring our free trial or booking a live demo.

More ways to explore

Connect with the IBM Community

Read SaaS documentation

Read software documentation

Find support

Intellectual property protection for IBM-developed watsonx.ai models

IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. The IBM watsonx AI portfolio has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplication, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.

During the data training process, we work to prevent misalignments in the model outputs and use supervised fine-tuning to enable better instruction following so that the model can be used to complete enterprise tasks through prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.

Given the rapidly changing generative AI technology landscape, our end-to-end processes are expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to the indemnification it provides for IBM hardware and software products.

Moreover, contrary to some other providers of large language models and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer’s use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for the IBM-developed models.

The current watsonx models now under these protections include:

(1) Slate family of encoder-only models

(2) Granite family of decoder-only models

Learn more about licensing for Granite models (PDF)

Footnotes

* Context length supported by the model provider; the actual context length on the platform is limited. For more information, see the documentation.

Inference is billed in Resource Units. 1 Resource Unit is 1,000 tokens. Input and completion tokens are charged at the same rate. 1,000 tokens are generally about 750 words.
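The billing arithmetic in this footnote can be sketched in a few lines of Python. This is a minimal illustration, not IBM's billing logic: the function names and the `price_per_ru` parameter are hypothetical, any rounding applied to an actual invoice is not described on this page (so fractional Resource Units are returned as-is), and the 750-words-per-1,000-tokens ratio is only the rough rule of thumb quoted above.

```python
import math

TOKENS_PER_RESOURCE_UNIT = 1_000  # from the footnote: 1 Resource Unit = 1,000 tokens
WORDS_PER_1000_TOKENS = 750       # rough rule of thumb from the footnote


def resource_units(input_tokens: int, completion_tokens: int) -> float:
    """Return fractional Resource Units for one request.

    Input and completion tokens are charged at the same rate,
    so they are simply added together.
    """
    if input_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens + completion_tokens) / TOKENS_PER_RESOURCE_UNIT


def estimate_tokens(word_count: int) -> int:
    """Roughly estimate tokens from a word count (about 750 words per 1,000 tokens)."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    return math.ceil(word_count * 1000 / WORDS_PER_1000_TOKENS)


def estimate_cost(input_tokens: int, completion_tokens: int, price_per_ru: float) -> float:
    """Multiply Resource Units by a per-unit price.

    price_per_ru is a hypothetical placeholder; look up the real rate
    for your model and region before relying on the result.
    """
    return resource_units(input_tokens, completion_tokens) * price_per_ru
```

For example, a request with 1,500 input tokens and 500 completion tokens comes to 2 Resource Units under these assumptions, and a 750-word prompt is estimated at about 1,000 tokens.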

Not all models are available in all regions. See our documentation for details.

Context length is expressed in tokens.
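Because context length is expressed in tokens, a quick way to shortlist models is to compare a request's token budget against the values in the tables above. The sketch below is illustrative only: it copies a handful of entries from this page, treats "k" as a multiplier of 1,000 (an assumption, since some providers mean 1,024), and assumes the context window has to hold both the prompt and the generated tokens. The helper names are hypothetical, and the limit actually enforced on the platform may be lower than the listed value.

```python
# A small subset of the context lengths listed in the tables on this page.
CONTEXT_LENGTHS = {
    "granite-3-3-8b-instruct": "128k",
    "granite-vision-3-2-2b": "16,384",
    "granite-13b-instruct": "8192",
    "llama-2-13b-chat": "4096",
    "jais-13b-chat": "2048",
}


def parse_context_length(label: str) -> int:
    """Convert a table label such as '128k', '128K', '128k*' or '16,384' to tokens."""
    text = label.strip().lower().rstrip("*").replace(",", "")
    if text.endswith("k"):
        # Assumption: 'k' means 1,000 tokens, not 1,024.
        return int(text[:-1]) * 1000
    return int(text)


def models_that_fit(prompt_tokens: int, max_new_tokens: int) -> list[str]:
    """Return the models whose listed context length covers prompt plus output."""
    needed = prompt_tokens + max_new_tokens
    return sorted(
        name
        for name, label in CONTEXT_LENGTHS.items()
        if parse_context_length(label) >= needed
    )


if __name__ == "__main__":
    # A 5,000-token prompt with room for 1,000 generated tokens.
    print(models_that_fit(5000, 1000))
```

Extending the dictionary with the remaining rows, or adding a filter for deprecated models, follows the same pattern.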

The IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more details. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities are the same.
