IBM foundation models

In IBM watsonx.ai, you can use IBM foundation models that are built with integrity and designed for business.

The Granite family of IBM foundation models includes decoder-only models that can efficiently predict and generate language.

The models were built with trusted data that has the following characteristics:

Sourced from quality data sets in domains such as finance (SEC Filings), law (Free Law), technology (Stack Exchange), science (arXiv, DeepMind Mathematics), literature (Project Gutenberg (PG-19)), and more.
Compliant with rigorous IBM data clearance and governance standards.
Scrubbed of hate, abuse, and profanity, data duplication, and blocklisted URLs, among other things.

IBM is committed to building AI that is open, trusted, targeted, and empowering. For more information about contractual protections that are related to IBM indemnification, see the IBM Client Relationship Agreement. For more information about the IBM watsonx.ai service description with various cloud providers, see:

The following foundation models from IBM are available in watsonx.ai:

granite-4-h-small
granite-4-h-tiny
granite-4-h-micro
granite-3-3-2b-instruct
granite-3-3-8b-instruct
granite-3-2-8b-instruct
granite-3-1-8b-base
granite-3-1-8b-instruct
granite-3-8b-instruct
granite-3-8b-base
granite-7b-lab
granite-8b-japanese
granite-13b-chat-v2
granite-20b-multilingual
granite-3b-code-instruct
granite-8b-code-instruct
granite-20b-code-instruct
granite-20b-code-base-schema-linking
granite-20b-code-base-sql-gen
granite-34b-code-instruct
granite-guardian-3-8b
granite-ttm-512-96-r2
granite-ttm-1024-96-r2
granite-ttm-1536-96-r2
granite-vision-3-3-2b

For details about encoder models developed by IBM, see Supported encoder foundation models.

For details about third-party foundation models, see Third-party foundation models.

How to choose a model

To review factors that can help you to choose a model, such as supported tasks and languages, see Choosing a model and Foundation model benchmarks.

A deprecated foundation model is highlighted with a deprecated warning icon. For details about model deprecation and withdrawal, see Foundation model lifecycle.

Foundation model details

The foundation models in watsonx.ai support a range of use cases for both natural languages and programming languages. To see the types of tasks that these models can do, review and try the sample prompts. To view pricing details for deploy on demand foundation models, see Hourly billing rates for deploy on demand models.

Attention:

If your watsonx region is the Dallas data center on IBM Cloud, you can follow the model card links. Otherwise, search for the model name in the Resource hub. The model might not be available in all regions or cloud platforms.

Learn more

Read the following resources:

Website

Granite 4 models

The Granite 4.0 foundation models belong to the IBM Granite family of models. The granite-4-h-small, granite-4-h-tiny, and granite-4-h-micro foundation models are instruction-following models built for structured and long-context capabilities. The models use fine-tuning, reinforcement learning, and model merging to improve performance. Granite 4.0 offers better instruction handling and tool use, making it well-suited for enterprise tasks.

Usage

Designed to respond to general instructions and can be used to build AI assistants for multiple domains, including business applications. The model is capable of common generative tasks, including summarization, text classification, text extraction, question-answering, retrieval augmented generation (RAG), function-calling tasks, Fill-In-the-Middle (FIM) code, and multilingual dialog use cases.

Size

Small: 30 billion parameters
Tiny: 7 billion parameters
Micro: 3 billion parameters

Availability

Small: Provided by IBM deployed on multitenant hardware and deploy on demand for dedicated use.
Tiny: Deploy on demand for dedicated use.
Micro: Deploy on demand for dedicated use.

API pricing tier

Small: Input tier: Class 18
Output tier: Class 5
For pricing details to deploy granite-4-h-small on multitenant hardware, see Table 2.
For pricing details on dedicated use, see Hourly billing rates for deploy on demand models.

Token limits

Context window length (input + output):

granite-4-h-small: 131,072
granite-4-h-tiny: 131,072
granite-4-h-micro: 131,072

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune Granite 4.0 models beyond these languages.

Instruction tuning information

The Granite 4 models are fine-tuned from Granite-4.0-H-Small-Base by using a combination of permissively licensed open source instruction datasets and internally collected synthetic datasets.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

IBM-developed foundation models are considered part of the IBM Cloud Service. When you use an IBM-developed foundation model that is provided in watsonx.ai, the contractual protections related to IBM indemnification apply. For more information, see the IBM Client Relationship Agreement in addition to the service descriptions.

Learn more

Read the following resources:

Granite Instruct 3.3 Models

The Granite Instruct foundation models belong to the IBM Granite family of models. The granite-3-3-2b-instruct and granite-3-3-8b-instruct foundation models are Granite 3.3 Instruct foundation models. These models build on earlier iterations to improve reasoning, mathematics, coding, and instruction-following capabilities.

Usage

Designed to excel in long-context and instruction-following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more. Can be integrated into AI assistants across various domains.

Sizes

2 billion parameters
8 billion parameters

Availability

granite-3-3-2b-instruct: Deploy on demand for dedicated use.
granite-3-3-8b-instruct: Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Try it out

Answer a question with Granite 3.3

Token limits

Context window length (input + output)

2b: 131,072
8b: 131,072

Note: The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 16,384.
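Because the context window covers input plus output, the two limits above together bound how long a prompt can be. The following is a minimal sketch of that arithmetic using the figures from this section. The function name `max_prompt_tokens` is an illustrative assumption and is not part of the watsonx.ai API.

```python
CONTEXT_WINDOW = 131_072  # input + output tokens, from this section
MAX_NEW_TOKENS = 16_384   # per-request generation cap, from this section


def max_prompt_tokens(requested_new_tokens: int) -> int:
    """Return how many prompt tokens fit alongside the requested output."""
    if not 0 < requested_new_tokens <= MAX_NEW_TOKENS:
        raise ValueError(
            f"requested_new_tokens must be between 1 and {MAX_NEW_TOKENS}"
        )
    return CONTEXT_WINDOW - requested_new_tokens


print(max_prompt_tokens(16_384))  # 114688
```

Requesting fewer output tokens leaves more room for the prompt, which matters for long-document tasks.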

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. However, users may fine-tune these Granite models for languages beyond these 12 languages.

Supported programming languages

The Granite Instruct models are trained with code written in 116 programming languages.

Instruction tuning information

The Granite Instruct models are fine-tuned Granite Instruct base models trained on over 12 trillion tokens with a combination of permissively licensed open-source and proprietary instruction data.

Model architecture

Decoder

License

See the service description for watsonx.ai on AWS:

watsonx.ai AWS service description

See the service descriptions for the two services that comprise watsonx.ai on IBM Cloud:

Learn more

Read the following resources:

granite-3-2-8b-instruct

Granite 3.2 Instruct is a long-context foundation model that is fine-tuned for enhanced reasoning capabilities. The thinking capability is configurable, which means you can control when reasoning is applied.

Usage

Capable of common generative tasks, including code-related tasks, function-calling, and multilingual dialogs. Specializes in reasoning and long-context tasks such as summarizing long documents or meeting transcripts. Can respond to questions with answers that are grounded in context that is provided from long documents.

Size

8 billion parameters

Availability

Deploy on demand for dedicated use

API pricing tier

For pricing details, see Table 4.

Try it out

Sample prompt

Token limits

Context window length (input + output): 131,072

Note: The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 16,384.

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese

Instruction tuning information

Built on top of Granite-3.1-8B-Instruct, the model was trained by using a mix of permissively licensed open-source datasets and internally generated synthetic data designed for reasoning tasks.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-3-1-8b-base

The granite-3-1-8b-base foundation model is a base model that belongs to the IBM Granite 3.1 family of models. The model extends the context length of granite-3-8b-base.

Usage

The Granite 3.1 base foundation model is a pre-trained autoregressive foundation model that is intended for tuning, summarization, text classification, extraction, question-answering, and other long-context tasks.

You can use the granite-3-1-8b-base foundation model for fine-tuning.

Size

8 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Token limits

Context window length (input + output): 131,072

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. Users may fine-tune Granite 3.1 models for languages beyond these 12 languages.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

Granite Instruct 3.1 models

The Granite Instruct foundation models belong to the IBM Granite family of models. The granite-3-8b-instruct foundation model is a Granite 3.1 Instruct foundation model. The model builds on earlier iterations to provide better support for coding tasks and intrinsic functions for agents.

The granite-3-1-8b-instruct foundation model is a Granite 3.1 Instruct foundation model that is available for you to deploy on demand.

Usage

Granite Instruct foundation models are designed to excel in instruction-following tasks such as summarization, problem-solving, text translation, reasoning, code tasks, function-calling, and more.

Sizes

8 billion parameters

Availability

granite-3-8b-instruct: Provided by IBM deployed on multitenant hardware.

granite-3-1-8b-instruct: Deploy on demand for dedicated use.

The granite-3-8b-instruct foundation model is deprecated. See Foundation model lifecycle.

API pricing tier

Class 12 for the granite-3-8b-instruct multitenant model deployment. For pricing details, see Table 2a.

For pricing details of the granite-3-1-8b-instruct deploy on demand model, see Table 4.

Try it out

Experiment with samples:

Token limits

Context window length (input + output):

granite-3-8b-instruct: 131,072

The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 8,192.

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese (Simplified).

Supported programming languages

The Granite Instruct models are trained with code that is written in 116 programming languages.

Instruction tuning information

The Granite Instruct models are fine-tuned Granite Instruct base models trained on over 12 trillion tokens with a combination of permissively licensed open-source and proprietary instruction data.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-3-8b-base

The Granite 8b foundation model is a base model that belongs to the IBM Granite family of models. The model is trained on 10 trillion tokens that are sourced from diverse domains, and then further trained on 2 trillion tokens of high-quality data that was carefully chosen to enhance the model's performance on specific tasks.

Usage

The Granite 3.0 base foundation model is a baseline model that you can customize to create specialized models for specific application scenarios.

Size

8 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Token limits

Context window length (input + output): 4,096

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese (Simplified).

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-7b-lab

The granite-7b-lab foundation model is provided by IBM. The granite-7b-lab foundation model uses a novel alignment tuning method from IBM Research. Large-scale Alignment for chatBots, or LAB, is a method for adding new skills to existing foundation models by generating synthetic data for the skills. The data can then be used to tune the foundation model.

Usage

Supports general purpose tasks, including extraction, summarization, classification, and more. Follow the prompting guidelines for tips on usage. For more information, see Prompting granite-7b-lab.

Size

7 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Try it out

Sample: Generate a title for a passage

Token limits

Context window length (input + output): 8,192

Note: The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 4,096.

Supported natural languages

English

Instruction tuning information

The granite-7b-lab foundation model is trained iteratively by using the large-scale alignment for chatbots (LAB) methodology.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-8b-japanese

The granite-8b-japanese model is provided by IBM. The granite-8b-japanese foundation model is an instruct variant that is initialized from the pre-trained Granite Base 8 Billion Japanese model and is trained to understand and generate Japanese text.

Usage

Useful for general purpose tasks in the Japanese language, such as classification, extraction, question-answering, and for language translation between Japanese and English.

Size

8 billion parameters

Availability

Deploy on demand for dedicated use except in the Frankfurt data center.

API pricing tier

For pricing details, see Table 4.

Try it out

Experiment with samples:

Token limits

Context window length (input + output): 4,096

Supported natural languages

English, Japanese

Instruction tuning information

The Granite family of models is trained on enterprise-relevant datasets from five domains: internet, academic, code, legal, and finance. The granite-8b-japanese model was pretrained on 1 trillion tokens of English and 0.5 trillion tokens of Japanese text.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-13b-chat-v2

The granite-13b-chat-v2 model is provided by IBM. This model is optimized for dialog use cases and works well with virtual agent and chat applications.

Usage

Generates dialog output like a chatbot. Uses a model-specific prompt format. Includes a keyword in its output that can be used as a stop sequence to produce succinct answers. Follow the prompting guidelines for tips on usage. For more information, see Prompting granite-13b-chat-v2.
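The effect of a stop sequence can be sketched as follows. This is a client-side illustration only: `truncate_at_stop` and the `[END]` marker are hypothetical names chosen for the example. The actual keyword that granite-13b-chat-v2 emits is covered in the prompting guidelines and is not reproduced here.

```python
def truncate_at_stop(text: str, stop_sequence: str) -> str:
    """Return the text that precedes the first occurrence of stop_sequence."""
    if not stop_sequence:
        raise ValueError("stop_sequence must be a non-empty string")
    index = text.find(stop_sequence)
    if index == -1:
        return text  # stop sequence never appeared; keep the full output
    return text[:index].rstrip()


# "[END]" is a placeholder, not the model's real keyword.
raw_output = "The invoice is due on Friday. [END] Is there anything else?"
print(truncate_at_stop(raw_output, "[END]"))  # The invoice is due on Friday.
```

In practice a stop sequence is typically passed as a generation parameter so that the service stops generating at that point. The sketch only shows why doing so keeps answers succinct.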

Size

13 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Try it out

Sample prompt

Token limits

Context window length (input + output): 8,192

Supported natural languages

English

Instruction tuning information

The Granite family of models is trained on enterprise-relevant datasets from five domains: internet, academic, code, legal, and finance. Data used to train the models first undergoes IBM data governance reviews and is filtered of text that is flagged for hate, abuse, or profanity by the IBM-developed HAP filter. IBM shares information about the training methods and datasets used.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

granite-20b-multilingual

A foundation model from the IBM Granite family. The granite-20b-multilingual foundation model is based on the Granite Base 20 billion base model and is trained to understand and generate text in English, German, Spanish, French, and Portuguese.

Usage

English, German, Spanish, French, and Portuguese closed-domain question answering, summarization, generation, extraction, and classification tasks.

Size

20 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Table 4.

Try it out

Sample prompt: Translate text from French to English

Token limits

Context window length (input + output): 8,192

Supported natural languages

English, German, Spanish, French, and Portuguese

Instruction tuning information

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

Granite Code models

The Granite Code models are foundation models from the IBM Granite family. The foundation models are instruction-following models fine-tuned using a combination of Git commits paired with human instructions and open-source synthetically generated code instruction datasets.

The granite-8b-code-instruct v2.0.0 foundation model can process larger prompts with an increased context window length.

Usage

The following Granite Code foundation models are designed to respond to coding-related instructions and can be used to build coding assistants:

granite-3b-code-instruct
granite-8b-code-instruct
granite-20b-code-instruct
granite-34b-code-instruct

The following Granite Code foundation models are instruction-tuned versions of the granite-20b-code-base foundation model that are designed for text-to-SQL generation tasks.

granite-20b-code-base-schema-linking
granite-20b-code-base-sql-gen

Sizes

3 billion parameters
8 billion parameters
20 billion parameters
34 billion parameters

Availability

granite-8b-code-instruct: Provided by IBM deployed on multitenant hardware

All Granite Code models: Deploy on demand for dedicated use.

API pricing tier

Class 1 for the multitenant model deployment. For pricing details, see Table 2a.

For pricing details for the deploy on demand models, see Table 4.

Try it out

Experiment with samples:

Token limits

Context window length (input + output)

granite-3b-code-instruct: 128,000
granite-8b-code-instruct: 128,000
Note: In the multitenant environment only, the maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 8,192.
granite-20b-code-instruct: 8,192
Note: The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 4,096.
granite-20b-code-base-schema-linking: 8,192
granite-20b-code-base-sql-gen: 8,192
granite-34b-code-instruct: 8,192
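The per-model limits above can be captured in a small lookup, sketched here under two assumptions: that the 8,192 new-token cap applies to granite-8b-code-instruct in multitenant deployments and the 4,096 cap applies to granite-20b-code-instruct (inferred from where the notes appear in the list), and that models without a note have no separate generation cap. The names `new_token_cap` and `request_fits` are hypothetical helpers, not part of any IBM library.

```python
from typing import Optional

# Context window (input + output) per model, copied from the list above.
CONTEXT_WINDOW = {
    "granite-3b-code-instruct": 128_000,
    "granite-8b-code-instruct": 128_000,
    "granite-20b-code-instruct": 8_192,
    "granite-20b-code-base-schema-linking": 8_192,
    "granite-20b-code-base-sql-gen": 8_192,
    "granite-34b-code-instruct": 8_192,
}


def new_token_cap(model_id: str, multitenant: bool) -> Optional[int]:
    """Return the per-request generation cap, or None if none is listed."""
    if model_id == "granite-8b-code-instruct" and multitenant:
        return 8_192  # assumed to apply to multitenant deployments only
    if model_id == "granite-20b-code-instruct":
        return 4_096
    return None


def request_fits(model_id: str, prompt_tokens: int, new_tokens: int,
                 multitenant: bool = False) -> bool:
    """Check a request against the context window and any generation cap."""
    cap = new_token_cap(model_id, multitenant)
    if cap is not None and new_tokens > cap:
        return False
    return prompt_tokens + new_tokens <= CONTEXT_WINDOW[model_id]
```

For example, `request_fits("granite-8b-code-instruct", 100_000, 10_000)` passes for a dedicated deployment but fails with `multitenant=True`, because 10,000 exceeds the 8,192 cap.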

Supported natural languages

English

Supported programming languages

The Granite Code foundation models support 116 programming languages including Python, JavaScript, Java, C++, Go, and Rust. For the full list, see IBM foundation models.

Instruction tuning information

These models were fine-tuned from Granite Code base models on a combination of permissively licensed instruction data to enhance instruction-following capabilities including logical reasoning and problem-solving skills.

Model architecture

Decoder

License

See the service description for watsonx.ai on AWS:

watsonx.ai AWS service description

See the service descriptions for the two services that comprise watsonx.ai on IBM Cloud:

Learn more

Read the following resources:

granite-guardian-3-8b

The Granite Guardian foundation models belong to the IBM Granite family of models. The Granite Guardian foundation models are fine-tuned Granite Instruct models that are designed to detect risks in prompts and responses. The models help with risk detection along many key dimensions in the AI Risk Atlas.

The generation 3.1 versions of the models are trained on a combination of human-annotated and additional synthetic data to improve performance for risks related to hallucination and jailbreak.

Usage

Designed to detect harm‑related risks within prompt text or model response (as guardrails). The models can be used in retrieval‑augmented generation use cases to assess context relevance (whether the retrieved context is relevant to the query), groundedness (whether the response is accurate and faithful to the provided context), and answer relevance (whether the response directly addresses the user's query).
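One way to picture the three retrieval-augmented generation checks is as a record with one flag per check. The sketch below is purely illustrative: `RagAssessment` and its field names are hypothetical and do not reflect the output format of the Granite Guardian models.

```python
from dataclasses import asdict, dataclass
from typing import List


@dataclass(frozen=True)
class RagAssessment:
    """Hypothetical container for the three RAG checks named above."""
    context_relevant: bool  # retrieved context is relevant to the query
    grounded: bool          # response is faithful to the provided context
    answer_relevant: bool   # response directly addresses the user's query

    def failed_checks(self) -> List[str]:
        """Return the names of the checks that did not pass."""
        return [name for name, passed in asdict(self).items() if not passed]


assessment = RagAssessment(context_relevant=True, grounded=False, answer_relevant=True)
print(assessment.failed_checks())  # ['grounded']
```

Keeping the checks separate makes it possible to tell a retrieval problem (irrelevant context) apart from a generation problem (ungrounded or off-topic answer).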

Sizes

8 billion parameters

Availability

Provided by IBM deployed on multitenant hardware.

API pricing tier

Class 12. For details, see Table 2a.

Try it out

Experiment with samples:

Sample prompt: Classify prompts for safety with Granite

Token limits

Context window length (input + output)

granite-guardian-3-8b: 131,072

Note: The maximum new tokens, which means the tokens generated by the foundation model per request, is limited to 8,192.

Supported natural languages

English

Instruction tuning information

The Granite Guardian models are fine-tuned Granite Instruct models trained on a combination of human-annotated and synthetic data.

Model architecture

Decoder

License

See the service description for watsonx.ai on AWS:

watsonx.ai AWS service description

See the service descriptions for the two services that comprise watsonx.ai on IBM Cloud:

Learn more

Read the following resources:

Granite time series models

Granite time series foundation models belong to the IBM Granite family of models. These models are compact, pretrained models for multivariate time series forecasting from IBM Research. The following versions are available to use for data forecasting in watsonx.ai:

granite-ttm-512-96-r2
granite-ttm-1024-96-r2
granite-ttm-1536-96-r2

Usage

You can apply one of these pretrained models on your target data to get an initial forecast without having to train the model on your data. When given a set of historic, timed data observations, the Granite time series foundation models can apply their understanding of dynamic systems to forecast future data values. These models work best with data points in minute or hour intervals and generate a forecast dataset with up to 96 data points per target channel.
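Because a forecast contains at most 96 data points per target channel, the time span it covers depends on the sampling interval of the input data. The following sketch works through that arithmetic. The function name `forecast_horizon` is an illustrative assumption.

```python
from datetime import timedelta

MAX_FORECAST_POINTS = 96  # per target channel, from this section


def forecast_horizon(interval: timedelta,
                     points: int = MAX_FORECAST_POINTS) -> timedelta:
    """Return the time span covered by a forecast of `points` values."""
    if not 0 < points <= MAX_FORECAST_POINTS:
        raise ValueError(f"points must be between 1 and {MAX_FORECAST_POINTS}")
    return interval * points


print(forecast_horizon(timedelta(hours=1)))    # 4 days, 0:00:00
print(forecast_horizon(timedelta(minutes=1)))  # 1:36:00
```

With hourly data the longest forecast spans four days. With minute-level data it spans just over an hour and a half.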

Size

1 million parameters

Availability

Provided by IBM deployed on multitenant hardware.

API pricing tier

Input: Class 14
Output: Class 15

For pricing details, see Table 2b.

Try it out

See Forecast future values

Context length

Required minimum data points per channel in the API request:

granite-ttm-512-96-r2: 512
granite-ttm-1024-96-r2: 1,024
granite-ttm-1536-96-r2: 1,536
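The minimums above determine which models a given series can be sent to. The sketch below picks the model with the longest context that the available history satisfies. Preferring the longest context is an assumption made for illustration, and `largest_usable_model` is a hypothetical helper.

```python
from typing import Optional

# Required minimum data points per channel, copied from the list above.
MIN_POINTS_PER_CHANNEL = {
    "granite-ttm-512-96-r2": 512,
    "granite-ttm-1024-96-r2": 1_024,
    "granite-ttm-1536-96-r2": 1_536,
}


def largest_usable_model(points_per_channel: int) -> Optional[str]:
    """Return the longest-context model the history satisfies, or None."""
    usable = [
        (minimum, model_id)
        for model_id, minimum in MIN_POINTS_PER_CHANNEL.items()
        if points_per_channel >= minimum
    ]
    return max(usable)[1] if usable else None


print(largest_usable_model(600))   # granite-ttm-512-96-r2
print(largest_usable_model(2000))  # granite-ttm-1536-96-r2
print(largest_usable_model(100))   # None
```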

Supported natural languages

English

Instruction tuning information

The Granite time series models were trained on almost a billion samples of time series data from various domains, including electricity, traffic, manufacturing, and more.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:

Granite Vision 3.3 2b

Granite Vision 3.3 2b is a compact and efficient vision-language foundation model that is built for enterprise use cases. The granite-vision-3-3-2b model introduces experimental features such as image segmentation, doctag generation, and multi-page support. The model also offers enhanced safety compared to earlier Granite vision models.

Usage

The granite-vision-3-3-2b foundation model is designed for visual document understanding, enabling automated content extraction from tables, charts, infographics, plots, diagrams, and more.

Size

2 billion parameters

Availability

Deploy on demand for dedicated use.

API pricing tier

For pricing details, see Hourly billing rates for deploy on demand models.

Token limits

Context window length (input + output): 131,072

Supported natural languages

English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.

Instruction tuning information

The granite-vision-3-3-2b foundation model was trained on a curated instruction-following dataset, comprising diverse public datasets and synthetic datasets tailored to support a wide range of document understanding and general image tasks. The model was trained by fine-tuning the granite-3-2b-instruct foundation model with both image and text modalities.

Model architecture

Decoder

License

See the service descriptions for the two services that comprise watsonx.ai:

Learn more

Read the following resources:
