Choosing a foundation model in watsonx.ai

There are many factors to consider when you choose a foundation model to use for inferencing from a generative AI project.

For example, for a solution that summarizes call center problem reports, you might want a foundation model with these characteristics:

Scores well on benchmarks for summarization tasks
Handles large amounts of text, which means a large context window length
Can interpret images of damaged items, so accepts inputs in both text and image modalities

Determine which factors are most important for you and your organization.

Tasks the model can do
Multimodal foundation models
Languages supported
License and IP indemnity terms
Model attributes, such as size, architecture, and context window length

After you have a short list of models that best fit your needs, you can test the models to see which ones consistently return the results you want.

Foundation models that support your use case

To get started, find foundation models that can do the type of task that you want to complete.

The following table shows the types of tasks that the foundation models in IBM watsonx.ai support. A checkmark (✓) indicates that the task that is named in the column header is supported by the foundation model. For some of the tasks, you can click a link to go to a sample prompt for the task.

Table 1a. Foundation model task support
Model	Conversation from Chat API	Tool interaction from Chat API	Retrieval-augmented generation (RAG)	Samples
ibm-defense-4-0-micro	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
ibm-defense-3-3-8b-instruct	✓	✓	✓ • RAG from Prompt Lab	• Chat API
granite-4-h-tiny	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
granite-4-h-small	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
granite-docling-258M	✓		✓ RAG from Prompt Lab	• Chat with image example • Chat API
granite-3-3-8b-instruct	✓		✓ • RAG from Prompt Lab • RAG from AutoAI	• Q&A
granite-13b-instruct-v2			✓ RAG from Prompt Lab	• Generation
granite-3-2b-instruct	✓			• Code • Chat API
granite-3-8b-instruct	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Code • Chat API
granite-guardian-3-2b			✓ RAG from Prompt Lab
granite-guardian-3-8b			✓ RAG from Prompt Lab
granite-3b-code-instruct				• Code
granite-8b-code-instruct				• Code
granite-20b-code-instruct	✓			• Code • Chat API
granite-20b-code-base-schema-linking				• Code
granite-20b-code-base-sql-gen				• Code
granite-34b-code-instruct	✓			• Code • Chat API
allam-1-13b-instruct				• Classification • Translation
codestral-22b				• Code
codestral-2501				• Code
codestral-2508				• Code
flan-t5-xl-3b			✓ RAG from Prompt Lab	✓
gpt-oss-20b	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
gpt-oss-120b	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
jais-13b-chat				• Dialog
llama-4-maverick-17b-128e-instruct-fp8	✓		✓ • RAG from Prompt Lab • RAG from AutoAI	• Dialog • Chat • Chat API
llama-4-scout-17b-16e-instruct	✓		✓ • RAG from Prompt Lab • RAG from AutoAI	• Dialog • Chat • Chat API
llama-3-3-70b-instruct	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Sample chat • Chat API
llama-3-2-1b-instruct		✓	✓ RAG from Prompt Lab	• Code • Dialog
llama-3-2-3b-instruct		✓	✓ RAG from Prompt Lab	• Code • Dialog
llama-3-2-11b-vision-instruct	✓	✓	✓ RAG from Prompt Lab	• Chat with image example • Chat API
llama-3-2-90b-vision-instruct	✓	✓	✓ RAG from Prompt Lab	• Chat with image example • Chat API
llama-3-1-8b			✓ RAG from Prompt Lab	• Dialog
llama-guard-3-11b-vision	✓		✓ RAG from Prompt Lab	• Classification • Chat with image example • Chat API
llama-3-1-8b-instruct	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Dialog • Chat API
llama-3-1-70b-instruct	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Dialog • Chat API
llama-3-405b-instruct	✓	✓	✓ RAG from Prompt Lab	• Dialog
llama-2-13b-chat			✓ RAG from Prompt Lab	• Dialog
ministral-8b-instruct				• Classification • Extraction •Summarization • Translation
mistral-large	✓	✓	✓ • RAG from Prompt Lab • RAG from AutoAI	• Classification • Extraction •Summarization • Code • Translation • Chat API
mistral-large-instruct-2411			✓ RAG from Prompt Lab	• Classification • Extraction •Summarization • Code • Translation
mistral-small-instruct				• Classification • Extraction •Summarization • Code • Translation
mistral-small-3-1-24b-instruct-2503	✓		✓ • RAG from Prompt Lab • RAG from AutoAI	• Chat API
mistral-small-24b-instruct-2501			✓ RAG from Prompt Lab	• Classification • Extraction • Generation •Summarization • Code • Translation
mixtral-8x7b-instruct-v01			✓ • RAG from Prompt Lab • RAG from AutoAI	• Classification • Extraction • Generation •Summarization • Code • Translation
pixtral-12b			✓ RAG from Prompt Lab	✓ Samples: • Classification • Extraction •Summarization • Chat with image example
pixtral-large-instruct-2411				• Chat with image example

To review various prompt samples that are grouped by task type, see Sample prompts.
To determine how well a foundation model can perform certain tasks, see Foundation model benchmarks.

Multimodal foundation models

Multimodal foundation models are capable of processing and integrating information from many modalities or types of data. These modalities can include text, images, audio, video, and other forms of sensory input.

The multimodal foundation models that are available from watsonx.ai can do the following types of tasks:

Image-to-text generation: Useful for visual question answering, interpretation of charts and graphs, captioning of images, and more.
Audio-to-text generation: Useful for speech recognition, transcription of spoken content, voice command understanding, meeting note generation, accessibility support, and more.

The following table lists the available foundation models that support modalities other than text-in and text-out.

Table 1b. Supported multimodal foundation models
Model	Input modalities	Output modalities
llama-4-maverick-17b-128e-instruct-fp8	image, text	text
llama-3-2-11b-vision-instruct	image, text	text
llama-3-2-90b-vision-instruct	image, text	text
llama-guard-3-11b-vision	image, text	text
ministral-8b-instruct-2512	image, text	text
mistral-large-2512	image, text	text
mistral-small-3-2-24b-instruct-2506	image, text	text
pixtral-12b	image, text	text

Foundation models that support your language

Many foundation models work well only in English. But some model creators include multiple languages in the pretraining data sets to fine-tune their model on tasks in different languages, and to test their model's performance in multiple languages. If you plan to build a solution for a global audience or a solution that does translation tasks, look for models that were created with multilingual support in mind.

The following table lists natural languages that are supported in addition to English by foundation models in watsonx.ai. For more information about the languages that are supported for multilingual foundation models, see the model card for the foundation model.

Table 2. Foundation models that support natural languages other than English
Model	Languages other than English
Granite 4.0 (granite-4-h-small, granite-4-h-micro, granite-4-h-tiny )	German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. You can fine tune these Granite models for languages beyond these 12 languages
Granite Instruct 3.3 (granite-3-3-2b-instruct, granite-3-3-8b-instruct)	German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese. You can fine tune these Granite models for languages beyond these 12 languages.
Granite Vision (granite-vision-3-3-2b, granite-vision-3-2-2b)	German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
IBM Defense 4.0	German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, and Chinese.
allam-1-13b-instruct	Arabic
gpt-oss-120b	Multilingual
Llama 4 (llama-4-maverick-17b-128e-instruct-fp8)	Arabic, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, and Vietnamese.
Llama 3.3 (llama-3-3-70b-instruct)	German, French, Italian, Portuguese, Hindi, Spanish, and Thai
Llama 3.2 (llama-3-2-1b-instruct, llama-3-2-3b-instruct. Also llama-3-2-11b-vision-instruct, llama-3-2-90b-vision-instruct, and llama-guard-3-11b-vision with text-only inputs)	English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai
Llama 3.1 (llama-3-1-8b-instruct, llama-3-1-70b-instruct)	English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai
Ministral 3 (ministral-3b-instruct-2512, ministral-8b-instruct-2512, ministral-14b-instruct-2512)	French, Spanish, German, Italian, Portuguese, Dutch, Chinese, Japanese, Korean, Arabic, and dozens of other languages.
ministral-8b-instruct	Multilingual (See model card)
mistral-large-2512	French, German, Italian, Spanish, Chinese, Japanese, Korean, Portuguese, Dutch, Polish, and dozens of other languages.
Mistral Medium (mistral-medium-2505 , mistral-medium-2508)	Multilingual (See model card)
mistral-small-3-2-24b-instruct-2506	French, German, Greek, Hindi, Indonesian, Italian, Japanese, Korean, Malay, Nepali, Polish, Portuguese, Romanian, Russian, Serbian, Spanish, Swedish, Turkish, Ukrainian, Vietnamese, Arabic, Bengali, Chinese, Farsi.
voxtral-small-24b-2507	Spanish, French, Portuguese, Hindi, German, Dutch, Italian.

Model types and IP indemnification

Review the intellectual property indemnification policy for the foundation model that you want to use. Some third-party foundation model providers require you to exempt them from liability for any IP infringement that might result from the use of their AI models.

IBM-developed foundation models that are available from watsonx.ai have standard intellectual property protection, similar to what IBM provides for hardware and software products.

IBM extends its standard intellectual property indemnification to the output that is generated by covered models. Covered Models include IBM-developed and some third-party foundation models that are available from watsonx.ai. Third-Party Covered Models are identified in table 4.

The following table describes the different foundation model types and their indemnification policies. See the reference materials for full details.

Table 4. Indemnification policy details
Foundation model type	Indemnification policy	Foundation models	Details	Reference materials
IBM Covered Model	Uncapped IBM indemnification	• IBM Granite • IBM Slate	IBM-developed foundation models that are available from watsonx.ai.	License information
Third-Party Covered Model	Capped IBM indemnification	Mistral Commercial Models	Third-party covered models that are available from watsonx.ai.	License information
Non-IBM Product	No IBM indemnification	Various	Third-party models that are available from watsonx.ai and are subject to their respective license terms, including associated obligations and restrictions.	See model information.
Custom Model	No IBM indemnification	Various	Foundation models that you import to use in watsonx.ai are Client content.	Client is solely responsible for the selection and use of the model and output and compliance with third-party license terms, obligations, and restrictions.

For more information about third-party model license terms, see Third-party foundation models.

More considerations for choosing a model