IBM watsonx™ models are designed for the enterprise and optimized for targeted business domains and use cases. Through IBM® watsonx.ai™, our AI studio, we offer a selection of cost-effective, enterprise-grade foundation models developed by IBM, open-source models and models sourced from third-party providers, to help clients and partners scale and operationalize artificial intelligence (AI) faster with minimal risk. You can deploy the models wherever your workloads run, on premises or on hybrid cloud.
IBM takes a differentiated approach to delivering enterprise-grade foundation models:
On-premises foundation models from Mistral AI are now available on watsonx.
Granite is named a Strong Performer in The Forrester Wave™: AI Foundation Models for Language, Q2 2024.
The IBM watsonx foundation model library gives you the flexibility to choose the model that best fits your business needs, regional interests and risk profile from a selection of proprietary, open-source and third-party models.
Granite is IBM's flagship series of LLM foundation models, based on a decoder-only transformer architecture. Granite language models are trained on trusted enterprise data spanning internet, academic, code, legal and finance sources. The series currently includes the language, multilingual and code models listed below.
Use IBM-developed and open-sourced embedding models, deployed in IBM watsonx.ai, for retrieval-augmented generation, semantic search and document comparison tasks.
Trained on enterprise-relevant content, IBM Granite meets rigorous data governance, regulatory and risk criteria defined and enforced by the IBM AI Ethics code and the Chief Privacy Office.
Improved accuracy for targeted enterprise business domains such as finance, and for use cases such as RAG, achieved through chat fine-tuning and model alignment techniques.
Competitively priced models with lower infrastructure requirements, IP indemnification and an easy-to-use toolkit for model customization and application integration.
Select the generative foundation model that best fits your needs. After you have a shortlist of models for your use case, systematically test them by using prompt engineering techniques to see which ones consistently return the desired results.
| Model | Provider | Use cases | Context length (tokens) | Price* |
| --- | --- | --- | --- | --- |
| granite-20b-multilingual | IBM | Q&A, summarization, classification, generation, extraction, translation and RAG tasks in French, German, Portuguese, Spanish and English. | 8192 | 0.60 |
| granite-13b-chat | IBM | Questions and answers (Q&A), summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| granite-13b-instruct | IBM | Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| granite-34b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 8192 | 0.60 |
| granite-20b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 8192 | 0.60 |
| granite-8b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 128k | 0.60 |
| granite-3b-code-instruct | IBM | Task-specific code model that generates, explains and translates code from a natural language prompt. | 128k | 0.60 |
| granite-8b-japanese | IBM | Q&A, summarization, classification, generation, extraction, translation and RAG tasks in Japanese. | 4096 | 0.60 |
| granite-7b-lab | IBM | Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| llama-3-2-90b-vision-instruct | Meta | Image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, contextual Q&A and object identification. | 128k* | 2.00 |
| llama-3-2-11b-vision-instruct | Meta | Image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, contextual Q&A and object identification. | 128k* | 0.35 |
| llama-guard-3-11b-vision | Meta | Image filtering, HAP/PII detection and harmful-content filtering. | 128k* | 0.35 |
| llama-3-2-1b-instruct | Meta | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k* | 0.10 |
| llama-3-2-3b-instruct | Meta | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k* | 0.15 |
| llama-3-405b-instruct | Meta | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k* | Input: 5.00 / Output: 16.00 |
| llama-3-1-70b-instruct | Meta | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 1.80 |
| llama-3-1-8b-instruct | Meta | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish and Thai. | 128k | 0.60 |
| llama3-llava-next-8b-hf | llava-hf | Image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, contextual Q&A and object identification. | 8192 | 0.60 |
| llama-3-8b-instruct | Meta | Summarization, classification, generation, extraction and translation tasks. | 8192 | 0.60 |
| llama-3-70b-instruct | Meta | RAG, generation, summarization, classification, Q&A, extraction, translation and code generation tasks. | 8192 | 1.80 |
| llama2-13b-dpo-v7 (Korean) | MindsAndCompany | Q&A, summarization, classification, generation, extraction and RAG tasks in Korean. | 4096 | 1.80 |
| allam-1-13b-instruct | SDAIA | Q&A, summarization, classification, generation, extraction, RAG and translation tasks in Arabic. | 4096 | 1.80 |
| codellama-34b-instruct | Meta | Task-specific code model that generates and translates code from a natural language prompt. | 16384 | 1.80 |
| mistral-large-2 | Mistral AI | Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in French, German, Italian, Spanish and English. | 128k* | 10.00 |
| mixtral-8x7b-instruct | Mistral AI | Q&A, summarization, classification, generation, extraction, RAG and code generation tasks. | 32768* | 0.60 |
| jais-13b-chat (Arabic) | core42 | Q&A, summarization, classification, generation, extraction and translation tasks in Arabic. | 2048 | 1.80 |
| flan-t5-xl-3b | Google | Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt tuning. | 4096 | 0.60 |
| flan-t5-xxl-11b | Google | Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 1.80 |
| flan-ul2-20b | Google | Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 5.00 |
| elyza-japanese-llama-2-7b-instruct | ELYZA | Q&A, summarization, RAG, classification, generation, extraction and translation tasks. | 4096 | 1.80 |
| mt0-xxl-13b | BigScience | Q&A, summarization, classification and generation tasks. | 4096 | 1.80 |
*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.
Embedding models convert input text into embeddings, which are dense vector representations of the input text. Embeddings capture nuanced semantic and syntactic relationships between words and passages in vector space.
| Model | Provider | Use cases | Context length (tokens) | Price* |
| --- | --- | --- | --- | --- |
| slate-125m-english-rtrvr-v2 | IBM | Retrieval-augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-125m-english-rtrvr | IBM | Retrieval-augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr-v2 | IBM | Retrieval-augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr | IBM | Retrieval-augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| all-minilm-l12-v2 | OS-NLP-CV | Retrieval-augmented generation, semantic search and document comparison tasks. | 256 | 0.10 |
| multilingual-e5-large | Microsoft | Retrieval-augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.
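Retrieval-augmented generation and semantic search work by comparing embedding vectors for similarity, typically with cosine similarity. The sketch below illustrates the idea with toy vectors; real embeddings from models like the ones above are dense vectors with hundreds of dimensions, but the ranking logic is the same.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    # 1.0 means identical direction; 0.0 means orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    # Sort passage indices by similarity to the query, most similar first,
    # as a semantic-search backend would before handing passages to a RAG prompt.
    sims = [(i, cosine_similarity(query_vec, v)) for i, v in enumerate(passage_vecs)]
    return sorted(sims, key=lambda t: t[1], reverse=True)
```

Document comparison uses the same primitive: two passages whose embeddings have high cosine similarity are semantically close even if they share few exact words.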
IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. IBM watsonx AI and data platform has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplication, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.
During the data training process, we work to prevent misalignments in the model outputs and use supervised fine-tuning to enable better instruction following, so that the model can be used to complete enterprise tasks via prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.
Given the rapidly changing generative AI technology landscape, our end-to-end process is expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to what it provides for IBM hardware and software products.
Moreover, unlike some other providers of large language models, and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer's use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for IBM-developed models.
The watsonx models currently under these protections include:
(1) The Slate family of encoder-only models.
(2) The Granite family of decoder-only models.
*The model provider supports this context length, but the actual context length available on the platform is limited. For more information, see the documentation.
Inference is billed in Resource Units. 1 Resource Unit is 1,000 tokens. Input and completion tokens are charged at the same rate. 1,000 tokens are generally about 750 words.
Not all models are available in all regions, see our documentation for details.
Context length is expressed in tokens.
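The Resource Unit arithmetic above can be sketched as a small cost estimator. The per-Resource-Unit price argument is an assumption: pass in whatever indicative rate applies to your model and locale, per the pricing disclaimer.

```python
def resource_units(input_tokens: int, output_tokens: int) -> float:
    # 1 Resource Unit = 1,000 tokens; input and completion tokens are
    # charged at the same rate, so they simply sum.
    return (input_tokens + output_tokens) / 1000

def estimate_cost(input_tokens: int, output_tokens: int, price_per_ru: float) -> float:
    # price_per_ru is the indicative rate for your model and locale
    # (an assumed input; see the pricing table and its disclaimer).
    return resource_units(input_tokens, output_tokens) * price_per_ru

def approx_tokens(words: int) -> int:
    # Rule of thumb from the text: 1,000 tokens is generally about 750 words.
    return round(words * 1000 / 750)
```

For example, a prompt of 1,500 tokens plus a 500-token completion consumes 2 Resource Units, regardless of how those tokens split between input and output.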
The IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more details. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities are the same.