Foundation models in watsonx.ai 
Explore the IBM library of foundation models on the watsonx platform to scale generative AI for your business with confidence
Enterprise-grade models with the power of choice

IBM watsonx™ models are designed for the enterprise and optimized for targeted business domains and use cases. Through the IBM® watsonx.ai™ AI studio, we offer a selection of cost-effective, enterprise-grade foundation models: models developed by IBM, open-source models and models sourced from third-party providers. They help clients and partners scale and operationalize artificial intelligence (AI) faster with minimal risk. You can deploy the AI models wherever your workload is, both on-premises and on hybrid cloud.

IBM takes a differentiated approach to delivering enterprise-grade foundation models:

  • Open: Bring best-in-class IBM and proven open-source models to the watsonx foundation model library or to your own library.
  • Trusted: Train models on trusted and governed data for applications that require enterprise-level transparency, governance and performance.
  • Targeted: Designed for the enterprise and optimized for targeted business domains and use cases.
  • Empowering: Empower clients with competitively priced model choices to build AI that best suits their unique business needs and risk profiles.
Llama 3 is now available in the watsonx foundation model library.
IBM models

The IBM watsonx foundation model library lets you choose the model that best fits your business needs, regional interests and risk profiles from a collection of proprietary, open-source and third-party models.

Granite, developed by IBM Research

Granite is IBM's flagship series of LLM foundation models based on a decoder-only transformer architecture. Granite language models are trained on trusted enterprise data spanning the internet, academic, code, legal and finance domains. The Granite series described here currently includes four models.

  1. Granite 13b chat: Chat model optimized for dialogue use cases; works well with virtual agent and chat applications
  2. Granite 13b instruct: Instruct model trained on high-quality finance data to perform well in finance domain tasks
  3. Granite multilingual: Trained to understand and generate text in English, German, Spanish, French and Portuguese
  4. Granite Japanese: Designed to perform language tasks on Japanese text
IBM Embedding Models

Use IBM-developed and open-source embedding models, deployed in IBM watsonx.ai, for retrieval augmented generation, semantic search and document comparison tasks.

  • slate-125m-english-rtrvr: provided by IBM with an output dimension of 768
  • slate-30m-english-rtrvr: provided by IBM with an output dimension of 384


IBM Research report

See how Granite models were trained and data sources used

Why IBM Granite?

Trusted

Trained on enterprise-relevant content, IBM Granite meets rigorous data governance, regulatory and risk criteria defined and enforced by the IBM AI Ethics code and Chief Privacy Office.

Performant

Improved accuracy for targeted enterprise business domains such as finance and use cases like RAG, achieved through chat fine-tuning and model alignment techniques.

Cost-effective

A competitively priced model with lower infrastructure requirements, IP indemnification and an easy-to-use toolkit for model customization and application integration.

Foundation model library

Select a generative foundation model that best fits your needs. After you have a short list of models for your use case, systematically test the models by using prompt engineering techniques to see which ones consistently return the desired results.
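The systematic testing described above can be sketched as a small comparison harness. This is a minimal illustration, not a watsonx.ai API: each model is represented by a hypothetical `generate` callable (prompt in, text out) that you would replace with a real inference call, and the pass criterion (a case-insensitive substring match) is an assumption chosen for simplicity.

```python
from typing import Callable, Dict, List, Tuple

# A test case pairs a prompt with a substring the answer is expected to contain.
TestCase = Tuple[str, str]


def pass_rate(generate: Callable[[str], str], cases: List[TestCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        raise ValueError("at least one test case is required")
    passed = sum(
        1 for prompt, expected in cases
        if expected.lower() in generate(prompt).lower()
    )
    return passed / len(cases)


def compare_models(
    models: Dict[str, Callable[[str], str]],
    cases: List[TestCase],
) -> List[Tuple[str, float]]:
    """Score every shortlisted model on the same cases, best first."""
    results = [(name, pass_rate(fn, cases)) for name, fn in models.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Stub callables stand in for real model calls (hypothetical).
    stub_models = {
        "model-a": lambda prompt: "The capital of France is Paris.",
        "model-b": lambda prompt: "I am not sure.",
    }
    stub_cases = [("What is the capital of France?", "Paris")]
    for name, score in compare_models(stub_models, stub_cases):
        print(f"{name}: {score:.0%} of cases passed")
```

Running the same cases against every shortlisted model keeps the comparison fair; a production evaluation would likely use a richer scoring method than substring matching.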

See more watsonx pricing information
Model name | Provider | Use cases | Context length | Price USD/1 million tokens
granite-7b-lab (New) | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction and RAG tasks. | 8128 | 0.60
granite-13b-chat (Featured model) | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60
granite-13b-instruct (Featured model) | IBM | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60
granite-20b-multilingual (Featured model) | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in French, German, Portuguese, Spanish and English. | 8190 | 0.60
granite-8b-japanese (New) | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in Japanese. | 4096 | 0.60
llama-3-8b-instruct (New) | Meta | Supports summarization, classification, generation, extraction and translation tasks. | 8192 | 0.60
llama-3-70b-instruct (New) | Meta | Supports RAG, generation, summarization, classification, Q&A, extraction, translation and code generation tasks. | 8192 | 1.80
llama-2-70b-chat | Meta | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 1.80
llama-2-13b-chat | Meta | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt tuning. | 4096 | 0.60
llama2-13b-dpo-v7 (Korean) (New) | MindsAndCompany | Supports Q&A, summarization, classification, generation, extraction and RAG tasks in Korean. | 4096 | 1.80
codellama-34b-instruct (New) | Meta | Task-specific model for generating and translating code from a natural language prompt. | 16384 | 1.80
mixtral-8x7b-instruct (New) | Mistral AI | Supports Q&A, summarization, classification, generation, extraction, RAG and code generation tasks. | 32768 | 0.60
merlinite-7b (New) | ibm-mistralai | Supports Q&A, summarization, classification, generation, extraction, RAG and code generation tasks. | 32768 | 0.60
jais-13b-chat (Arabic) (New) | core42 | Supports Q&A, summarization, classification, generation, extraction and translation in Arabic. | 2048 | 1.80
flan-t5-xl-3b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt tuning. | 4096 | 0.60
flan-t5-xxl-11b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 1.80
flan-ul2-20b | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 5.00
elyza-japanese-llama-2-7b-instruct | ELYZA | Supports Q&A, summarization, RAG, classification, generation, extraction and translation tasks. | 4096 | 1.80
mt0-xxl-13b | BigScience | Supports Q&A, summarization, classification and generation tasks. | 4096 | 1.80
starcoder-15.5b | BigCode | Task-specific model for generating and translating code from a natural language prompt. | 8192 | 1.80
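
One way to build a short list from the library is to filter on context length and price before any hands-on testing. The sketch below is illustrative only: the `ModelEntry` type and `shortlist` function are hypothetical names, and the catalog holds just a handful of rows copied from the table above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ModelEntry:
    name: str
    provider: str
    context_length: int           # in tokens
    price_usd_per_million: float  # USD per 1 million tokens


# A few rows copied from the foundation model library table.
CATALOG = [
    ModelEntry("granite-13b-chat", "IBM", 8192, 0.60),
    ModelEntry("llama-3-70b-instruct", "Meta", 8192, 1.80),
    ModelEntry("mixtral-8x7b-instruct", "Mistral AI", 32768, 0.60),
    ModelEntry("codellama-34b-instruct", "Meta", 16384, 1.80),
    ModelEntry("flan-ul2-20b", "Google", 4096, 5.00),
]


def shortlist(
    catalog: List[ModelEntry], min_context: int, max_price: float
) -> List[ModelEntry]:
    """Keep models that meet both limits; cheapest first, then longest context."""
    matches = [
        m for m in catalog
        if m.context_length >= min_context
        and m.price_usd_per_million <= max_price
    ]
    return sorted(
        matches, key=lambda m: (m.price_usd_per_million, -m.context_length)
    )


if __name__ == "__main__":
    for model in shortlist(CATALOG, min_context=8192, max_price=1.80):
        print(model.name, model.context_length, model.price_usd_per_million)
```

Price and context length are only two of the selection criteria; supported tasks, languages and regional availability would narrow the list further.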

Embedding model library

Embedding models convert input text into embeddings, which are dense vector representations of the input text. Embeddings capture nuanced semantic and syntactic relationships between words and passages in vector space.
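
To illustrate how embeddings support semantic search, the sketch below ranks passages by cosine similarity to a query vector. The three-dimensional vectors are made-up toy values for readability; real embeddings from the slate models have 768 or 384 dimensions and would come from an embedding call that is not shown here.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_passages(
    query_vec: Sequence[float], passage_vecs: Dict[str, Sequence[float]]
) -> List[str]:
    """Return passage names ordered from most to least similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), name)
        for name, vec in passage_vecs.items()
    ]
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    # Toy vectors (assumed values, not real model output).
    query = [1.0, 0.0, 0.0]
    passages = {
        "pricing-faq": [0.9, 0.1, 0.0],
        "tennis-recap": [0.0, 1.0, 0.0],
    }
    print(rank_passages(query, passages))
```

In a retrieval augmented generation flow, the top-ranked passages would then be added to the prompt sent to a generative model.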

Model name | Provider | Use cases | Context length | Price USD/1 million tokens
slate-125m-english-rtrvr (New) | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10
slate-30m-english-rtrvr (New) | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10

Client stories

Businesses are excited about the prospect of tapping foundation models and ML in one place, with their own data, to accelerate generative AI workloads. 

Wimbledon used watsonx.ai foundation models to train its AI to create tennis commentary.
The Recording Academy® used AI Stories with IBM watsonx to generate and scale editorial content around GRAMMY® nominees.
Watsonx brings AI-powered hole insights and Spanish language AI narration to the Masters Tournament digital platforms.
AddAI.Life uses watsonx.ai to access selected open-source large language models to build higher quality virtual assistants.

Intellectual property protection for AI models

IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. The IBM watsonx AI and data platform has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplication, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.
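
The kinds of pre-training steps named above can be sketched in a few lines. This is a generic illustration and not IBM's actual pipeline: the blocklist, the term list, the word-count threshold and every function name are hypothetical, deduplication is exact-match only, and tokenization is left out.

```python
import hashlib
import re
from typing import Dict, List
from urllib.parse import urlparse

BLOCKED_DOMAINS = {"blocked.example"}  # hypothetical URL blocklist
BLOCKED_TERMS = {"badword"}            # hypothetical objectionable terms
MIN_WORDS = 5                          # hypothetical document quality threshold


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so near-identical copies match."""
    return " ".join(text.lower().split())


def is_blocked_url(url: str) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = urlparse(url).hostname or ""
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)


def passes_filters(text: str) -> bool:
    """Reject documents that are too short or contain a blocked term."""
    words = normalize(text).split()
    if len(words) < MIN_WORDS:
        return False
    return not any(w.strip(".,!?") in BLOCKED_TERMS for w in words)


def split_sentences(text: str) -> List[str]:
    """Naive sentence splitter: break after '.', '!' or '?'."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def preprocess(docs: List[Dict[str, str]]) -> List[List[str]]:
    """Apply blocklist, filters and exact deduplication, then split sentences."""
    seen = set()
    cleaned = []
    for doc in docs:
        if is_blocked_url(doc["url"]) or not passes_filters(doc["text"]):
            continue
        digest = hashlib.sha256(normalize(doc["text"]).encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact duplicate after normalization
        seen.add(digest)
        cleaned.append(split_sentences(doc["text"]))
    return cleaned
```

Each document is expected to be a dictionary with `url` and `text` keys, which is also an assumption of this sketch.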

During the data training process, we work to prevent misalignments in the model outputs and use supervised fine-tuning to enable better instruction following so that the model can be used to complete enterprise tasks via prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.

Given the rapidly changing generative AI technology landscape, our end-to-end processes are expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to that which it provides for IBM hardware and software products.

Moreover, contrary to some other providers of large language models and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer's use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for the IBM-developed models.

The current watsonx models now under these protections include:

(1) Slate family of encoder-only models.

(2) Granite family of decoder-only models.

Learn more about licensing for Granite models

Take the next step

Take the next step to start operationalizing and scaling generative AI and machine learning for business.

Footnotes

Inference is billed in Resource Units. 1 Resource Unit is 1,000 tokens. Input and completion tokens are charged at the same rate. 1,000 tokens are generally about 750 words.
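
As a worked example of this billing rule, the sketch below estimates Resource Units and cost for a single request. The function names are illustrative, and the sketch assumes fractional Resource Units are billed without rounding, which the text above does not state.

```python
TOKENS_PER_RESOURCE_UNIT = 1_000  # 1 Resource Unit is 1,000 tokens


def approx_tokens_from_words(words: int) -> int:
    """Rough estimate from the rule of thumb: 1,000 tokens is about 750 words."""
    return round(words * 1000 / 750)


def resource_units(input_tokens: int, completion_tokens: int) -> float:
    """Input and completion tokens count the same toward Resource Units."""
    return (input_tokens + completion_tokens) / TOKENS_PER_RESOURCE_UNIT


def inference_cost_usd(
    input_tokens: int, completion_tokens: int, price_per_million_usd: float
) -> float:
    """Cost at a listed price in USD per 1 million tokens (no rounding assumed)."""
    total_tokens = input_tokens + completion_tokens
    return total_tokens / 1_000_000 * price_per_million_usd


if __name__ == "__main__":
    prompt_tokens = approx_tokens_from_words(1500)  # about 2,000 tokens
    answer_tokens = 500
    print("Resource Units:", resource_units(prompt_tokens, answer_tokens))
    print("Cost (USD):", inference_cost_usd(prompt_tokens, answer_tokens, 0.60))
```

For a 1,500-word prompt with a 500-token completion at the 0.60 price tier, this comes to 2.5 Resource Units and about 0.0015 USD.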

Not all models are available in all regions; see our documentation for details.

Context length is expressed in tokens.

The IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more details. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities are the same.