Granite is IBM's suite of generative AI models that provides enterprise control and customization through accessible model weights and architectures.

Granite 3.1

Granite 3.1 language models are lightweight, state-of-the-art, open foundation models that natively support multilingual use cases, coding, reasoning, and tool calling, and can run on constrained compute resources. All models are publicly released under an Apache 2.0 license for both research and commercial use. The models' data curation and training procedure were designed for enterprise usage and customization: datasets are evaluated against governance, risk, and compliance (GRC) criteria, in addition to IBM's standard data clearance process and document quality checks.
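
As a quick orientation, the sketch below shows one way to run a Granite 3.1 instruct model with Hugging Face Transformers. The checkpoint ID follows the ibm-granite naming on Hugging Face (here granite-3.1-2b-instruct); the prompt and generation settings are illustrative, not recommendations.

    # Minimal inference sketch with Hugging Face Transformers; prompt and
    # generation settings are illustrative.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ibm-granite/granite-3.1-2b-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # half precision to reduce memory use
        device_map="auto",           # place weights on available GPU(s) or CPU
    )

    # Instruct checkpoints expect chat-formatted input; the tokenizer's chat
    # template inserts the role markers.
    messages = [{"role": "user", "content": "List three enterprise uses of RAG."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=200)
    print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))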



Parameters: 2B and 8B (dense); 1B and 3B (MoE)
Training Data: Web data, synthetic data, and publicly available datasets with permissible licenses
Input Modalities: Multilingual text
Output Modalities: Multilingual text and code
Context Length: 128K
Training Tokens: Up to 12T
Knowledge Cutoff: April 2024

Features

Lightweight

Our largest dense model has 8 billion parameters, and our smallest MoE model has an activated parameter count of just 400 million, enabling hosting, and even fine-tuning, on limited compute resources.
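
To make the limited-compute claim concrete, a parameter-efficient fine-tune is one option; the sketch below pairs the 2B dense checkpoint with LoRA adapters via the peft library. The rank, alpha, and target module names are assumptions (llama-style attention projections), not a recipe from this card; check model.named_modules() if the names differ in your version.

    # Hedged LoRA setup sketch; hyperparameters and target module names are
    # illustrative assumptions, not IBM's fine-tuning recipe.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3.1-2b-instruct")

    # Train small low-rank adapters on the attention projections instead of all
    # 2B weights, keeping optimizer state and gradient memory modest.
    lora_config = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # assumed llama-style module names
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # adapters are a small fraction of total weights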


Trustworthy Enterprise-Grade LLM

All our models are trained on license-permissible data collected in line with IBM's AI Ethics principles for trustworthy enterprise usage. We describe in detail our data sources, data processing pipeline, and data mixture search to strengthen trust in our models for mission-critical and regulated applications.


Reduced Operational Costs

Granite 3.1 runs training and inference tasks at a fraction of the cost of leading closed models.


Competitive Performance

All our models demonstrate performance on par with leading foundation models, as evaluated on multiple benchmark datasets.


Robust Models with a Permissive License

Combined with strong performance across various benchmarks, Granite 3.1 models provide an excellent foundation for enterprise customization. All our models, including the instruct variants, are released under the Apache 2.0 license, giving consumers and enterprises more usage flexibility than the more restrictive licenses of other models in the same class.


Architecture

Granite 3.1 models come in four sizes across two architectures.


Dense Models: 2B and 8B parameter models

  • Trained on 12 trillion tokens in total, with state-of-the-art training and data recipes.
  • Designed for enterprise tasks:
    • Language (RAG, summarization, entity extraction, classification, etc.)
    • Code (generation, translation, bug fixing)
    • Agents (tool use, advanced reasoning; see the tool-calling sketch below)
    • Multilingual support (en, de, es, fr, ja, pt, ar, cs, it, ko, nl, zh)

Mixture-of-Experts (MoE) Models: sparse 1B and 3B parameter models

  • 400M and 800M activated parameters respectively, trained on 10 trillion tokens in total.
  • Run with under 1B activated parameters at inference time, with minimal performance trade-off.
  • Ideal for on-device applications or runtimes requiring extremely low latency.

Accordingly, these options offer a range of compute requirements to choose from, with corresponding trade-offs in downstream task performance. At each scale, we release base checkpoints (models after pretraining) as well as instruct checkpoints (models fine-tuned for dialogue, instruction following, helpfulness, and safety).
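
For the agent and tool-use capability noted above, one common pattern is to pass tool schemas through the tokenizer's chat template and let the model emit a structured call. The sketch below is a hedged illustration: the get_weather tool is made up, and it relies on the generic Transformers tools= chat-template interface rather than any Granite-specific API.

    # Hedged tool-calling sketch; the weather tool is a made-up example, and the
    # generic transformers `tools=` chat-template interface is assumed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "ibm-granite/granite-3.1-8b-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string", "description": "City name"}},
                "required": ["city"],
            },
        },
    }]

    messages = [{"role": "user", "content": "What's the weather in Madrid right now?"}]
    inputs = tokenizer.apply_chat_template(
        messages,
        tools=tools,                 # tool schemas are rendered into the prompt
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    out = model.generate(inputs, max_new_tokens=150)
    # The model should reply with a structured call such as
    # {"name": "get_weather", "arguments": {"city": "Madrid"}}, which the
    # application executes and feeds back as a tool-role message.
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))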

Competitive Performance

[Figure: Granite 3.1 benchmark results compared with Gemma-2, Llama-3.1, Qwen-2.5, and Ministral.]

Use Cases

  • Use case: Personal information management; multilingual knowledge retrieval; rewriting tasks running locally on edge devices
    Recommendation: Retrieval, summarization, faster inference
    Model size: 1B (MoE, A400M; base and Instruct)

  • Use case: Mobile AI-powered writing assistant
    Recommendation: Retrieval, summarization, faster inference
    Model size: 2B (base and Instruct)

  • Use case: Mobile AI-powered writing assistant
    Recommendation: Query and prompt rewriting, mobile AI-powered writing assistant, edge devices
    Model size: 3B (MoE, A800M; base and Instruct)

  • Use case: Text summarization; text classification; sentiment analysis; language translation; entity recognition
    Recommendation: Ideal for limited computational power and resources, faster training times
    Model size: 8B (base and Instruct)

  • Use case: Text summarization; text classification; sentiment analysis; language translation; entity recognition
    Recommendation: Ideal for limited computational power and resources, faster training times, faster inference
    Model size: 8B (Instruct-Accelerator)
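
As a usage example tied to the table above, the sketch below runs a summarization call with the 3B MoE instruct checkpoint through the Transformers pipeline API; the prompt and token budget are illustrative placeholders.

    # Hedged summarization sketch with the 3B MoE instruct checkpoint via the
    # transformers pipeline API; prompt and token budget are illustrative.
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model="ibm-granite/granite-3.1-3b-a800m-instruct",  # ~800M activated params
        device_map="auto",
    )

    article = "..."  # document to summarize goes here
    messages = [{"role": "user",
                 "content": f"Summarize the following text in three bullet points:\n\n{article}"}]

    result = generator(messages, max_new_tokens=150)
    # For chat-style input the pipeline returns the full conversation; the last
    # message is the model's reply.
    print(result[0]["generated_text"][-1]["content"])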