IBM® Granite™ is a family of artificial intelligence (AI) models built for business, to help drive trust and scalability in AI-driven applications. Open source and proprietary Granite models are available today.
IBM is named a strong performer in The Forrester Wave™: AI Foundation Models for Language, Q2 2024.
IBM and Red Hat Bolster Open GenAI Ecosystem with Granite and InstructLab
We strive to make AI accessible to as many developers as possible. That’s why we have open-sourced a family of core Granite Code, Time Series, Language, and GeoSpatial models and made them available on Hugging Face under a permissive Apache 2.0 license that enables broad, unencumbered commercial usage.
All Granite models are trained on carefully curated data, with industry-leading levels of transparency about the data that went into them. We have also open-sourced the tools to monitor whether the data is up to the standards that enterprise applications demand.
Trained on 116 programming languages, Granite code models (3b, 8b, 20b, 34b) are optimized for enterprise software development workflows. These models have a range of uses, from simple code completion to complex application modernization tasks and on-device, memory-constrained use cases.
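As a concrete illustration, here is a minimal sketch of pulling one of the open-sourced Granite code models from Hugging Face with the transformers library and running a simple code completion. The model ID shown is illustrative rather than authoritative; check the ibm-granite organization on Hugging Face for the checkpoints that are actually published.

# Minimal sketch: code completion with an open-source Granite code model
# loaded from Hugging Face via the transformers library.
# The model ID below is illustrative; consult the ibm-granite organization
# on Hugging Face for current checkpoint names.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-3b-code-base"  # illustrative model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A plain completion prompt: the model continues the function body.
prompt = "def fibonacci(n):\n"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))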
Granite language models (7b open-source, 13b English, 20b multilingual, 8b Japanese) demonstrate high accuracy and throughput at low latency, while consuming only a fraction of GPU resources.
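In the same spirit, the sketch below shows chat-style generation with an instruct-tuned Granite language model using the tokenizer’s chat template from the transformers library. The model ID is an assumption for illustration, as is the presence of a chat template on that checkpoint; check the ibm-granite organization on Hugging Face for published instruct models.

# Minimal sketch: chat-style generation with an instruct-tuned Granite
# language model via transformers. The model ID is illustrative and assumes
# the checkpoint ships a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-7b-instruct"  # illustrative model ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Summarize the key risks of unlicensed training data in two sentences."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))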
Granite Time Series is a family of lightweight, pre-trained models for time-series forecasting, trained on a collection of datasets spanning a range of business and industrial application domains. We have optimized Granite Time Series to run efficiently across a range of hardware configurations, meaning you can start using them today on a laptop.
NASA and IBM have teamed up to create an AI Foundation Model for Earth Observations, using large-scale satellite and remote sensing data, including the Harmonized Landsat and Sentinel-2 (HLS) data. By embracing the principles of open AI and open science, both organizations are actively contributing to the global mission of promoting knowledge sharing and accelerating innovations in addressing critical environmental challenges.
With a principled approach to data transparency, model alignment and security red teaming, IBM has made a family of open source Granite models available under an Apache 2.0 license to empower developers to bring generative AI into mission-critical applications and workflows.
IBM Granite models deliver superior performance in coding and above-par performance in targeted language tasks and use cases at lower latencies, with training and alignment on enterprise-relevant datasets and contributions from the open-source community.
With a fraction of the compute requirements and a lower cost of inference, Granite models enable developers to experiment, build and scale more generative AI applications, both on-premises and in the cloud, while managing the budgetary limits of their departments.
Lightweight, low-cost or no-cost models and tools that let developers swiftly prototype their myriad ideas before scaling them on production systems.
Models built on principles of transparency, with targeted functionality.
The open source family of Granite models can foster collaboration with thousands of developers who can harness them to advance science, modernize code, improve productivity, and transform experiences.
AI assistants are applications powered by Granite models and the watsonx platform, deployed to automate workflows and implement AI across a variety of technical and business functions.
Originally launched in October 2023, with the first set of open source models released in May 2024, Granite has received recognition and validation from analysts, media and industry.
In testing against a range of other models, including models released under Apache 2.0 licenses as well as proprietary models, we found that Granite models are competitive across a range of coding tasks.
A new report from Stanford University’s Center for Research on Foundation Models showed that IBM’s model scored a perfect 100% in several categories designed to measure how open models really are.
According to Forrester, the Granite family of models provides enterprise users with some of the most robust and clear insights into the underlying training data. This is important for efficiently refining model behavior for specific use cases and domains, and for protecting enterprises from risk due to any unlicensed content in the training data.
IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. The IBM watsonx AI and data platform has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplication, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.
During model training, we work to prevent misalignment in the model outputs and use supervised fine-tuning to enable better instruction following, so that the model can be used to complete enterprise tasks via prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.
Given the rapidly changing generative AI technology landscape, our end-to-end processes are expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to those it provides for IBM hardware and software products.
Moreover, unlike some other providers of large language models, and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer's use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for the IBM-developed models.
The watsonx models currently under these protections include:
(1) The Slate family of encoder-only models.
(2) The Granite family of decoder-only models.