IBM Granite Embedding Documentation

Overview

The Granite Embedding collection provides sentence-transformer models built for retrieval applications. The models use a bi-encoder architecture: they generate embeddings for textual inputs such as queries, passages, and documents, and those embeddings can then be compared using cosine similarity. The collection includes granite-embedding-30m-english, granite-embedding-125m-english, granite-embedding-107m-multilingual, and granite-embedding-278m-multilingual, each optimized for strong alignment between query and passage embeddings.
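To make the comparison step concrete, here is a minimal, library-free sketch of cosine similarity between a query embedding and two passage embeddings. The three-dimensional vectors are made up for illustration (real embeddings have several hundred dimensions), and the function name `cosine_similarity` is ours, not part of any Granite API.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors standing in for model outputs (illustrative values only)
query = [1.0, 0.0, 1.0]
related_passage = [2.0, 0.0, 2.0]     # same direction as the query
unrelated_passage = [0.0, 3.0, 0.0]   # orthogonal to the query

print(cosine_similarity(query, related_passage))    # close to 1
print(cosine_similarity(query, unrelated_passage))  # 0 for orthogonal vectors
```

Because the score depends only on direction, not on vector length, a passage whose embedding points the same way as the query scores near 1 regardless of magnitude.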

Built on carefully curated, permissively licensed public datasets, the Granite Embedding models achieve state-of-the-art results in their respective weight classes. They are developed to meet enterprise-grade expectations, built transparently in accordance with IBM’s AI Ethics principles, and released under the Apache 2.0 license for both research and commercial use.

Looking ahead, the Granite Embedding series plans to introduce sparse embedding models, designed to reduce computational complexity and improve memory efficiency.

Model cards

Granite Embedding 125m English

Granite Embedding 30m English

Granite Embedding 107m Multilingual

Granite Embedding 278m Multilingual

Run locally with Ollama

Learn more about Granite Embedding on Ollama.

Granite Embedding 278m Multilingual

Granite Embedding 30m English
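As a rough sketch of what calling a locally running Ollama server could look like, the snippet below builds a request for Ollama's embedding endpoint using only the standard library. The endpoint URL, the default port, and the model tag `granite-embedding:30m` are assumptions not stated on this page; verify them against your own installation. The helper names `build_embed_request` and `embed` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/embed"  # assumed default local endpoint
MODEL_TAG = "granite-embedding:30m"              # assumed tag; check `ollama list`

def build_embed_request(texts, model=MODEL_TAG, url=OLLAMA_URL):
    """Build (but do not send) a POST request asking Ollama to embed `texts`."""
    payload = json.dumps({"model": model, "input": list(texts)}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(texts, model=MODEL_TAG, url=OLLAMA_URL):
    """Send the request to a running Ollama server and return one vector per text."""
    with urllib.request.urlopen(build_embed_request(texts, model, url)) as response:
        return json.loads(response.read())["embeddings"]
```

With the server running and the model pulled, `embed(["summit define"])` should return a list containing one embedding vector. Nothing is sent over the network until `embed` is called.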

Examples

Granite Embedding with sentence transformers

This is a simple example of how to use the granite-embedding-30m-english model with sentence_transformers.

First, install the sentence transformers library:

pip install sentence_transformers

The model can then be used to encode pairs of text and find the similarity between their representations:

from sentence_transformers import SentenceTransformer, util

model_path = "ibm-granite/granite-embedding-30m-english"
# Load the Sentence Transformer model
model = SentenceTransformer(model_path)

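input_queries = [
    ' Who made the song My achy breaky heart? ',
    'summit define'
]
# Illustrative passages (an assumption; substitute your own corpus)
input_passages = [
    "Achy Breaky Heart is a country song written by Don Von Tress.",
    "A summit is the highest point of a mountain.",
]
# Encode both sides and print the query-by-passage cosine similarity matrix
print(util.cos_sim(model.encode(input_queries), model.encode(input_passages)))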
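Once similarity scores between queries and passages are available, retrieval comes down to ranking the passages for each query. The following is a minimal sketch of that step in plain Python. The scores and passage labels are invented for illustration, and `rank_passages` is a hypothetical helper, not a sentence_transformers function.

```python
def rank_passages(similarity_rows, passages, top_k=1):
    """For each query row of scores, return the top_k (passage, score) pairs."""
    rankings = []
    for scores in similarity_rows:
        if len(scores) != len(passages):
            raise ValueError("each score row needs one score per passage")
        ordered = sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
        rankings.append(ordered[:top_k])
    return rankings

# Hypothetical scores: rows are queries, columns are passages
passages = ["passage about the song", "passage defining summit"]
similarity_rows = [
    [0.82, 0.11],  # scores for the first query
    [0.09, 0.77],  # scores for the second query
]

for best in rank_passages(similarity_rows, passages):
    print(best)
```

A similarity matrix returned as a PyTorch tensor can be converted to nested lists with `.tolist()` before it is passed to a helper like this one.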

Granite Embedding with Hugging Face transformers

This is a simple example of how to use the granite-embedding-30m-english model with the Transformers library and PyTorch.

First, install the required libraries:

pip install transformers torch

The model can then be used to encode pairs of text:

import torch
from transformers import AutoModel, AutoTokenizer

model_path = "ibm-granite/granite-embedding-30m-english"

# Load the model and tokenizer
model = AutoModel.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
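model.eval()

# Tokenize the example queries
input_queries = [' Who made the song My achy breaky heart? ', 'summit define']
tokenized = tokenizer(input_queries, padding=True, truncation=True, return_tensors="pt")
# Encode; CLS pooling (first token's hidden state) is an assumption about this model
with torch.no_grad():
    query_embeddings = model(**tokenized)[0][:, 0]
# L2-normalize so that dot products equal cosine similarities
query_embeddings = torch.nn.functional.normalize(query_embeddings, dim=1)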

Granite Embedding with LangChain

This is how you could use our models for retrieval with LangChain.

First, install the LangChain dependencies:

pip install git+https://github.com/ibm-granite-community/utils \
    "langchain_community<0.3.0" \
    langchain-huggingface \
    langchain-milvus \
    replicate \
    wget

The recipe below, which uses the granite-embedding-30m-english model, shows how to:

Set up a database: how to set up a local Milvus vector database, process the corpus to produce indexable documents, and ingest those documents using an embedding model.
Retrieve relevant passages from the database: how to use an embedding of the query to retrieve semantically similar passages.

from langchain_huggingface import HuggingFaceEmbeddings
from langchain_milvus import Milvus
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
import uuid
import os, wget

# Load the embedding model
embeddings_model = HuggingFaceEmbeddings(model_name="ibm-granite/granite-embedding-30m-english")
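The two steps of the recipe, ingesting documents and retrieving passages, can be sketched without any external services. The example below is a dependency-free stand-in, not the Milvus or LangChain implementation: `toy_embed` is a bag-of-words counter used in place of the Granite model, and `InMemoryVectorStore` is a hypothetical class whose method names only loosely mirror a vector store interface.

```python
import math
import re
from collections import Counter

def toy_embed(text):
    """Stand-in embedding: a bag-of-words count vector (NOT a Granite model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class InMemoryVectorStore:
    """Hypothetical minimal store; a real setup would use Milvus instead."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn
        self.entries = []  # list of (chunk_text, vector) pairs

    def add_documents(self, chunks):
        """Ingest: embed every chunk once and keep it next to its text."""
        for chunk in chunks:
            self.entries.append((chunk, self.embed_fn(chunk)))

    def similarity_search(self, query, k=1):
        """Retrieve: embed the query and return the k most similar chunks."""
        query_vector = self.embed_fn(query)
        scored = sorted(
            self.entries,
            key=lambda entry: cosine(query_vector, entry[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in scored[:k]]

# Step 1: process a small corpus into chunks and ingest them
corpus = (
    "Milvus stores vectors and supports similarity search.\n\n"
    "A summit is the highest point of a mountain.\n\n"
    "Embedding models map text to dense vectors."
)
chunks = [part.strip() for part in corpus.split("\n\n") if part.strip()]
store = InMemoryVectorStore(toy_embed)
store.add_documents(chunks)

# Step 2: retrieve the passage most similar to a query
print(store.similarity_search("highest point of a mountain", k=1))
```

Swapping `toy_embed` for a real embedding model and the in-memory list for a vector database changes the quality and scale of retrieval, but the ingest-then-search flow stays the same.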
