Granite Guardian Models

Granite is IBM's suite of generative AI models that provides enterprise control and customization through accessible model weights and architectures.

The Granite Guardian models are a collection of models designed to detect risks in prompts and responses. Built on instruction fine-tuned Granite language models, they support risk detection along many of the key dimensions catalogued in the IBM AI Risk Atlas. Their training data combines human annotations from a socioeconomically diverse group of people with synthetic data informed by internal red-teaming.

Granite Guardian provides comprehensive coverage of risks:

  • Breadth: spans social risks, security risks, and risks specific to RAG use cases
  • Depth: enables explicit detection of social risks such as unethical behavior, social bias, violence, profanity, and sexual content; security risks such as jailbreaks; and RAG-specific hallucination risks

Models and Use Cases

Granite Guardian is provided in six different models with two architectures:

  • Granite-Guardian-3.0/3.1-2B
  • Granite-Guardian-3.0/3.1-8B
  • Granite-Guardian-3.2-5B
  • Granite-Guardian-3.2-3B-A800M
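
The list above uses a shorthand in which one entry can cover two releases. As a minimal sketch, assuming that a slash separates alternative version numbers sharing the same size suffix, the four entries can be expanded into the six individual model names. The expanded names are derived from the shorthand only and are not verified repository identifiers.

```python
# Shorthand entries exactly as they appear in the list above.
SHORTHAND = [
    "Granite-Guardian-3.0/3.1-2B",
    "Granite-Guardian-3.0/3.1-8B",
    "Granite-Guardian-3.2-5B",
    "Granite-Guardian-3.2-3B-A800M",
]

PREFIX = "Granite-Guardian-"


def expand(entry: str) -> list[str]:
    """Expand one shorthand entry into one name per version.

    Assumption: "3.0/3.1-2B" means versions 3.0 and 3.1 of the 2B model.
    """
    rest = entry[len(PREFIX):]            # e.g. "3.0/3.1-2B"
    versions, size = rest.split("-", 1)   # e.g. "3.0/3.1" and "2B"
    return [f"{PREFIX}{version}-{size}" for version in versions.split("/")]


# Flatten the per-entry expansions into a single list of model names.
MODELS = [name for entry in SHORTHAND for name in expand(entry)]

for name in MODELS:
    print(name)
```

Under this reading the expansion yields six names, which matches the count stated above.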

Use Case

  • Detecting harm-related risks in prompt text or model responses (as guardrails). These are two fundamentally different use cases: the former assesses user-supplied text, while the latter evaluates model-generated text.
  • RAG (retrieval-augmented generation) use case, where the guardian model assesses three key issues: context relevance (whether the retrieved context is relevant to the query), groundedness (whether the response is accurate and faithful to the provided context), and answer relevance (whether the response directly addresses the user’s query).
  • Function calling risk detection within agentic workflows, where Granite Guardian evaluates intermediate steps for syntactic and semantic hallucinations. This includes assessing the validity of function calls and detecting fabricated information, particularly during query translation.

Recommendation and Model Size

  • 2B/3B: Ideal for edge devices
  • 8B/5B: Ideal for limited computational power and resources, faster training times
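
The three RAG checks described above each compare a different pair of texts. The following sketch only records which pair each check looks at; it does not call any model. The class `RagTurn`, the dictionary `RAG_CHECKS`, and the check labels are hypothetical names chosen for illustration, not identifiers from Granite Guardian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RagTurn:
    """One retrieval-augmented exchange (hypothetical container)."""
    query: str      # the user's question
    context: str    # the retrieved passage(s)
    response: str   # the model-generated answer


# For each check, the two RagTurn fields that are compared.
# The labels are illustrative, not official risk names.
RAG_CHECKS = {
    "context_relevance": ("query", "context"),    # is the context relevant to the query?
    "groundedness": ("context", "response"),      # is the response faithful to the context?
    "answer_relevance": ("query", "response"),    # does the response address the query?
}


def inputs_for(check: str, turn: RagTurn) -> dict[str, str]:
    """Return only the texts that the given check needs to inspect."""
    return {name: getattr(turn, name) for name in RAG_CHECKS[check]}


turn = RagTurn(
    query="Which model sizes are listed?",
    context="The list names 2B, 3B, 5B and 8B models.",
    response="The listed sizes are 2B, 3B, 5B and 8B.",
)

for check in RAG_CHECKS:
    print(check, "->", ", ".join(inputs_for(check, turn)))
```

Each of the three texts takes part in exactly two checks, which is why a failure in one check (for example, irrelevant context) does not by itself say whether the other two pass.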