In this tutorial, we will use open source IBM® Granite® 4.0 and IBM Docling to summarize Lewis Carroll’s “Alice’s Adventures in Wonderland.”
Machine learning algorithms offer a variety of tools to address specific tasks, but text and document summarization can still be daunting, especially when dealing with particularly long documents.
Automatic text summarization is a natural language processing (NLP) method that condenses information from one or more input text documents. It takes long inputs, such as news articles, research papers or email threads, and turns them into a coherent, condensed output text. There are two types of summarization approaches: extractive and abstractive.
Extractive summarization extracts specific sentences from the original text documents. Instead of focusing on new text generation, this approach focuses on selecting the most relevant sentences from the document being summarized.
Alternatively, abstractive summarization generates original summaries by using sentences not found in the original text documents. Such generation leverages deep learning and the transformer architecture used by large language models (LLMs) to produce semantically meaningful text sequences.
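To make the contrast concrete, here is a toy sketch of the extractive approach that scores sentences by word frequency and keeps the top-scoring ones. This example is purely illustrative and is not part of the tutorial's workflow.

import re
from collections import Counter

def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Toy extractive summarizer: keep the sentences whose words appear most often."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    freq = Counter(re.findall(r'\w+', text.lower()))
    # Score each sentence by the total frequency of its words
    ranked = sorted(
        sentences,
        key=lambda s: sum(freq[w] for w in re.findall(r'\w+', s.lower())),
        reverse=True,
    )
    top = set(ranked[:num_sentences])
    # Preserve the original sentence order in the output
    return " ".join(s for s in sentences if s in top)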
Although abstractive summarization might require more computational resources, summarization models can be fine-tuned on domain-specific training data (for example, healthcare, finance or law) where the LLM specializes in generating industry-specific summaries. For real-world applications, generative AI summarization techniques can accelerate workflows across many industries through the production of high-quality summaries.
You need an IBM Cloud® account to create a watsonx.ai® project.
Several Python versions work for this tutorial. At the time of publishing, we recommend Python 3.10, 3.11 or 3.12.
To get started with IBM Granite on IBM watsonx.ai, follow this recipe.
This tutorial is available on GitHub.
Use the following commands in your terminal to create a virtual environment and then activate it.
python -m venv myenv
source myenv/bin/activate
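If you are unsure which Python interpreter the virtual environment uses, you can check it (the exact output depends on your installation):

python --version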
Create a watsonx.ai Runtime service instance (select your appropriate region and choose the Lite plan, which is a free instance).
Generate an application programming interface (API) key.
Associate the watsonx.ai Runtime service instance with the project that you created in watsonx.ai.
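The get_env_var helper used later in this tutorial reads these values from your environment. One common setup, shown here with placeholder values, is to export them in your shell before launching the notebook:

export WATSONX_APIKEY=<your API key>
export WATSONX_PROJECT_ID=<your project ID>
export WATSONX_URL=<your region's watsonx.ai endpoint URL>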
! echo "::group::Install Dependencies"
%pip install uv
! uv pip install git+https://github.com/ibm-granite-community/utils.git \
langchain_ibm \
langchain_community \
transformers \
docling \
bert-score \
rouge-score
! echo "::endgroup::"
import itertools
import json
from ibm_granite_community.notebook_utils import get_env_var, wrap_text
from transformers import AutoTokenizer
from typing import Iterator, Callable
from langchain_ibm import ChatWatsonx
from ibm_granite_community.langchain.prompts import TokenizerChatPromptTemplate
from docling.document_converter import DocumentConverter
from docling_core.transforms.chunker.hierarchical_chunker import HierarchicalChunker
from docling_core.transforms.chunker.base import BaseChunk
from rouge_score import rouge_scorer
from bert_score import score
WATSONX_APIKEY = get_env_var('WATSONX_APIKEY')
WATSONX_PROJECT_ID = get_env_var('WATSONX_PROJECT_ID')
URL = get_env_var("WATSONX_URL")
For this tutorial, we will use the latest Granite 4.0 model, but feel free to use another inference provider, such as OpenAI with one of its GPT models.
model = ChatWatsonx(
model_id="ibm/granite-4-h-small",
apikey=WATSONX_APIKEY,
url=URL,
project_id=WATSONX_PROJECT_ID,
params={
"min_tokens": 200,
"max_tokens": 5000,
"temperature": 0.8
}
)
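Optionally, you can run a quick smoke test to confirm that your credentials and the model ID work before processing the whole book:

# Optional smoke test: verify the connection and credentials
print(model.invoke("Reply with one short sentence.").text())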
The code in the following sections will fetch the source text, “Alice’s Adventures in Wonderland,” from Project Gutenberg for text summarization.
We will then chunk the book text so that the chunks fit in the context window size of the AI model.
Before sending our book chunks to the AI model, it's crucial to understand how much of the model's capacity we're using. LLMs typically have a limit on the number of tokens that they can process in a single request.
Key points:
- We're using the Granite 4.0 model, which has a context window of at least 128,000 tokens.
- Tokenization can vary between models, so we use the specific tokenizer for our chosen model.
Understanding token count helps us optimize our prompts and ensure we're using the model efficiently.
tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-4.0-h-small")
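As a quick illustration of token counting with this tokenizer (the exact count depends on the tokenizer's vocabulary):

# Count the tokens in a sample sentence from the book
sample = "Alice was beginning to get very tired of sitting by her sister on the bank."
print(f"Sample token count: {len(tokenizer.tokenize(sample))} tokens")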
Next, we use Docling's HierarchicalChunker to understand the document's structure, chunk the book into text passages and group the text passages by chapter so that we can then summarize each chapter.
# Global flag that tracks when the last chapter has been found while processing the document
found_last_chapter = False
def clean_headings(headings: list[str]) -> list[str]:
# Strip extra whitespace and normalize line breaks
return [" ".join(h.split()) for h in headings]
def chunk_document(source: str, *, dropwhile: Callable[[BaseChunk], bool] = lambda c: False, takewhile: Callable[[BaseChunk], bool] = lambda c: True) -> Iterator[BaseChunk]:
"""Read the document and perform a hierarchical chunking"""
converter = DocumentConverter()
chunks = HierarchicalChunker().chunk(converter.convert(source=source).document)
return itertools.takewhile(takewhile, itertools.dropwhile(dropwhile, chunks))
def merge_chunks(chunks: Iterator[BaseChunk], *, headings: Callable[[BaseChunk], list[str]] = lambda c: c.meta.headings) -> Iterator[dict[str, str]]:
"""Merge chunks having the same headings"""
prior_headings: list[str] | None = None
document: dict[str, str] = {}
doc_id = 0
for chunk in chunks:
text = chunk.text.replace('\r\n', '\n')
current_headings = headings(chunk)
if prior_headings != current_headings:
if document:
yield document
prior_headings = current_headings
document = {
'doc_id': str(doc_id:=doc_id+1),
'title': " - ".join(current_headings),
'text': text
}
else:
document['text'] += f"\n\n{text}"
if document:
yield document
def chunk_dropwhile(chunk: BaseChunk) -> bool:
"""Skip content before the first chapter."""
headings = [h.upper() for h in chunk.meta.headings]
return not any(h.startswith("CHAPTER I") for h in headings)
def chunk_takewhile(chunk: BaseChunk) -> bool:
    """Take content through the last chapter, then stop."""
    global found_last_chapter
    headings = [h.upper() for h in chunk.meta.headings]
    if any(h.startswith("CHAPTER XII") for h in headings):
        found_last_chapter = True
        return True  # include every chunk of CHAPTER XII
    # Stop at the first chunk after CHAPTER XII (for example, the closing matter)
    return not found_last_chapter
def chunk_headings(chunk: BaseChunk) -> list[str]:
"""Use only the chapter heading as the title."""
for heading in chunk.meta.headings:
if heading.upper().startswith("CHAPTER"):
return [heading.strip()]
return []
documents: list[dict[str, str]] = list(merge_chunks(
chunk_document(
"https://www.gutenberg.org/cache/epub/11/pg11-images.html",
dropwhile=chunk_dropwhile,
takewhile=chunk_takewhile,
),
headings=chunk_headings,
))
print(f"{len(documents)} documents created")
print(f"Max document size: {max(len(tokenizer.tokenize(document['text'])) for document in documents)} tokens")
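Before summarizing, it can help to spot-check one of the merged documents (the exact text depends on the source HTML):

# Peek at the first merged chapter document
print(documents[0]['title'])
print(documents[0]['text'][:200])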
Here, we define a function that generates a response by using the list of documents we created and a user prompt about those documents.
We create the prompt according to the Granite Prompting Guide and provide the documents by using the documents parameter.
prompt_template = TokenizerChatPromptTemplate.from_messages(
[
("system", """You are a helpful assistant with access to the following documents. You may use one or more documents to assist with the user query.
You are given a list of documents within <documents></documents> XML tags:
<documents>{context}
</documents>
Write the response to the user's input by strictly aligning with the facts in the provided documents. If the information needed to answer the question is not available in the documents, inform the user that the question cannot be answered based on the available data."""),
("user", "{user_prompt}"),
],
tokenizer=tokenizer,
)
def generate(user_prompt: str, documents: list[dict[str, str]]):
"""Use the chat template to format the prompt"""
context = "\n".join(json.dumps(document) for document in documents)
prompt = prompt_template.format_prompt(user_prompt=user_prompt, context=context)
print(f"Input size: {len(tokenizer.tokenize(prompt.to_string()))} tokens")
output = model.invoke(prompt)
print(f"Output size: {len(tokenizer.tokenize(output.text()))} tokens")
return output.text()
The LLM will begin the iterative process of summarizing the text for each chapter. In this use case, it generates 12 chapter summaries, and the process can take a few minutes.
if get_env_var('GRANITE_TESTING', 'false').lower() == 'true':
documents = documents[:5] # shorten testing work
user_prompt = """\
Using only the book chapter document, compose a summary of the book chapter.
Your response should only include the summary. Do not provide any further explanation."""
summaries: list[dict[str, str]] = []
for i, document in enumerate(documents):
print(f"============================= {document['title']} ({i+1}/{len(documents)}) =============================")
output = generate(user_prompt, [document])
summaries.append({
'doc_id': document['doc_id'],
'title': document['title'],
'text': output
})
print(f"Summary count: {len(summaries)}")
Next, we need to summarize the chapter summaries. We prompt the model to create a unified summary of the chapter summaries we previously generated.
user_prompt = """\
Using only the book chapter summary documents, compose a single, unified summary of the book.
Your response should only include the unified summary. Do not provide any further explanation."""
output = generate(user_prompt, summaries)
print(wrap_text(output))
We have now summarized a large document, one that exceeds the AI model's context window, by dividing it into smaller sections, summarizing each section and then combining those LLM summarizations into a comprehensive summary.
When it comes to an LLM summarization task, we want to ensure that our output is not only readable, but also accurate and concise. Evaluation metrics such as the ROUGE score (Recall-Oriented Understudy for Gisting Evaluation) and the BLEU score (Bilingual Evaluation Understudy), both traditional statistical metrics, compare machine-generated summaries against one or more reference summaries to assess summarization quality. BERTScore, however, is an embedding-based metric. Unlike ROUGE and BLEU, it uses contextual embeddings from pretrained models, such as BERT, to measure semantic similarity between generated and reference texts.
Let’s see how our model did with the text summarization task. First, let’s add a reference summary and compute metrics by using ROUGE. This reference summary is taken from SparkNotes and adds an element of human evaluation to the ROUGE score.
# Example: Use the final summary as 'generated_summary'
generated_summary = output
# Provide your reference summary here (replace with your own)
reference_summary = """
Alice sits on a riverbank on a warm summer day, drowsily reading over her sister’s shoulder, when she catches sight of a White Rabbit in a waistcoat running by her. The White Rabbit pulls out a pocket watch, exclaims that he is late, and pops down a rabbit hole. Alice follows the White Rabbit down the hole and comes upon a great hallway lined with doors. She finds a small door that she opens using a key she discovers on a nearby table. Through the door, she sees a beautiful garden, and Alice begins to cry when she realizes she cannot fit through the door. She finds a bottle marked “DRINK ME” and downs the contents. She shrinks down to the right size to enter the door but cannot enter since she has left the key on the tabletop above her head. Alice discovers a cake marked “EAT ME” which causes her to grow to an inordinately large height. Still unable to enter the garden, Alice begins to cry again, and her giant tears form a pool at her feet. As she cries, Alice shrinks and falls into the pool of tears. The pool of tears becomes a sea, and as she treads water she meets a Mouse. The Mouse accompanies Alice to shore, where a number of animals stand gathered on a bank. After a “Caucus Race,” Alice scares the animals away with tales of her cat, Dinah, and finds herself alone again.
Alice meets the White Rabbit again, who mistakes her for a servant and sends her off to fetch his things. While in the White Rabbit’s house, Alice drinks an unmarked bottle of liquid and grows to the size of the room. The White Rabbit returns to his house, fuming at the now-giant Alice, but she swats him and his servants away with her giant hand. The animals outside try to get her out of the house by throwing rocks at her, which inexplicably transform into cakes when they land in the house. Alice eats one of the cakes, which causes her to shrink to a small size. She wanders off into the forest, where she meets a Caterpillar sitting on a mushroom and smoking a hookah (i.e., a water pipe). The Caterpillar and Alice get into an argument, but before the Caterpillar crawls away in disgust, he tells Alice that different parts of the mushroom will make her grow or shrink. Alice tastes a part of the mushroom, and her neck stretches above the trees. A pigeon sees her and attacks, deeming her a serpent hungry for pigeon eggs.
Alice eats another part of the mushroom and shrinks down to a normal height. She wanders until she comes across the house of the Duchess. She enters and finds the Duchess, who is nursing a squealing baby, as well as a grinning Cheshire Cat, and a Cook who tosses massive amounts of pepper into a cauldron of soup. The Duchess behaves rudely to Alice and then departs to prepare for a croquet game with the Queen. As she leaves, the Duchess hands Alice the baby, which Alice discovers is a pig. Alice lets the pig go and reenters the forest, where she meets the Cheshire Cat again. The Cheshire Cat explains to Alice that everyone in Wonderland is mad, including Alice herself. The Cheshire Cat gives directions to the March Hare’s house and fades away to nothing but a floating grin.
Alice travels to the March Hare’s house to find the March Hare, the Mad Hatter, and the Dormouse having tea together. Treated rudely by all three, Alice stands by the tea party, uninvited. She learns that they have wronged Time and are trapped in perpetual tea-time. After a final discourtesy, Alice leaves and journeys through the forest. She finds a tree with a door in its side, and travels through it to find herself back in the great hall. She takes the key and uses the mushroom to shrink down and enter the garden.
After saving several gardeners from the temper of the Queen of Hearts, Alice joins the Queen in a strange game of croquet. The croquet ground is hilly, the mallets and balls are live flamingos and hedgehogs, and the Queen tears about, frantically calling for the other player’s executions. Amidst this madness, Alice bumps into the Cheshire Cat again, who asks her how she is doing. The King of Hearts interrupts their conversation and attempts to bully the Cheshire Cat, who impudently dismisses the King. The King takes offense and arranges for the Cheshire Cat’s execution, but since the Cheshire Cat is now only a head floating in midair, no one can agree on how to behead it.
The Duchess approaches Alice and attempts to befriend her, but the Duchess makes Alice feel uneasy. The Queen of Hearts chases the Duchess off and tells Alice that she must visit the Mock Turtle to hear his story. The Queen of Hearts sends Alice with the Gryphon as her escort to meet the Mock Turtle. Alice shares her strange experiences with the Mock Turtle and the Gryphon, who listen sympathetically and comment on the strangeness of her adventures. After listening to the Mock Turtle’s story, they hear an announcement that a trial is about to begin, and the Gryphon brings Alice back to the croquet ground.
The Knave of Hearts stands trial for stealing the Queen’s tarts. The King of Hearts leads the proceedings, and various witnesses approach the stand to give evidence. The Mad Hatter and the Cook both give their testimony, but none of it makes any sense. The White Rabbit, acting as a herald, calls Alice to the witness stand. The King goes nowhere with his line of questioning, but takes encouragement when the White Rabbit provides new evidence in the form of a letter written by the Knave. The letter turns out to be a poem, which the King interprets as an admission of guilt on the part of the Knave. Alice believes the note to be nonsense and protests the King’s interpretation. The Queen becomes furious with Alice and orders her beheading, but Alice grows to a huge size and knocks over the Queen’s army of playing cards.
All of a sudden, Alice finds herself awake on her sister’s lap, back at the riverbank. She tells her sister about her dream and goes inside for tea as her sister ponders Alice’s adventures.
"""
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
scores = scorer.score(reference_summary, generated_summary)
print("ROUGE scores:")
for key, value in scores.items():
print(f"{key}: Precision={value.precision:.3f}, Recall={value.recall:.3f}, F1={value.fmeasure:.3f}")
Let's use the same reference summary and compute metrics by using BERTScore.
# generated_summary and reference_summary should be defined as before
P, R, F1 = score([generated_summary], [reference_summary], lang="en", verbose=True)
print(f"BERTScore Precision: {P.mean():.3f}")
print(f"BERTScore Recall: {R.mean():.3f}")
print(f"BERTScore F1: {F1.mean():.3f}")
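The BLEU score mentioned earlier can be computed in a similar way. Here is an optional sketch that uses the nltk package, which is not among the dependencies installed earlier, so you would need to install it first (pip install nltk):

# Optional: compute BLEU with nltk (assumes nltk is installed)
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference_tokens = [reference_summary.split()]
candidate_tokens = generated_summary.split()
bleu = sentence_bleu(reference_tokens, candidate_tokens,
                     smoothing_function=SmoothingFunction().method1)
print(f"BLEU score: {bleu:.3f}")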
Compare the results of these benchmark methods; doing so helps you understand the pros and cons of each metric. Take some time to research and experiment with other reference summaries; there are plenty available online.
You now have the skillset to produce a high-level text summarization by using IBM Granite 4.0 and Docling. These AI-driven techniques allow you to distill vast amounts of information, making research and reading more efficient and manageable. Whether for academic, professional or personal use, leveraging this technology can transform the way you engage with large texts. In the words of Alice, "It's no use going back to yesterday, because I was a different person then."