Date: 2026-02-03
In this tutorial, you’ll learn how to create abstractive text summaries with a local transformer model from the Hugging Face library.
Text summarization is a core task in artificial intelligence (AI) and natural language processing (NLP) that turns long, complex documents into short, easy-to-read summaries while preserving the main ideas. Modern transformer models make this task possible by understanding text, highlighting the most important ideas and generating clear, comprehensible summaries.
Abstractive summarization is a type of automatic text summarization in which a system generates new sentences that paraphrase and condense the meaning of a source text. The goal is to produce a summary that captures the core ideas by using different wording and structure, rather than copying sentences verbatim.
From a tooling perspective, abstractive summarization typically combines NLP preprocessing steps with neural language models that perform the actual generation. Traditional NLP techniques such as tokenization, sentence segmentation and word embedding representations are used to structure and encode the input text. Meanwhile, the summarization model learns how to generate new sentences from these representations.
There are two main approaches to automatic text summarization:
Extractive text summarization: Selects and copies the most important sentences directly from the original text, similar to highlighting key sentences with a marker. This method is faster and simpler to implement, but it is limited to the wording and structure of the source document. NLP techniques here typically involve sentence scoring, keyword extraction or graph-based ranking algorithms.
Abstractive text summarization: Generates new sentences that capture the core meaning of the text, much like how a human would write a summary in their own words. This approach is more flexible and natural-sounding, but it is also more computationally intensive. NLP techniques used include encoder-decoder models, attention mechanisms and contextual embeddings, which allow the system to understand relationships between words and generate new text.
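For contrast with the abstractive approach used later, the sentence-scoring idea behind extractive summarization can be sketched in a few lines. This is a deliberately naive illustration, not a production method: the function name `extractive_summary`, the regex-based sentence splitting and the average word-frequency score are all assumptions made for this example, and no stop-word filtering is applied.

```python
import re
from collections import Counter

def extractive_summary(text, num_sentences=1):
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence segmentation: split after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Word frequencies over the whole document (lowercased).
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:num_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)
```

Every sentence in the result is copied verbatim from the input, which is exactly the limitation that abstractive models are meant to overcome.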
This technical progress wasn’t achieved overnight. Early NLP systems focused on explicitly modeling linguistic structure. Techniques from information extraction (IE) were used to identify entities, relations and events by using hand-crafted rules or statistical models.1 During this period, most text summarization methods were extractive, selecting important sentences rather than generating new text.
Neural extractive models represented the next step. One influential example, SummaRuNNer, a sequence model based on a recurrent neural network (RNN), showed that neural models can capture document-level context and outperform traditional extractive techniques.2 Early neural models included RNNs and long short-term memory (LSTM) networks, which helped capture sequential dependencies across long documents. Convolutional neural networks (CNNs) were also applied to text for local syntactic feature extraction, complementing sequential models.3
The idea of abstractive summarization became more practical with the introduction of encoder-decoder neural models, which can map an input sequence to a variable-length output sequence suitable for tasks such as summarization.
In these models, the encoder processes the input text and converts it into a series of contextual representations that capture the meaning and relationships between words. The decoder generates the output sequence token by token, attending to relevant parts of the input through attention mechanisms to ensure coherence and to preserve information. This structure allows the model to produce entirely new sentences rather than relying on predefined templates or extracted facts.
In recent years, state-of-the-art transformer-based models have achieved strong results on large datasets such as Gigaword or collections of news articles (CNN/DailyMail training data). Pretraining on large corpora enabled these models to generalize across domains and produce fluent summaries.
Some systems incorporate a knowledge base or learning-based lexical modules to improve factual correctness and contextual understanding, particularly in specialized domains. These ideas are closely related to retrieval-augmented generation (RAG) approaches, where a model can retrieve relevant documents or facts from an external source and then generate abstractive summaries that integrate this information.
More broadly, abstractive summarization underlies many modern applications, from RAG-based QA systems to automated report generation, demonstrating its role as a building block in practical AI systems.
Abstractive text summarization is a form of document summarization, closely related to tasks like machine translation and natural language generation. Earlier syntactic text summarization techniques relied on grammatical rules, whereas modern approaches leverage neural architectures for summary generation and rewriting.
Abstractive summarization relies on advanced language models such as BART, T5 or PEGASUS, which are implemented as sequence-to-sequence (seq2seq) transformer models. These models transform input documents into numerical representations that capture contextual meaning, then generate concise summaries that convey the same ideas in new words.
The summarization process begins with tokenization, where the input text is split into tokens (words or subwords). These tokens are converted into numerical representations and processed by the encoder, which uses self-attention to understand how different parts of the text relate to each other. Self-attention allows the model to weigh the importance of each token relative to every other token in the sequence. This way, the model can capture long-range dependencies and contextual relationships across the document. The encoder produces contextual representations that capture the document’s information, which the decoder then uses to generate the final summary.
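To make the weighting step concrete, here is a minimal pure-Python sketch of scaled dot-product self-attention. It is a simplification under stated assumptions: a real transformer first applies learned query, key and value projections and uses several attention heads, whereas this example feeds the same toy embedding vectors in directly. The function names are illustrative only.

```python
import math

def softmax(values):
    """Turn raw scores into weights that sum to 1."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this token's query to every token's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Each output is a weighted average of all value vectors.
        outputs.append([sum(w_j * v[i] for w_j, v in zip(w, values))
                        for i in range(len(values[0]))])
    return outputs, weights

# Three toy token embeddings (assumed values, for illustration only).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual, attention_weights = self_attention(embeddings, embeddings, embeddings)
```

Each row of `attention_weights` shows how strongly one token attends to every token in the sequence, and each vector in `contextual` mixes information from the whole input.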
The decoder generates the summary token by token, by using the encoder’s contextual representations and attention mechanisms to focus on the most relevant parts of the input. It also considers previously generated tokens to maintain coherence. Some models might directly copy certain words or phrases from the input, which is useful for names, numbers or technical terms.
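The token-by-token loop can be illustrated with a toy greedy decoder. This is only a sketch: `greedy_decode`, `TOY_TABLE` and `toy_scores` are hypothetical names, the score table stands in for a real decoder and depends only on the previous token, and the conditioning on the encoder’s output is left out. Real summarization models often use beam search rather than pure greedy selection.

```python
def greedy_decode(next_token_scores, start_token="<s>", end_token="</s>", max_steps=20):
    """Generate tokens one at a time, always taking the highest-scoring candidate."""
    generated = [start_token]
    for _ in range(max_steps):
        # The scoring function sees everything generated so far.
        scores = next_token_scores(generated)  # dict: candidate token -> score
        best = max(scores, key=scores.get)
        if best == end_token:
            break
        generated.append(best)
    return generated[1:]  # drop the start token

# Toy stand-in for a decoder (assumed values, for illustration only).
TOY_TABLE = {
    "<s>": {"models": 0.9, "summaries": 0.1},
    "models": {"write": 0.7, "</s>": 0.3},
    "write": {"summaries": 0.8, "</s>": 0.2},
    "summaries": {"</s>": 1.0},
}

def toy_scores(prefix):
    return TOY_TABLE[prefix[-1]]

print(greedy_decode(toy_scores))  # ['models', 'write', 'summaries']
```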
By combining these techniques, the model produces human-like summaries that paraphrase and condense the original text instead of copying it verbatim.
Modern abstractive summarization is dominated by transformer-based sequence-to-sequence models, which treat summarization as a generation task: given an input sequence (the corpus or document), the model generates an output sequence (the summary).
BART is a transformer-based encoder-decoder model designed for text generation tasks. Its encoder is bidirectional, meaning that each token’s representation draws on context from both its left and its right, so the model can take the full surrounding context of each word into account. Its decoder then generates text from left to right.
BART is pretrained by using denoising objectives, where the model learns to reconstruct original text from corrupted versions (for example, with masked tokens, deleted spans or shuffled sentences). This pretraining strategy makes BART well-suited for abstractive summarization tasks, and fine-tuned BART models achieve strong performance on standard benchmarks.4
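The corruption side of a denoising objective can be sketched with three simple noising functions that mirror the examples above. This is a simplified illustration, not BART’s actual pretraining code: the function names, the `<mask>` string and the masking probability are assumptions chosen for the example.

```python
import random

def mask_tokens(tokens, mask_prob=0.3, mask_token="<mask>", rng=None):
    """Replace each token with a mask symbol with probability mask_prob."""
    rng = rng or random.Random()
    return [mask_token if rng.random() < mask_prob else t for t in tokens]

def delete_span(tokens, start, length):
    """Remove a contiguous span of tokens without leaving a placeholder."""
    return tokens[:start] + tokens[start + length:]

def shuffle_sentences(sentences, rng=None):
    """Return the sentences in a random order."""
    rng = rng or random.Random()
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled

tokens = "the model learns to rebuild the original text".split()
corrupted = mask_tokens(tokens, rng=random.Random(0))
# During pretraining, the model would receive `corrupted` and be trained to output `tokens`.
```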
While this tutorial example uses BART, several other transformer-based seq2seq models are commonly used for abstractive summarization:
To get hands-on and run this project, clone the GitHub repository by using https://github.com/IBM/ibmdotcom-tutorials as the HTTPS URL. For detailed steps on how to clone a repository, refer to the GitHub documentation. You can find this specific tutorial inside the ibmdotcom-tutorials repo under the generative AI directory.
This tutorial uses a Jupyter Notebook to demonstrate abstractive text summarization with pretrained transformer models from Hugging Face. Jupyter Notebooks are versatile tools that allow you to combine code, text and visualization in a single environment. You can run this notebook in your local IDE or explore cloud-based options like watsonx.ai® Runtime, which provides a managed environment for running Jupyter Notebooks.
Before we run our abstractive summarization example, we need to install a few Python libraries from Hugging Face and PyTorch. These libraries provide the tools and pretrained models needed to process text, run neural networks and generate summaries.
The pipeline function from the Hugging Face Transformers library is a ready-to-use interface for running machine learning models on common tasks. For abstractive summarization, it automatically loads the proper tokenizer and pretrained model for summarizing text and handles all the steps from preprocessing to output generation. This function allows us to generate summaries with just a few lines of code.
This step sets up the summarization pipeline by using the pretrained BART model.
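A minimal sketch of this setup might look like the following. Several details are assumptions rather than facts from this tutorial: the checkpoint name `facebook/bart-large-cnn` is one commonly used BART summarization model, the helper name `build_summarizer` is hypothetical, the packages are assumed to be installed (for example with `pip install transformers torch`), and the installed Transformers version is assumed to provide the `summarization` pipeline task.

```python
def build_summarizer(model_name="facebook/bart-large-cnn"):
    """Create a Hugging Face summarization pipeline for the given checkpoint."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import pipeline
    return pipeline("summarization", model=model_name)

# In a notebook cell (downloads the model weights on first use):
# summarizer = build_summarizer()
```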
The summarizer object now contains the tokenizer, model and all necessary post-processing so you can input text and get a summary. This cell might take a few minutes to download the model.
Other summarization models in the Hugging Face Transformers library can be used by changing the model argument. Different models might vary in speed, summary length and output style.
Let’s start with a simple example of abstractive summarization itself. You can replace this with any text you want, such as an article excerpt, a blog post or your own notes. For best results, keep it readable.
This variable text will be passed into the summarization pipeline in the next step.
In this step, we pass our prepared text (text) into the summarizer pipeline. The model reads the input text, identifies the main points and generates a shorter version in its own words.
The max_length and
The summarization pipeline always returns a list of results so it can handle multiple input texts at once. Because we provide a single input here, we extract the first result from the list and print the generated text, the summary.
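The call and the result handling described above can be sketched as a small helper. The helper name `summarize` and the length values are placeholders chosen for illustration, not the tutorial’s settings. `max_length`, `min_length` and `do_sample` are generation parameters that the pipeline forwards to the model, and `summary_text` is the key under which the summarization pipeline returns each summary.

```python
def summarize(summarizer, text, max_length=60, min_length=20):
    """Run a summarization pipeline on one text and return the summary string."""
    results = summarizer(
        text,
        max_length=max_length,  # upper bound on summary length, in tokens
        min_length=min_length,  # lower bound on summary length, in tokens
        do_sample=False,        # deterministic decoding instead of sampling
    )
    # The pipeline returns a list with one dict per input text.
    return results[0]["summary_text"]

# With a pipeline object named `summarizer` and an input string named `text`:
# print(summarize(summarizer, text))
```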
An example of the kind of output that you might see after running the summarization pipeline on the original text is:
This summary illustrates abstractive summarization because the model does not simply copy sentences from the input. Instead, it paraphrases and condenses the original context, expressing the main ideas by using different wording and sentence structure. While the meaning is preserved, the phrasing is new.
Try modifying the input text or adjusting the parameters to see how the summary changes. You can also experiment with longer paragraphs or different writing styles to observe how the model adapts.
The
For example, this example is the output generated from the same input text, but with
This tradeoff between consistency (
While models like BART or T5 generate fluent and concise summaries, they have some practical limitations:
Research has proposed methods to address these issues. For example, Zhang et al. (2020) developed methods to measure and optimize factual correctness, making summaries more reliable, particularly in domains like radiology reports.7
To assess quality, generated outputs are compared with a reference summary by using automatic evaluation metrics. Common evaluation metrics include ROUGE (measuring the overlap of n-grams between generated and reference summaries), BLEU (originally developed for machine translation) and METEOR (which considers synonymy and stemming). These metrics provide quantitative ways to evaluate how well the generated summary preserves content and meaning.
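As an illustration of n-gram overlap, the following sketch computes a simplified ROUGE-N score with whitespace tokenization. It is an assumption-laden approximation for teaching purposes: it applies no stemming or other normalization beyond lowercasing, and the function names are hypothetical. For reported results, an established evaluation package is the safer choice.

```python
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(candidate, reference, n=1):
    """Simplified ROUGE-N: precision, recall and F1 over clipped n-gram matches."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # each n-gram counts at most as often as in both
    recall = overlap / max(sum(ref.values()), 1)
    precision = overlap / max(sum(cand.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_n("the cat sat", "the cat sat on the mat", n=1)
# recall is 0.5 here: 3 of the 6 reference unigrams appear in the candidate
```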
In this notebook, we explored abstractive text summarization and how modern transformer-based models can generate concise, human-like summaries by understanding and rephrasing the original text. Using the Hugging Face pipeline API and a pretrained BART model, we were able to move from raw input text to a meaningful summary with just a few lines of code.
Unlike extractive summarization, which selects and reuses existing sentences, abstractive summarization creates new text that captures the core ideas of the source. This makes the approach more flexible and natural-sounding, but also more computationally intensive. By working through each step, you’ve seen how these systems work in practice.
If you’re interested in exploring other approaches to text summarization, in particular extractive methods, check out this Python text summarization tutorial that covers classic techniques such as Luhn, LexRank and Latent Semantic Analysis (LSA). Comparing extractive and abstractive approaches side by side can help deepen your understanding of when and why each method is used.
