The diagram above shows the two forms of the summarization pattern. The simplest form of the pattern is the Stuff variant. In this variant:
- The contents of a document are read and 'stuffed', i.e. copied in their entirety, into an LLM prompt.
- A prompt template is commonly used to 'wrap' the content with directions and keywords to direct the target model to generate a summary.
- The resulting prompt is submitted to a trained LLM, which generates a summary in response.
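The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the template wording, the function names `build_stuff_prompt` and `stuff_summarize`, and the `fake_llm` stand-in are all assumptions made for the example. In practice `llm` would wrap a call to whatever model the application uses.

```python
from typing import Callable

# Hypothetical prompt template; the exact wording is an assumption.
SUMMARY_TEMPLATE = (
    "Write a concise summary of the following document.\n\n"
    "DOCUMENT:\n{document}\n\n"
    "SUMMARY:"
)


def build_stuff_prompt(document: str, template: str = SUMMARY_TEMPLATE) -> str:
    """'Stuff' the whole document into the prompt template."""
    return template.format(document=document)


def stuff_summarize(document: str, llm: Callable[[str], str]) -> str:
    """Build the prompt and submit it to the model in a single call."""
    return llm(build_stuff_prompt(document))


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: returns the document's first sentence."""
    body = prompt.split("DOCUMENT:\n", 1)[1].split("\n\nSUMMARY:", 1)[0]
    return body.split(".")[0].strip() + "."


if __name__ == "__main__":
    print(stuff_summarize("Cats sleep a lot. They also purr.", fake_llm))
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client library.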
The Stuff approach works well for small documents, but it fails for documents too large for the LLM's context window and for collections of documents. The Map-Reduce variant covers these situations. In the Map phase, individual documents and/or subsections of documents are stuffed into LLM prompts using the Stuff approach. In the Reduce phase, the summaries returned for the documents and/or chunks are aggregated by the application and then submitted to an LLM (4) to generate an overall summary of the larger work and/or document set. The same LLM can be used for both the Map and Reduce phases, but more often the Reduce model will need to be fine-tuned to generate aggregate summaries without losing key details.
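A rough sketch of the Map-Reduce flow follows. It assumes character-based chunking (real systems usually count tokens and split on sentence or section boundaries), and the names `chunk_text`, `map_reduce_summarize`, `map_llm`, and `reduce_llm` are hypothetical. The two model arguments are plain callables, so the same model or two different models can be passed in.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # any function mapping a prompt to a completion


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split text into consecutive pieces of at most max_chars characters."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def map_reduce_summarize(
    documents: List[str],
    map_llm: LLM,
    reduce_llm: LLM,
    max_chars: int = 2000,
) -> str:
    # Map phase: summarize every chunk of every document independently.
    partial_summaries = []
    for document in documents:
        for chunk in chunk_text(document, max_chars):
            prompt = "Summarize the following text:\n\n" + chunk
            partial_summaries.append(map_llm(prompt))

    # Reduce phase: aggregate the partial summaries and summarize them once.
    combined = "\n".join(partial_summaries)
    reduce_prompt = (
        "Combine these partial summaries into one overall summary:\n\n"
        + combined
    )
    return reduce_llm(reduce_prompt)
```

If the combined partial summaries are themselves too large for the Reduce model's context window, one option is to apply the same procedure again to the partial summaries; the sketch does not handle that case.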
Conceptually, summarization is similar to a machine translation task: we want the LLM
to 'translate' a long document into a shorter summary. Thus,
encoder-decoder models such as BART and T5 are well-suited to
summarization solutions. The majority of LLMs
suitable for summarization are trained using one or more publicly
available training sets drawn from sources such as news stories,
Wikipedia, legislation, and scientific publications, but they will generally
require fine-tuning before they can generate acceptable summaries for
targeted business processes and input data.
A complex business
process will typically require multiple fine-tuned models to generate
summaries for different user groups. For example, an insurance claims
process would potentially require LLMs
fine-tuned for claims summarization and routing, fraud detection and
investigation, and for summarization of reports from service providers
such as medical or engineering consultants.
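One way an application might organize several fine-tuned models is a small registry keyed by task. The sketch below is illustrative only: the `SummarizerRegistry` class and the task names are hypothetical, and each registered model is represented as a plain callable standing in for a fine-tuned LLM.

```python
from typing import Callable, Dict

LLM = Callable[[str], str]  # stand-in type for a fine-tuned model


class SummarizerRegistry:
    """Maps a task name to the model fine-tuned for that task."""

    def __init__(self) -> None:
        self._models: Dict[str, LLM] = {}

    def register(self, task: str, llm: LLM) -> None:
        self._models[task] = llm

    def summarize(self, task: str, document: str) -> str:
        # Fail loudly rather than fall back to a model tuned for another task.
        if task not in self._models:
            raise KeyError(f"no summarization model registered for task {task!r}")
        return self._models[task](document)


if __name__ == "__main__":
    registry = SummarizerRegistry()
    # Hypothetical task names taken from the insurance example above.
    registry.register("claims_routing", lambda doc: "routing summary: " + doc[:20])
    registry.register("fraud_investigation", lambda doc: "fraud summary: " + doc[:20])
    print(registry.summarize("claims_routing", "Water damage reported at insured property."))
```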