Organizations are spending more and more time manually processing documents, and the poor image quality of the fax machine is no longer the only thing to blame.

Covered in this chapter

  • The rise of AI in document processing
  • Intelligent document processing is designed to extract business-critical data, enabling better, faster decision-making and driving business performance
  • Three examples of intelligent document processing
  • IBM and intelligent document processing


Document processing is ripe for change

The explosion of digital content has resulted in so many variations of document formats and layouts as well as new input channels with varying quality or ability to be understood. One might be in the back seat of a ride share, trying to take a picture of a utility bill in order to apply for a time-sensitive parking permit. Or one might be exchanging emails with a patient, trying to process a healthcare claim while working from a remote home office. In 2018, Forbes stated the prior two years had generated 90% of the world’s data. One can only imagine how much that accelerated in 2020 between remote work, telemedicine, digital social engagements and more.

In addition to the explosion of digital content and input channels, existing capture technology and techniques can’t scale anymore. For example, fingerprint functionality has been used to specify recognition zones and positional information in order to extract the precise data needed on specific document formats or matches of similar kinds. However, with so many unique document formats emerging from new social or economic programs or new B2B relationships, setting these up takes time away from closing business, improving the economy or progressing the social welfare of citizens. Additionally, separator sheets like headers or barcodes to identify components of an application are not effective when input comes from different channels like mobile, email and online forms.

The result is that organizations are spending more and more time manually processing documents, and the poor image quality of the fax machine is no longer the only thing to blame. A 2019 survey conducted by Levvel Research found 57% of invoice data is entered manually and 49% of invoice approvals required two to three approvers.

Embracing AI for document processing

While artificial intelligence (AI) is not new, it has been difficult for organizations to successfully use for processing of semi-structured and unstructured documents. Using AI has required significant data science skills and thousands of sample documents to train models. This, in turn, has resulted in long cycles to collect documents and data in order to realize business benefits.

However, advances in AI and simpler tooling have accelerated its use in document processing. First, deep learning algorithms have emerged, which begin to mimic the thinking of a human brain. These algorithms can identify valid contextual patterns to gain understanding of unstructured information (like the contents of a document) and apply that learning to things they haven’t seen before, which is called transfer learning. This helps shorten the document collection process and the long training cycles. Second, no-code tools with simple step-by-step guides make it easy for business users to train AI models, format or convert data output and customize business risk tolerance.

Three main activities of intelligent document processing

While the implementation of intelligent document processing and use of AI models may differ by vendor, the core activities remain the same:

  1. Document classification
  2. Data extraction
  3. Data output

First, document classification is the task by which you identify document types, such as invoices or tax forms. Using a set of sample documents, one can train an AI classification model on the different document types and the fields and values that correspond with those document types. This activity not only feeds into the next activity of data extraction, but also enables transfer learning for other, similar document types and facilitates better search of documents within content repositories.

Next, intelligent data extraction is the core activity whereby important, relevant information is pulled off the page. This consists of identifying key and value pairs like an account number or amount owed, defining what the data should look like and where it might be on the page and training the AI models for the relevant information within each of the different document types. In this step, there may also be metadata extracted and associated with the document in order to ease search later.
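As a rough illustration of key and value pair extraction, the sketch below uses hand-written regular expressions as a simple stand-in for a trained AI model. The field names, label wording and patterns are assumptions made for this example; a real intelligent document processing system would learn these from sample documents rather than from fixed rules.

```python
import re

# Hypothetical patterns describing "what the data should look like" for two fields.
FIELD_PATTERNS = {
    "account_number": re.compile(
        r"Account\s*(?:Number|No\.?|#)\s*[:\-]?\s*(\d{6,12})", re.IGNORECASE
    ),
    "amount_owed": re.compile(
        r"Amount\s*(?:Owed|Due)\s*[:\-]?\s*\$?\s*([\d,]+\.\d{2})", re.IGNORECASE
    ),
}


def extract_fields(text: str) -> dict:
    """Return the first match for each known field found in the page text."""
    found = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found


if __name__ == "__main__":
    page_text = "Account Number: 00123456\nAmount Due: $1,204.50"
    print(extract_fields(page_text))
```

Fixed patterns like these break as soon as a label or layout changes, which is the scaling problem that trained models are meant to address.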

Finally, data output consists of both enriching the data extracted and creating the final output file for use downstream. AI-based models can be used to autocorrect common misspellings, convert data into standard output formats (e.g., a telephone number) and format data to look consistent (e.g., two decimal places for dollar values). The last step is to create the output file — typically a JSON file — which can then feed a workflow or push to a content repository for use later.
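The enrichment and output step can be sketched in a few lines. This is a minimal example under stated assumptions: the field names (`phone`, `amount_owed`), the US phone format and the JSON layout are all hypothetical choices for illustration, and plain formatting rules stand in for the AI-based models described above.

```python
import json
import re
from decimal import Decimal, ROUND_HALF_UP


def normalize_phone(raw: str) -> str:
    """Format a 10-digit US number as (NNN) NNN-NNNN; return other input unchanged."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading US country code
    if len(digits) != 10:
        return raw  # not clearly a US number, so do not guess
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def format_amount(raw: str) -> str:
    """Strip symbols and separators, then round to two decimal places.

    Raises decimal.InvalidOperation if no usable number remains.
    """
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    return str(Decimal(cleaned).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))


def build_output(doc_type: str, fields: dict) -> str:
    """Enrich known fields and serialize the result as a JSON string."""
    enriched = dict(fields)
    if "phone" in enriched:
        enriched["phone"] = normalize_phone(enriched["phone"])
    if "amount_owed" in enriched:
        enriched["amount_owed"] = format_amount(enriched["amount_owed"])
    return json.dumps({"document_type": doc_type, "fields": enriched}, indent=2)


if __name__ == "__main__":
    print(build_output(
        "utility_bill",
        {"phone": "555.867.5309", "amount_owed": "$1,234.5", "account_number": "00123456"},
    ))
```

Amounts are kept as strings built from `Decimal` values so that the two-decimal formatting survives serialization and avoids binary floating-point rounding.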

Data output from intelligent document processing to drive process automation

A major beneficiary of intelligent document processing is process automation, whereby structured data that has already been validated can be fed into transactions, enabling faster processing and scalable operations. For example, manually setting up a workflow, entering data and validating it may previously have taken a human worker hours. An integration between intelligent document processing and workflow can eliminate these manual steps, and data output can automatically be pushed into a business process. Similarly, bad data fed into a robotic process automation (RPA) bot can result in a faulty next step, which can lead to either a bottleneck or an error in a business process. Leveraging the continuous output from intelligent document processing, an RPA bot can scale throughout an organization more easily. Finally, visualization dashboards can empower business users to uncover patterns and insights related to data extracted or bottlenecks in business processes, which can lead to more informed decision-making.

To learn more about the role of RPA in automation, see “The Art of Automation: Chapter 2 — Robotic Process Automation (RPA).”

Examples of intelligent document processing

There is strong evidence of demand for automating document processing, where the combination of artificial intelligence (AI) and low-code tools helps organizations improve worker productivity and drive business performance.

In fact, in working with our own IBM clients, we’ve uncovered a number of use cases where intelligent document processing can be applied. We’ll walk through three use case examples below and the potential benefits an organization may realize.

  • Insurance: Account opening and servicing, personal and commercial claims
  • Government: Social services enrollment and eligibility, pension and retirement plans, permits and licenses
  • Banking: Account opening and servicing, mortgage/loan application

Quote and approval process application for commercial insurance

The quote and approval process for commercial insurance is very competitive, where the first company to respond with a quote often wins the business. The challenge is that in many insurance companies, this process requires manual review, entry of application data and reading supporting documentation, making it difficult to compete or scale. This also takes agents’ focus away from advisory services, which are needed to retain and grow existing business. Intelligent document processing can automate this process using AI with deep learning to read and classify each document type and extract the appropriate data from these different formats. The extracted data can then be connected to a workflow to accelerate business processing to produce the quote and approve the application.

Three potential benefits of applying intelligent document processing are as follows:

  1. Increased revenue from more business closed without the need to add staff.
  2. Improved customer experience with increased processing speeds.
  3. Retention and growth of existing customer accounts.

Social services enrollment-processing application

Enrollment for dozens of local government programs, such as food assistance or subsidized housing, requires inefficient, manual spreadsheet processing because IT teams do not have the resources to build the required solutions. Using low-code tools and intelligent document processing, business users can build simple, yet fit-for-purpose processing applications and train the system to recognize key fields from enrollment forms. In addition, easy-to-configure validators can ensure date fields and currency fields are accurately recognized, and simple, custom validators can also be created to handle unique fields like a social security number.
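To make the idea of field validators concrete, here is a minimal sketch of a date validator, a currency validator and a custom social security number validator. The function names, the field names and the MM/DD/YYYY date format are assumptions chosen for illustration; they do not reflect how any particular product configures its validators.

```python
import re
from datetime import datetime

# Optional dollar sign, digits with or without thousands separators, optional cents.
CURRENCY_RE = re.compile(r"\$?(\d{1,3}(,\d{3})+|\d+)(\.\d{2})?")

# NNN-NN-NNNN, rejecting area 000, 666 and 900-999, group 00 and serial 0000.
SSN_RE = re.compile(r"(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}")


def is_valid_date(value: str, fmt: str = "%m/%d/%Y") -> bool:
    """True if the value parses as a real calendar date in the given format."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False


def is_valid_currency(value: str) -> bool:
    return CURRENCY_RE.fullmatch(value) is not None


def is_valid_ssn(value: str) -> bool:
    return SSN_RE.fullmatch(value) is not None


# Hypothetical mapping from enrollment form fields to their validators.
VALIDATORS = {
    "date_of_birth": is_valid_date,
    "monthly_income": is_valid_currency,
    "ssn": is_valid_ssn,
}


def find_invalid_fields(fields: dict) -> list:
    """Return the sorted names of extracted fields that fail their validator."""
    return sorted(
        name for name, check in VALIDATORS.items()
        if name in fields and not check(fields[name])
    )


if __name__ == "__main__":
    form = {"date_of_birth": "02/30/1990", "monthly_income": "$1,250.00", "ssn": "123-45-6789"}
    print(find_invalid_fields(form))
```

Fields that fail validation would typically be routed to a person for correction rather than passed downstream.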

Three potential benefits of applying intelligent document processing are as follows:

  1. Increased program enrollment due to faster turnaround times.
  2. Cost effective rollout of custom automation solutions with appropriate role-based viewing of personally identifiable information.
  3. Built by business users with little to no involvement from IT.

Account servicing for personal banking

A bank can have over 20 different account servicing forms available for download from its website. Account holders use these forms to make changes to accounts or close accounts. Today, this can require a sizable team of agents to read these forms, verify the data and then enter the data into an account management system. However, with low-code tools and intelligent document processing, the bank can rapidly build solutions to process each account servicing form and train the system on each form in order to recognize not only common fields like customer address and account number, but also the fields unique to each form.

By combining with RPA, the bank can also take the extracted data and automate the changes into the bank’s backend systems. Additionally, leveraging intelligent document classification, account closing forms can quickly be flagged and agents alerted to clients that may be potential flight risks.
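The routing described above can be sketched as a small decision function. Everything here is a hypothetical illustration: the `ClassifiedForm` structure, the document type labels, the queue names and the 0.8 confidence threshold are assumptions, with the threshold standing in for a configurable business risk tolerance.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedForm:
    account_id: str
    doc_type: str      # label assigned by the classification step
    confidence: float  # classifier confidence between 0.0 and 1.0


def route(form: ClassifiedForm, threshold: float = 0.8) -> str:
    """Pick the next step for a classified account servicing form."""
    if form.confidence < threshold:
        return "manual_review"    # too uncertain to automate
    if form.doc_type == "account_closing":
        return "retention_alert"  # notify an agent about a possible flight risk
    return "rpa_update"           # hand the extracted data to an RPA bot


if __name__ == "__main__":
    forms = [
        ClassifiedForm("A-100", "address_change", 0.97),
        ClassifiedForm("A-200", "account_closing", 0.91),
        ClassifiedForm("A-300", "account_closing", 0.55),
    ]
    for form in forms:
        print(form.account_id, route(form))
```

Checking confidence first means an uncertain classification never triggers an automated change or an alert on its own.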

Three potential benefits of applying intelligent document processing are as follows:

  1. Improved customer experience with faster response times.
  2. Better customer retention with intelligent flight risk identification.
  3. Reduced retail banking costs on a per-account basis.

IBM and intelligent document processing

IBM’s approach to intelligent document processing surfaces in our IBM Cloud Pak® for Business Automation. A cloud-native solution, Automation Document Processing is a set of AI-powered services that automatically reads and corrects data from documents. A document processing designer provides an easy-to-use no-code interface for training models on document classification, data extraction and data enrichments.

Figure 1: Automation document processing.

In addition, IBM provides document processing application templates that can be used for processing either single-page documents or batches of documents. Toolkits in the Application Designer can also be used to customize the end-user application to look and feel like other applications within an organization. Finally, IBM provides simple deployment tools and an out-of-the-box integration with its content services capabilities, IBM FileNet Content Manager, for storing both the document(s) and the data output file.

The future of intelligent document processing

While this chapter gave an overview of how document processing has been ripe for change and where AI is playing a major part in advancing it, there is more innovation to come in this space. There are two key areas, in particular, to keep an eye on. First, as the formats and structures of semi-structured and unstructured documents continue to multiply, AI models will need to keep up. From reading highly complex table structures to processing government-issued IDs with holograms or watermarks, AI models will be challenged to remain accurate.

Second, while this space has been coined intelligent document processing, video and audio file types are on the rise. It is only a matter of time before these file types are in the critical path for processing of insurance claims or filing of police incident reports.

Stick around for the ride; it is sure to be exciting.

Learn more

Make sure you check out The Art of Automation podcast, especially Episode 7, in which I sit down with Jerry Cuomo to discuss intelligent document processing.

Check out the other chapters in the ongoing series, The Art of Automation:

The Art of Automation: Landing Page
