The explosion of digital content has produced countless variations of document formats and layouts, along with new input channels whose quality and legibility vary widely. One might be in the back seat of a ride share, trying to take a picture of a utility bill in order to apply for a time-sensitive parking permit. Or one might be exchanging emails with a patient, trying to process a healthcare claim while working from a remote home office. In 2018, Forbes stated the prior two years had generated 90% of the world’s data. One can only imagine how much that accelerated in 2020 between remote work, telemedicine, digital social engagements and more.
In addition to the explosion of digital content and input channels, existing capture technology and techniques can’t scale anymore. For example, fingerprint functionality has been used to specify recognition zones and positional information in order to extract the precise data needed from a specific document format or from close matches to it. However, with so many unique document formats emerging from new social or economic programs or new B2B relationships, setting up these fingerprints takes time away from closing business, improving the economy or advancing the social welfare of citizens. Additionally, separator sheets, such as header pages or barcodes that identify the components of an application, are not effective when input arrives through different channels like mobile, email and online forms.
The result is that organizations are spending more and more time manually processing documents, and the poor image quality of the fax machine is no longer the only thing to blame. A 2019 survey conducted by Levvel Research found that 57% of invoice data is entered manually and 49% of invoice approvals required two to three approvers.
While artificial intelligence (AI) is not new, it has been difficult for organizations to use it successfully for processing semi-structured and unstructured documents. Using AI has required significant data science skills and thousands of sample documents to train models. This, in turn, has meant long cycles of collecting documents and data before any business benefit is realized.
However, advances in AI and simpler tooling have accelerated its use for document processing. First, deep learning algorithms have emerged that begin to mimic the thinking of a human brain. These algorithms can identify contextual patterns to make sense of unstructured information (like the contents of a document) and apply that learning to documents they haven’t seen before, a capability known as transfer learning. This helps shorten both the document collection process and the training cycles. Second, no-code tools with simple step-by-step guides make it easy for business users to train AI models, format or convert data output and customize business risk tolerance.
While the implementation of intelligent document processing and use of AI models may differ by vendor, the core activities remain the same:
First, document classification is the task by which you identify document types, such as invoices or tax forms. Using a set of sample documents, one can train an AI classification model on the different document types and the fields and values that correspond with those document types. This activity not only feeds into the next activity of data extraction, but also enables transfer learning for other, similar document types and facilitates better search of documents within content repositories.
Next, intelligent data extraction is the core activity whereby important, relevant information is pulled off the page. This consists of identifying key and value pairs like an account number or amount owed, defining what the data should look like and where it might be on the page and training the AI models for the relevant information within each of the different document types. In this step, there may also be metadata extracted and associated with the document in order to ease search later.
Finally, data output consists of both enriching the data extracted and creating the final output file for use downstream. AI-based models can be used to autocorrect common misspellings, convert data into standard output formats (e.g., a telephone number) and format data to look consistent (e.g., two decimal places for dollar values). The last step is to create the output file — typically a JSON file — which can then feed a workflow or push to a content repository for use later.
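To make the three activities above concrete, here is a minimal sketch in Python. It is not how any vendor implements intelligent document processing: the keyword lists and regular expressions are hypothetical stand-ins for trained AI models, and every name in it (`DOC_TYPE_KEYWORDS`, `FIELD_PATTERNS`, `classify`, `extract`, `enrich`, `process`) is an assumption made for illustration. What it does show is the flow of classification, then key and value extraction, then enrichment and JSON output.

```python
import json
import re

# Hypothetical keyword lists standing in for a trained classification model.
DOC_TYPE_KEYWORDS = {
    "invoice": ("invoice", "amount owed", "bill to"),
    "tax_form": ("taxable income", "filing status", "withholding"),
}

# Hypothetical key/value patterns standing in for a trained extraction model.
FIELD_PATTERNS = {
    "invoice": {
        "account_number": r"Account\s*(?:Number|No\.?)\s*:\s*([\w-]+)",
        "amount_owed": r"Amount\s+Owed\s*:\s*\$?([\d,]+(?:\.\d+)?)",
        "phone": r"Phone\s*:\s*([\d\s().+-]+)",
    },
}


def classify(text):
    """Activity 1: pick the document type whose keywords occur most often."""
    lowered = text.lower()
    scores = {
        doc_type: sum(lowered.count(keyword) for keyword in keywords)
        for doc_type, keywords in DOC_TYPE_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract(text, doc_type):
    """Activity 2: pull key and value pairs for the given document type."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.get(doc_type, {}).items():
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            fields[name] = match.group(1).strip()
    return fields


def enrich(fields):
    """Activity 3a: normalize values into consistent output formats."""
    out = dict(fields)
    if "amount_owed" in out:
        # Two decimal places for dollar values.
        out["amount_owed"] = f"{float(out['amount_owed'].replace(',', '')):.2f}"
    if "phone" in out:
        # Standard format for 10-digit telephone numbers; others pass through.
        digits = re.sub(r"\D", "", out["phone"])
        if len(digits) == 10:
            out["phone"] = f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"
    return out


def process(text):
    """Activity 3b: build the JSON output for a workflow or repository."""
    doc_type = classify(text)
    fields = enrich(extract(text, doc_type))
    return json.dumps({"document_type": doc_type, "fields": fields}, indent=2)


sample = """INVOICE
Account Number: AC-10293
Amount Owed: $1,250.5
Phone: (555) 010-4477
"""
print(process(sample))
```

In a real deployment the classifier and extractor would be trained models rather than hand-written rules; the sketch only illustrates how the output of each activity feeds the next.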
A major beneficiary of intelligent document processing is process automation, whereby structured data that has already been validated can be fed into transactions, enabling faster processing and scalable operations. For example, the manual setup of a workflow, data entry and data validation may previously have taken a human worker hours. An integration between intelligent document processing and workflow can eliminate these manual steps, and data output can automatically be pushed into a business process. Similarly, bad data fed into a robotic process automation (RPA) bot can result in a faulty next step, which can lead to either a bottleneck or an error in a business process. By leveraging the continuous, validated output from intelligent document processing, an RPA bot can scale throughout an organization more easily. Finally, visualization dashboards can empower business users to uncover patterns and insights related to extracted data or bottlenecks in business processes, which can lead to more informed decision-making.
To learn more about the role of RPA in automation, see “The Art of Automation: Chapter 2 — Robotic Process Automation (RPA).”
There is strong evidence of demand for automating document processing, and the combination of artificial intelligence (AI) with low-code tools will help organizations improve worker productivity and drive business performance.
In fact, in working with our own IBM clients, we’ve uncovered a number of use cases where intelligent document processing can be applied. We’ll walk through three use case examples below and the potential benefits an organization may realize.
The quote and approval process for commercial insurance is very competitive: the first company to respond with a quote often wins the business. The challenge is that in many insurance companies, this process requires manually reviewing applications, entering application data and reading supporting documentation, making it difficult to compete or scale. This also takes agents’ focus away from advisory services, which are needed to retain and grow existing business. Intelligent document processing can automate this process using AI with deep learning to read and classify each document type and extract the appropriate data from these different formats. The extracted data can then be connected to a workflow to accelerate business processing, produce the quote and approve the application.
Three potential benefits of applying intelligent document processing are as follows:
Enrollment for dozens of local government programs (such as food assistance or subsidized housing) requires inefficient, manual spreadsheet processing because IT teams do not have the resources to build the required solutions. Using low-code tools and intelligent document processing, business users can build simple, yet fit-for-purpose processing applications and train the system to recognize key fields from enrollment forms. In addition, easy-to-configure validators can ensure date fields and currency fields are accurately recognized, and simple, custom validators can also be created to handle unique fields like a social security number.
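As a rough illustration of the validators described above, the following Python sketch checks a date field, a currency field and a social security number. The function names, the accepted date formats and the field names (`date_of_birth`, `monthly_income`, `ssn`) are all hypothetical, and the social security check looks only at the shape of the value (nine digits with optional dashes), not at whether the number was ever issued.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation


def validate_date(value, formats=("%m/%d/%Y", "%Y-%m-%d")):
    """Return the date in ISO format, or None if no accepted format matches."""
    for fmt in formats:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def validate_currency(value):
    """Return a non-negative amount with two decimal places, or None."""
    cleaned = value.strip().replace("$", "").replace(",", "")
    try:
        amount = Decimal(cleaned)
    except InvalidOperation:
        return None
    if not amount.is_finite() or amount < 0:
        return None
    return f"{amount:.2f}"


# Custom validator: format check only (assumption: 9 digits, optional dashes).
SSN_PATTERN = re.compile(r"^(\d{3})-?(\d{2})-?(\d{4})$")


def validate_ssn(value):
    """Return the number as NNN-NN-NNNN, or None if the shape is wrong."""
    match = SSN_PATTERN.match(value.strip())
    return "-".join(match.groups()) if match else None


# Hypothetical mapping from enrollment form fields to validators.
VALIDATORS = {
    "date_of_birth": validate_date,
    "monthly_income": validate_currency,
    "ssn": validate_ssn,
}


def validate_form(fields):
    """Split extracted fields into cleaned values and names that failed."""
    clean, errors = {}, []
    for name, raw in fields.items():
        validator = VALIDATORS.get(name)
        result = validator(raw) if validator else raw
        if result is None:
            errors.append(name)
        else:
            clean[name] = result
    return clean, errors


clean, errors = validate_form({
    "date_of_birth": "03/14/1985",
    "monthly_income": "$1,200.5",
    "ssn": "123456789",
})
print(clean, errors)
```

Fields that fail validation are returned by name so that a business user can review just those values instead of the whole form.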
Three potential benefits of applying intelligent document processing are as follows:
A bank can have over 20 different account servicing forms available for download from its website. Account holders use these forms to make changes to accounts or close accounts. Today, this can require a sizable team of agents to read these forms, verify the data and then enter the data into an account management system. However, with low-code tools and intelligent document processing, the bank can rapidly build solutions to process each account servicing form and train the system on each form so that it recognizes not only common fields like customer address and account number, but also the fields unique to each form.
By combining with RPA, the bank can also take the extracted data and automate the changes into the bank’s backend systems. Additionally, leveraging intelligent document classification, account closing forms can quickly be flagged and agents alerted to clients that may be potential flight risks.
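The routing idea in this use case can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `ClassifiedForm` structure, the `account_closing` label, the queue names and the 0.80 confidence threshold are hypothetical and would depend on the classification model and the bank's own risk tolerance.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedForm:
    """Hypothetical output of the document classification step."""
    account_number: str
    form_type: str
    confidence: float  # classifier confidence between 0.0 and 1.0


# Hypothetical risk tolerance; below this, a person reviews the form.
CONFIDENCE_THRESHOLD = 0.80


def route(form):
    """Decide which queue a classified account servicing form goes to."""
    if form.confidence < CONFIDENCE_THRESHOLD:
        return "manual_review"
    if form.form_type == "account_closing":
        # Alert an agent: this client may be a potential flight risk.
        return "retention_alert"
    # Other forms can be handed to an RPA bot for the backend update.
    return "automated_update"


forms = [
    ClassifiedForm("1001", "address_change", 0.97),
    ClassifiedForm("1002", "account_closing", 0.93),
    ClassifiedForm("1003", "account_closing", 0.55),
]
for form in forms:
    print(form.account_number, route(form))
```

Checking confidence first means a poorly recognized closing form still reaches a person rather than triggering an alert or an automated change on uncertain data.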
Three potential benefits of applying intelligent document processing are as follows:
IBM’s approach to intelligent document processing surfaces in our IBM Cloud Pak® for Business Automation. A cloud-native solution, Automation Document Processing is a set of AI-powered services that automatically reads and corrects data from documents. A document processing designer provides an easy-to-use no-code interface for training models on document classification, data extraction and data enrichments.
In addition, IBM provides document processing application templates that can be used for processing either single-page documents or batches of documents. Toolkits in the Application Designer can also be used to customize the end-user application to look and feel like other applications within an organization. Finally, IBM provides simple deployment tools and an out-of-the-box integration with its content services capabilities, IBM FileNet Content Manager, for both storing the document(s) and data output file.
While this chapter gave an overview of how document processing has been ripe for change and where AI is playing a major part in advancing it, there is more innovation to come in this space. There are two key areas, in particular, to keep an eye on. First, as the formats and structures of semi-structured and unstructured documents continue to explode, AI models will need to keep up. From reading highly complex table structures to processing government-issued IDs with holograms or watermarks, AI models will be challenged to remain accurate.
Second, while this space has been coined intelligent document processing, video and audio file types are on the rise. It is only a matter of time before these file types are in the critical path for processing of insurance claims or filing of police incident reports.
Stick around for the ride; it is sure to be exciting.
Make sure you check out The Art of Automation podcast, especially Episode 7, in which I sit down with Jerry Cuomo to discuss intelligent document processing.