What is IBM Document Processing Extension?

IBM Document Processing Extension is designed as an add-on to existing solutions such as Datacap, bringing the latest AI-based technology to document processing applications.

IBM Document Processing Extension uses AI and machine learning to classify, recognize, and extract metadata from business documents. It supports use cases such as accounts payable, claims processing, and any other business process that is driven by documents and data.

Datacap has an out-of-the-box integration with Document Processing Extension. Datacap applications send documents to Document Processing Extension for processing, and the results are returned to Datacap for the user verification and data export steps. Document Processing Extension works transparently for verification users and enables a simpler setup process for Datacap developers.

Document Processing Extension has a small footprint and runs on Docker with Docker Swarm as the orchestration tool, enabling relatively simple deployment. For more information, see Installing IBM Document Processing Extension.

Licensing

Document Processing Extension is sold on a per-page-processed basis. You can deploy Document Processing Extension to any number and size of hardware environments and have any number of users. Purchase enough page licenses to cover the pages processed in Document Processing Extension. You can monitor the page usage in Document Processing Extension. Work with your sales team to ensure that you always have enough page licenses to stay ahead of page consumption.
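As a rough illustration of staying ahead of page consumption, the following Python sketch estimates how long the remaining page licenses last and how many more pages to purchase for a target planning horizon. The function names, the monthly rate, and the sample figures are assumptions made for illustration only. They are not part of Document Processing Extension; the actual consumption figures come from the page usage monitoring in the product.

```python
def months_of_headroom(pages_purchased: int, pages_consumed: int,
                       pages_per_month: int) -> float:
    """Return how many months the remaining page licenses last at a steady rate."""
    if pages_per_month <= 0:
        raise ValueError("pages_per_month must be positive")
    # Never report a negative balance if consumption already exceeds purchases.
    remaining = max(pages_purchased - pages_consumed, 0)
    return remaining / pages_per_month


def pages_to_purchase(pages_purchased: int, pages_consumed: int,
                      pages_per_month: int, target_months: int) -> int:
    """Return the extra page licenses needed to cover target_months of processing."""
    remaining = max(pages_purchased - pages_consumed, 0)
    needed = pages_per_month * target_months
    return max(needed - remaining, 0)


if __name__ == "__main__":
    # Hypothetical figures: 1,000,000 pages purchased, 400,000 already processed,
    # and roughly 50,000 pages processed each month.
    print(months_of_headroom(1_000_000, 400_000, 50_000))
    print(pages_to_purchase(1_000_000, 400_000, 50_000, target_months=18))
```

A calculation like this can help frame the conversation with your sales team, but it assumes a constant monthly volume, which seasonal workloads might not follow.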

Features

Simplified document processing: Document Processing Extension is a stand-alone processing extension for capture application users.
Low-cost document processing: Capture application users can use Document Processing Extension instead of investing in document processing on the cloud.
Deploy on Docker Swarm: As a simplified, low-cost alternative to Kubernetes, you can deploy on a Docker Swarm environment on a single Linux machine.

Document Processing Designer for document processing

You use the Designer interface to create a set of document types and related fields that comprise your Document Processing project. Document Processing Designer combines an intuitive interface with a set of AI and deep learning tools that identify and learn the document types that matter to your organization. For each document type, you designate which pieces of information to extract as data for that document to be used by downstream applications. You can also apply tools to clean up and standardize the data as it is extracted.

You choose which documents you want to process, for example, invoices. You collect multiple electronic samples of invoices to create a document classification model. The Document Processing Designer uses this set to train the model to recognize documents as invoices.

An invoice can contain many data points. But which pieces of data are useful to you for records, searching, and integration with downstream applications? For example:

Total amount
Vendor
Date of transaction
Product ID
Customer name
Invoice number

You use Document Processing Designer to create a data extraction model by teaching it the location of each field on the document, naming the field, and creating a method for collecting and enriching the value of the field for each document. Again, you use sample documents to train the data extraction model.
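To make the idea of a named field with a collection and enrichment method more concrete, here is a minimal Python sketch. It is not how Document Processing Designer works internally: the Designer trains an AI model from sample documents, whereas this sketch substitutes simple regular expressions for the trained model. The names FieldDefinition, normalize_amount, INVOICE_FIELDS, and extract_fields are hypothetical and exist only for this illustration.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FieldDefinition:
    """A named field, a rule that locates it, and a step that cleans its value."""
    name: str
    pattern: str                                 # regex with one capture group
    normalize: Callable[[str], str] = str.strip  # enrichment/cleanup step


def normalize_amount(raw: str) -> str:
    """Drop currency symbols and thousands separators from an amount."""
    return re.sub(r"[^\d.]", "", raw)


# Two of the invoice fields listed above, with hypothetical locating rules.
INVOICE_FIELDS: List[FieldDefinition] = [
    FieldDefinition("invoice_number", r"Invoice\s*(?:No\.?|#)\s*:?\s*(\S+)"),
    FieldDefinition("total_amount",
                    r"Total\s*:?\s*([$€£]?\s*[\d,]+\.\d{2})",
                    normalize_amount),
]


def extract_fields(text: str, fields: List[FieldDefinition]) -> Dict[str, str]:
    """Apply each field definition to the text and collect the cleaned values."""
    result: Dict[str, str] = {}
    for field in fields:
        match = re.search(field.pattern, text, flags=re.IGNORECASE)
        if match:
            result[field.name] = field.normalize(match.group(1))
    return result


if __name__ == "__main__":
    sample = "Invoice No: INV-1001\nVendor: Acme\nTotal: $1,234.50"
    print(extract_fields(sample, INVOICE_FIELDS))
```

The point of the sketch is the separation of concerns: each field carries its own name, its own way of being found, and its own cleanup step, which mirrors the name, location, and enrichment parts of a field described above.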

Connecting to Datacap

After your Document Processing Extension application is ready, you can integrate it with Datacap by using the open-source ADP Connector, which works for both IBM Automation Document Processing and Document Processing Extension.