Combining IBM RPA and IBM Datacap to automate desktop tasks that require data extraction from documents

How to use RPA and data capture technology together to automate data entry from documents

Paul Pacholski, IBM Canada Lab (

Larissa Auberger, IBM Germany (


To stay ahead in a highly competitive environment, businesses focus on speed, savings, and customer experience. In this article we show how these objectives can be supported in sales order processing. A solution that leverages IBM Datacap and IBM Robotic Process Automation with Automation Anywhere (IBM RPA) accelerates sales order processing by eliminating manual touch points, lowers processing costs by reducing errors, and improves customer relationships by removing delays in shipping orders.

As an example, in this article we describe how to automate SAP sales order processing. The automation is achieved in two steps:

  1. IBM Datacap is used to automatically extract data from documents of variable format
  2. An RPA software robot is used to automatically enter the structured sales order data into SAP

As-is Order Processing Scenario

Susan is an assistant in a distribution plant of a major company. Every day, Susan receives sales order documents from the sales staff. The sales orders describe the order details, include customer information and shipping addresses, and lay out conditions and agreements to be met.

For each sales order, Susan manually transfers the data from the sales order letter to the SAP application. This manual process may include time-consuming and tedious activities, such as searching for SAP sales organization codes or customer location codes that are not found in the sales order letter. It can take far too long before Susan, frustrated and annoyed, finally manages to create an electronic sales order in SAP and kick off the delivery process.

Moreover, as the company grows and new products are launched, more and more sales orders come in. At some point, Susan reaches the limit of her capacity. As a result, she no longer meets performance expectations and order entry mistakes creep in.

To-be Order Processing Scenario

To optimize performance, improve task management, and support the company’s growth, let’s look at how Susan’s work can be automated in two stages.

Stage 1: Entering IBM Datacap

IBM Datacap streamlines the capturing and recognition of documents. Among various methods, it uses natural language processing, text analysis and machine learning techniques to extract information from unstructured documents of variable formats. In addition to scanning, it also automates the acquisition and processing of image or electronically-generated documents by monitoring multiple input channels such as standard file systems, Box folders, FileNet repositories, email inboxes, faxes, or multi-function devices.

In our scenario, this is done in several steps:

  • A mailroom operator initiates order processing by scanning incoming sales orders.
  • The scanned documents are submitted to IBM Datacap for automatic data extraction.
  • IBM Datacap provides high-quality recognition to capture the content of the scanned document. It analyzes the document and extracts order-related information from its unstructured content.
  • IBM Datacap handles order document formats that may vary widely.
  • IBM Datacap brings in a human when extracted data needs to be reviewed or verified. Machine learning techniques improve recognition in subsequent scans.
  • Company codes are automatically prepopulated by querying the CRM.
  • Extracted and (if needed) verified order details are exported to an MS Excel file as structured data.
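
The last three steps above (human review, code lookup, structured export) can be sketched in a few lines of Python. This is only an illustration of the idea, not Datacap's API: the `ExtractedOrder` fields, the `CRM_CODES` table, the `REVIEW_THRESHOLD` value, and the function names are all assumptions made for the example, and CSV stands in for MS Excel so that the sketch needs nothing beyond the standard library.

```python
import csv
from dataclasses import dataclass, asdict

# Hypothetical CRM lookup table: customer name -> SAP codes (illustrative values only).
CRM_CODES = {
    "Acme Corp": {"sales_org": "1000", "customer_code": "C-0042"},
}

REVIEW_THRESHOLD = 0.85  # assumed cut-off below which a human verifies the record


@dataclass
class ExtractedOrder:
    customer: str
    material: str
    quantity: int
    confidence: float  # assumed recognition confidence in [0, 1]


def enrich(order: ExtractedOrder) -> dict:
    """Add company codes from the CRM table and flag records needing review."""
    row = asdict(order)
    codes = CRM_CODES.get(order.customer)
    row["sales_org"] = codes["sales_org"] if codes else ""
    row["customer_code"] = codes["customer_code"] if codes else ""
    # Route to a human if recognition was uncertain or the CRM lookup failed.
    row["needs_review"] = order.confidence < REVIEW_THRESHOLD or codes is None
    return row


def export_orders(orders, path):
    """Write the enriched orders to a CSV file as structured data."""
    rows = [enrich(o) for o in orders]
    fieldnames = ["customer", "material", "quantity", "confidence",
                  "sales_org", "customer_code", "needs_review"]
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `export_orders([ExtractedOrder("Acme Corp", "M-1", 5, 0.95)], "orders.csv")` would write one row with the looked-up codes filled in. A record with an unknown customer or a low confidence value is still exported, but with `needs_review` set so that it can be held back for verification.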

Stage 2: Entering IBM RPA

IBM RPA robots are designed to perform a single unit of work. In our scenario, an RPA bot is triggered when an MS Excel file is created in a specified directory. The RPA bot creates sales orders in SAP just as a human would, but faster and without manual entry errors:

  • The RPA bot opens the MS Excel file.
  • It starts the SAP GUI and logs in to the system.
  • The bot transfers sales order information from the MS Excel file to SAP sales order transaction screens.
  • It records the order number back to the MS Excel file for each created order.
  • Finally, the bot closes the SAP GUI and moves the updated MS Excel file to another folder to indicate orders have been created.
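
The control flow of the bot steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not how an IBM RPA bot is actually built: `create_sales_order` is a hypothetical stand-in for the SAP GUI data entry, the column names `customer_code` and `material` are invented for the example, and CSV replaces MS Excel to keep the sketch within the Python standard library.

```python
import csv
import shutil
from pathlib import Path


def create_sales_order(row: dict) -> str:
    """Hypothetical stand-in for the bot's SAP GUI data entry.

    A real bot would drive the SAP sales order screens; here we only
    derive a fake order number so the sketch runs without SAP.
    """
    return f"SO-{row['customer_code']}-{row['material']}"


def process_file(path: Path, done_dir: Path) -> Path:
    """Create an order per row, record the order number, then move the file."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        rows = list(reader)

    # Enter each order and keep the returned order number with its row.
    for row in rows:
        row["order_number"] = create_sales_order(row)

    # Write the order numbers back to the same file.
    if "order_number" not in fieldnames:
        fieldnames.append("order_number")
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

    # Moving the file signals that its orders have been created.
    done_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(path), str(done_dir / path.name)))


def run_once(inbox: Path, done_dir: Path) -> list:
    """Process every order file currently waiting in the inbox folder."""
    return [process_file(p, done_dir) for p in sorted(inbox.glob("*.csv"))]
```

Running `run_once(Path("inbox"), Path("processed"))` on a schedule approximates the file-creation trigger described above. Moving each file only after its order numbers are written back means a file still sitting in the inbox has not been fully processed.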


In this article we showed how businesses can benefit when IBM Datacap works in concert with IBM RPA.
As an example, we took a closer look at a sales order process. In this scenario, IBM Datacap intelligently extracts order data from unstructured order letters, and IBM RPA automatically creates the orders in SAP.
As you have seen, this combination limits human intervention to tasks that require judgement. The automated approach to processing sales orders not only makes people happier and frees up their time for more important tasks, but also lets the business absorb a growing volume of incoming orders and thus supports growth.

Use case scenarios aren’t limited to sales order processing. Consider a combined IBM Datacap and IBM RPA solution any time you have a process that starts with an unstructured document and continues with structured steps that update applications or systems of record.

