Configuring document processors

Configure your document processors after you add them to your workflow. You can add a pre-configured utility bill extraction activity or a document extractor, if you have one in your project. You can then add a document review activity if you need manual review.

Adding a Utility bill extraction activity

To add a Utility bill extraction activity into a workflow:

  1. From the workflow editor, under Document processors, select Utility bill extraction to add that activity into a workflow.

  2. Select Define data mapping and map the variables to the expected input and output.

    • Input: Enter a variable for the input document (content). Select a default language (for example, en for English). Do not update the confidence or review_fields values here. Set them in Set extraction rules instead.
    • Output: Enter a variable for the output structure (utility_bill).
  3. Select Set Extraction rules and set the confidence thresholds that determine when to mark a document as needing review.

  4. (Optional) Depending on your scenario, in the Variables pane, mark the document variable (content) as input.
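The confidence thresholds from step 3 can be pictured as a per-field comparison. The following is a minimal sketch of that idea only, not the product's implementation. The field names, the confidence values, the "below the threshold" rule, and the choice to flag fields with no confidence score are all assumptions made for illustration.

```python
from typing import Dict, List


def fields_needing_review(
    confidences: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Return the fields whose confidence is below their threshold.

    Assumption: a field with no confidence score counts as 0.0,
    so it is always flagged for review.
    """
    return [
        name
        for name, minimum in thresholds.items()
        if confidences.get(name, 0.0) < minimum
    ]


def review_needed(
    confidences: Dict[str, float], thresholds: Dict[str, float]
) -> bool:
    """A document needs review when at least one field is flagged."""
    return bool(fields_needing_review(confidences, thresholds))


# Hypothetical extraction result and thresholds.
confidences = {"account_number": 0.97, "total_amount": 0.62}
thresholds = {"account_number": 0.80, "total_amount": 0.80}

flagged = fields_needing_review(confidences, thresholds)
needs_review = review_needed(confidences, thresholds)
```

In this sketch, only total_amount falls below its threshold, so the document as a whole would be marked as needing review.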

Adding a document extractor

You can add a document extractor skill into a workflow to extract a specific set of fields from documents.

  1. From the workflow editor, under Document processors > Document extractors, select the extractor you want to add into the workflow.

    If there is no document extractor in your project, you can click Create a document extractor, which creates the document extractor skill and takes you to the editor to configure it. For more information, see Creating document extractors.

  2. Select Define data mapping and map the variables to the expected input and output.

    • Input: Enter a variable for the input document, such as Document. Select a default language (for example, en for English). The available review_fields are the ones defined in your document extractor. If you need to modify these fields, modify them in the document extractor and re-add it to your workflow.
    • Output: Enter a variable for the output structure (document_extractor).
  3. Select Set Extraction rules and select the fields as well as the confidence thresholds for these fields that determine when to mark a document as needing review.

  4. (Optional) Depending on your scenario, in the Variables pane, mark the document variable (document) as input and the document_extractor variable as output.

Adding a Branch that enables the HITL activity

You can add a document review activity to trigger manual (human-in-the-loop) review. To add it as a conditional branch that is triggered only when the criteria you set are met, follow this procedure. The example here uses utility bill extraction, but the procedure works the same way for a document extractor.

  1. In the workflow editor, add a Branch after the Utility bill extraction activity.

  2. Edit the condition of the branch so that utility_bill/Review needed is equal to True.

  3. (Optional) Give descriptive names to the branch and the paths. For example:

    • For the branch, you can use "needs review?".
    • For the paths, you can use "Yes" and "No".
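The branch condition from step 2 amounts to a simple two-way routing decision. Here is a minimal sketch of that logic, assuming the extraction output can be represented as a dictionary. The review_needed key is a hypothetical stand-in for the Review needed value of the utility_bill variable, and the path names follow the example names above.

```python
def choose_path(extraction_output: dict) -> str:
    """Return "Yes" when the output is flagged for review, otherwise "No".

    Assumption: the flag is stored under a hypothetical "review_needed"
    key, and a missing flag means that no review is required.
    """
    return "Yes" if extraction_output.get("review_needed") is True else "No"


# Hypothetical outputs of the extraction activity.
flagged_bill = {"review_needed": True}
clean_bill = {"review_needed": False}

flagged_path = choose_path(flagged_bill)
clean_path = choose_path(clean_bill)
```

Only documents that take the "Yes" path reach the manual review activity. All other documents continue on the "No" path without human intervention.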

Adding Document Review activity for the Yes path

  1. In the workflow editor, select Document processors, and in the Yes path, add a Document review activity.

  2. Set the Variable to the output variable of the Utility bill extraction activity. For example, "utility_bill".

You can now try this workflow by using the Playback button.

Updating review fields with image data

Open the Review document window on the extracted information that you want to review. Click a label in the review field (like “Account number”). As you drag the cursor, the label always displays next to the cursor. When you move the cursor to the viewer, it becomes a crosshair image.

You can drag the crosshair to a value on any page and select a value or text. The selection is automatically populated into the review field. You can also single-click to select a word or double-click to select a phrase.

The value or text that you select must match the format of the review field, or be convertible to it. Where possible, the selected text is transformed to the expected field format. For example, if you select "2024-01-30" for a field that expects a date, the text is transformed to a standardized date format. If the selection cannot be matched to the field format, it is not populated into the review field. For example, if the review field requires a numeric value and you select text instead, the field is not updated.
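The matching and transformation behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the product's actual logic. The field type names, the accepted input date formats, and ISO 8601 as the standardized date format are all hypothetical.

```python
import re
from datetime import datetime
from typing import Optional

# Assumption: input date formats that the sketch tries, in order.
DATE_INPUT_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def coerce_selection(selected: str, field_type: str) -> Optional[str]:
    """Return the value to populate, or None when the field stays unchanged."""
    text = selected.strip()

    if field_type == "number":
        # Drop thousands separators, then require a plain decimal number.
        cleaned = text.replace(",", "")
        return cleaned if re.fullmatch(r"-?\d+(\.\d+)?", cleaned) else None

    if field_type == "date":
        for fmt in DATE_INPUT_FORMATS:
            try:
                # Assumption: ISO 8601 is the standardized output format.
                return datetime.strptime(text, fmt).date().isoformat()
            except ValueError:
                continue
        return None

    # Any other field type accepts the selected text as it is.
    return text


date_value = coerce_selection("01/30/2024", "date")
rejected_value = coerce_selection("Account", "number")
```

A None result corresponds to the case where the selection does not match the field format, so the review field is left as it was.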