Configuring document processors
Configure your document processors after you add them to your workflow. You can add a pre-configured utility bill extraction activity or a document extractor, if you have one in your project. You can then add a document review activity if you need manual review.
Adding a Utility bill extraction
To add a Utility bill extraction activity to a workflow:
- From the workflow editor, under Document processors, select Utility bill extraction to add that activity to the workflow.
- Select Define data mapping and map the variables to the expected input and output.
  - Input: Enter a variable for the input document (content). Select a default language (for example, en for English). Do not update the confidence or review_fields values here; set them in Set Extraction rules.
  - Output: Enter a variable for the output structure (utility_bill).
- Select Set Extraction rules and set the confidence thresholds that determine when to mark a document as needing review.
- Optional. Depending on the scenario: In the Variables pane, mark the document variable (content) as input.
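The effect of a confidence threshold can be pictured with a short sketch. It is illustrative only: the field names, the dictionary layout, and the default threshold of 0.8 are assumptions made for this example, and the product's actual evaluation logic is not described here.

```python
from typing import Dict


def needs_review(
    confidences: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.8,  # assumed fallback, not a product default
) -> bool:
    """Return True if any extracted field scores below its threshold."""
    for field, score in confidences.items():
        # Fields without an explicit rule fall back to the default threshold.
        if score < thresholds.get(field, default_threshold):
            return True
    return False


# Hypothetical extraction result: one confidence score per field.
extracted = {"account_number": 0.97, "amount_due": 0.62}
rules = {"account_number": 0.90, "amount_due": 0.85}

print(needs_review(extracted, rules))  # amount_due is below its threshold
```

In this sketch a single low-confidence field is enough to flag the whole document, which matches the idea of marking a document as needing review.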
Adding a document extractor
You can add a document extractor skill to a workflow to extract a specific set of fields from documents.
- From the workflow editor, under Document processors > Document extractors, select the extractor you want to add to the workflow.
  If there is no document extractor in your project, you can click Create a document extractor, which creates the document extractor skill and takes you to the editor to configure it. For more information, see Creating document extractors.
- Select Define data mapping and map the variables to the expected input and output.
  - Input: Enter a variable for the input document, such as Document. Select a default language (for example, en for English). The available review_fields are the ones defined in your document extractor. If you need to modify these fields, modify them in the document extractor and re-add it to your workflow.
  - Output: Enter a variable for the output structure (document_extractor).
- Select Set Extraction rules, then select the fields and the confidence thresholds for those fields that determine when to mark a document as needing review.
- Optional. Depending on the scenario: In the Variables pane, mark the document variable (document) as input, and the document_extractor variable as output.
Adding a Branch that enables the HITL activity
You can add a document review activity to trigger manual (human-in-the-loop) review. To add it as a conditional branch that is triggered when the criteria you set are met, follow this procedure. The example here uses utility bill extraction, but the procedure works the same way for a document extractor.
- In the workflow editor, add a Branch after the Utility bill extraction activity.
- Edit the condition of the branch so that utility_bill/Review needed is equal to True.
- Optional. Give the branch and its paths descriptive names. For example:
  - For the branch, you can use "needs review?".
  - For the paths, you can use "Yes" and "No".
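The branch condition above amounts to a simple two-way routing decision. The following sketch models it under the assumption that the extraction output can be treated as a dictionary with a "Review needed" key; that representation and the function name are hypothetical and are used only to illustrate the logic.

```python
def route(utility_bill: dict) -> str:
    """Pick the branch path from the extraction output's review flag."""
    # "Review needed" mirrors the utility_bill/Review needed condition;
    # a missing flag is treated as "no review needed" in this sketch.
    return "Yes" if utility_bill.get("Review needed") is True else "No"


print(route({"Review needed": True}))   # takes the Yes path (manual review)
print(route({"Review needed": False}))  # takes the No path
```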
Adding a Document review activity for the Yes path
- In the workflow editor, select Document processors, and in the Yes path, add a Document review activity.
- Set the Variable to the output variable of the Utility bill extraction activity. For example, "utility_bill".
You can now try this workflow by using the Playback button.
Updating review fields with image data
Open the Review document window for the extracted information that you want to review. Click the label of a review field (for example, “Account number”). As you move the cursor, the label stays displayed next to it. When you move the cursor over the viewer, it changes to a crosshair.
You can drag the crosshair to a value on any page and select a value or text. The selection is automatically populated into the review field. You can also single-click to select a word or double-click to select a phrase.
The format of the value or text that you select must be compatible with the format of the review field. Where possible, the selected text is transformed to the expected field format. For example, if you select "2024-01-30" for a field that expects a date, the text is transformed to a standardized date format. If the selection cannot be transformed, it is not populated into the review field. For example, if the review field requires a numeric value and you select text instead, the field is not updated.
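The matching behavior described above can be sketched as a small coercion function. This is a hedged illustration, not the product's implementation: the field type names, the accepted input date formats, and the choice of ISO 8601 as the standardized output are all assumptions made for the example.

```python
from datetime import datetime
from typing import Optional

# Assumed input formats; the formats the product accepts are not documented here.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def coerce(selection: str, field_type: str) -> Optional[str]:
    """Return the selection in the field's format, or None to leave the field unchanged."""
    text = selection.strip()
    if field_type == "number":
        cleaned = text.replace(",", "")  # drop thousands separators
        try:
            float(cleaned)
        except ValueError:
            return None  # not numeric: the field is not updated
        return cleaned
    if field_type == "date":
        for fmt in DATE_FORMATS:
            try:
                # Normalize to ISO 8601 (an assumed "standardized" format).
                return datetime.strptime(text, fmt).date().isoformat()
            except ValueError:
                continue
        return None  # no known date format matched
    return text  # free-text fields accept the selection as-is


print(coerce("01/30/2024", "date"))
print(coerce("Account", "number"))
```

Returning None for an incompatible selection corresponds to the review field being left unchanged.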