Reviewing the trained model

After the initial training of the model, you can review the results for accuracy and update the model as needed.

About this task

To create the model, you specified the field name location, the field value location and format, and other characteristics of the field values. Training the model incorporates these parameters and characteristics. When the training is complete, you can examine the results to see whether the model accurately captured the fields and values that you created for each document type. If the model has issues, you can update the model with accurate values.

Before Automation Document Processing can compute the accuracy of the training results, it must know the correct values for the extracted fields. These correct values are known as the ground truth values. When you first open the Review the trained model screens, all of the training information is in a Verify results section, and no accuracy metrics are available. At this point, the sample documents are in the Needs verification state. You must review each training sample of each document type and correct any missing or incorrect values to establish the ground truth for that sample. After you enter the correct values for all of the fields, click Submit to move the sample document to the Verified state.

After all sample documents are in the Verified state, Automation Document Processing can calculate and display accuracy statistics for each document type, and for the overall model.
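The relationship between ground truth and accuracy can be illustrated with a minimal sketch. It assumes a simple exact-match comparison of each extracted value against its ground truth value; the product's actual metric is not described here and might differ. The data layout and the function name `accuracy_by_document_type` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


def accuracy_by_document_type(samples):
    """Return ({doc_type: accuracy}, overall_accuracy) by exact-match comparison.

    Assumption: accuracy = matching fields / total fields. This is an
    illustrative metric, not the product's documented formula.
    Raises ValueError if any sample is unverified, mirroring the rule that
    statistics are available only after every sample has ground truth.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for sample in samples:
        if not sample["verified"]:
            raise ValueError(f"sample {sample['name']!r} still needs verification")
        for field, truth in sample["ground_truth"].items():
            total[sample["doc_type"]] += 1
            if sample["extracted"].get(field) == truth:
                correct[sample["doc_type"]] += 1
    per_type = {doc_type: correct[doc_type] / total[doc_type] for doc_type in total}
    all_fields = sum(total.values())
    overall = sum(correct.values()) / all_fields if all_fields else 0.0
    return per_type, overall


# Hypothetical verified samples: one Invoice field was extracted incorrectly.
samples = [
    {"name": "invoice-1", "doc_type": "Invoice", "verified": True,
     "extracted": {"total": "100.00", "date": "2021-01-05"},
     "ground_truth": {"total": "100.00", "date": "2021-01-06"}},
    {"name": "receipt-1", "doc_type": "Receipt", "verified": True,
     "extracted": {"total": "12.50"},
     "ground_truth": {"total": "12.50"}},
]
per_type, overall = accuracy_by_document_type(samples)
print(per_type, overall)
```

In this sketch, a sample that is still in the Needs verification state raises an error instead of contributing to the statistics, which matches the requirement that all samples be verified first.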

Procedure

To refine the trained model:

  1. From the main page in the Designer, on Extraction model, click Open.
  2. Click Review the trained model.
  3. On the Last training results tab, click a document type to see the results for samples of that type.
  4. For each sample that needs additional verification, in the Action column, click Start.
  5. In the verify view, check that the extracted values for the fields match the values that you find in the displayed sample document.
  6. Correct any field value that needs to be corrected.
    1. For table areas, click the edit icon. The dialog that opens shows the extracted columns and the corresponding values for each row. Double-click a cell and type the value directly, or select the cell and draw a box around the corresponding value in the document. You can add, remove, or edit rows as needed.
      To select a different table area, close the edit dialog, click the field name in the Field name column, and select the area that you want on the document. The edit dialog opens again with the new area that you selected, and you can edit the ground-truth values.
    2. For other field values, click the edit icon in the Field value column and add the correct value.
      You can draw a box directly on the document to capture the correct value, or type the value. For Boolean settings, select the value from the drop-down list.
  7. When you are done, click Done.
  8. After you verify, edit, and save any changes for every field, click Submit to finalize the sample.
  9. Repeat the verification for the remaining samples in the list.
    After you complete the verification for all the samples in a type, the document type is listed in the Results ready table. The model includes the updated accuracy and confidence for the document type.
  10. Use the Results ready information to further refine the model (the values are read-only).
    Documents that are listed in the Results ready table have already been verified.
    In the Results ready table, click a document type to see more information about the samples and fields for that type.
    • Click Sample results, then click View results to see the results for a particular sample document and how the sample compares to the model.
    • Click Field results, then click View results to see the value and confidence for the field from each of the samples in the document type. You can filter to show only samples with issues. You can also filter to group fields by confidence levels.

What to do next

After you make any adjustments to your document types and fields, or add or remove any sample documents, you must retrain the field extraction model. After a model is retrained, the ground truth values are reset. You must review the values for each sample again, taking care to add values for any new fields, and any new sample documents. If there are no new sample documents or fields, and you have already entered values for all fields before retraining, then you can click Verify all to mark all samples for a document type as verified again.
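The condition for using Verify all can be expressed as a small check. This is only a sketch of the rule stated above, not product code: the function name `can_verify_all` and both data structures are hypothetical.

```python
def can_verify_all(previous_values, current_fields):
    """Return True if every current sample and field already had a value.

    previous_values: {sample_name: {field_name: value}} entered before retraining.
    current_fields:  {sample_name: set of field names} after retraining.
    Both structures are assumptions made for this illustration.
    """
    for sample, fields in current_fields.items():
        entered = previous_values.get(sample)
        if entered is None:
            return False  # a new sample document needs a manual review
        for field in fields:
            if entered.get(field) in (None, ""):
                return False  # a new or previously empty field needs a value
    return True


before = {"invoice-1": {"total": "100.00", "date": "2021-01-06"}}
print(can_verify_all(before, {"invoice-1": {"total", "date"}}))
print(can_verify_all(before, {"invoice-1": {"total", "date", "vendor"}}))
```

The first call describes an unchanged document type, so the check passes. The second adds a hypothetical `vendor` field, so each sample must be reviewed again.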

If you need to produce statistics or analyze the data, you can download the training results in CSV format. Select the document types that you want and click Download CSV. The data for this file comes from the ontology (document type and field type definitions and associated enrichments).
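A downloaded file can be summarized with a short script. The following sketch uses only the Python standard library. The column names `document_type` and `confidence` are assumptions, because the layout of the downloaded file is not described here; inspect the header row of your own file and adjust the names. The inline sample data stands in for a real download.

```python
import csv
import io
from collections import defaultdict


def mean_by_group(csv_file, group_col="document_type", value_col="confidence"):
    """Average a numeric column per group.

    The default column names are hypothetical; pass the names that appear
    in the header row of the file that you downloaded.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        sums[row[group_col]] += float(row[value_col])
        counts[row[group_col]] += 1
    return {group: sums[group] / counts[group] for group in sums}


# Stand-in for a downloaded file; replace with open("results.csv", newline="").
sample_csv = io.StringIO(
    "document_type,field,confidence\n"
    "Invoice,total,0.9\n"
    "Invoice,date,0.7\n"
    "Receipt,total,0.8\n"
)
averages = mean_by_group(sample_csv)
for doc_type, mean in sorted(averages.items()):
    print(f"{doc_type}: {mean:.2f}")
```

To process a real download, open the file with `newline=""` as the `csv` module documentation recommends, and pass the file object in place of `sample_csv`.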