CreateDocuments

Arranges the contents of a batch into documents based on the Document Integrity rules.

Syntax

bool CreateDocuments ()

Parameters

None.

Returns

True if successful. Otherwise, False.

This action will not create a document structure if one already exists. To re-create the document structure, first remove the existing structure, then call this action.

Level

Batch level.

Details

This action assembles all of the pages within the batch into a hierarchy of Batch->Document->Page. Pages are grouped according to the document integrity rules defined in the Setup DCO. The special document-level and page-level variables "min", "max" and "order", along with the order of the pages within the batch, determine how many pages, and which pages, belong under a single document node. The rules for arranging pages can be complex, so these variables must be set properly in the Setup DCO. The result is that the pages are arranged in the runtime DCO in the manner specified.
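The role of "min" and "max" can be pictured with a small check of one document's pages against per-type limits. This is only an illustrative sketch, not Datacap's implementation: the page type names, the RULES table, the use of None for "no upper limit", and the integrity_problems function are all assumptions made for the example, and "order" is ignored.

```python
from collections import Counter

# Hypothetical rules: page type -> (min, max). None as max means "no upper limit".
RULES = {"Main_Page": (1, 1), "Trailing_Page": (0, None)}


def integrity_problems(page_types, rules=RULES):
    """Return a list of messages describing min/max violations for one document."""
    counts = Counter(page_types)
    problems = []
    for page_type, (low, high) in rules.items():
        n = counts[page_type]
        if n < low:
            problems.append(f"{page_type}: {n} found, at least {low} required")
        if high is not None and n > high:
            problems.append(f"{page_type}: {n} found, at most {high} allowed")
    # Page types with no rule are treated as not allowed in this toy model.
    for page_type in counts:
        if page_type not in rules:
            problems.append(f"{page_type}: not allowed in this document")
    return problems


print(integrity_problems(["Main_Page", "Trailing_Page"]))
print(integrity_problems(["Trailing_Page"]))
```

An empty list means the document satisfies the assumed limits. The real rules are configured in the Setup DCO, not in code.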

Initially, pages are ingested into a batch with a page type of "Other". At this point there is no document structure, and each page is attached directly to the batch node. After ingestion, images may be manipulated, for example by splitting single-page TIFFs out of a PDF or a multi-page TIFF. The parent files may then be marked as deleted so they are no longer processed, although they can remain in the batch for export at the end of the process. Image enhancements may then be applied to clean up the images and make them consistent for the page identification step.

Next, page identification occurs. Rules run on each page of type "Other" and determine the actual type of the page. The type can be determined by matching an image fingerprint, by reading a barcode on the page itself or on the preceding page, or by other automated tests. Pages can also be identified manually with the Flex ID task. Once the correct type for a page is determined, the page type is changed from "Other" to that type. This is important because the next step is to create the document structure. The CreateDocuments action looks at these assigned page types and generates the desired object hierarchy based on the page order, the page types, and the special "min", "max" and "order" variables.

A high-level example of how this action works can be illustrated with the foundation APT application. In this application, a single document contains one "Main_Page" followed by any number of "Trailing_Page" pages, including none. Separators can optionally be used to mark the start and end of documents. The order of the pages within the batch is also very important, as it determines which pages are associated with the same document. As CreateDocuments runs, it looks at each page type and the controlling variables, and organizes each page under a document. The first document may have 1 Main_Page and 2 Trailing_Page pages under it, making a 3-page document. If the next page is a Main_Page then, since only 1 is allowed per document, CreateDocuments creates a new document node and places that page in it, along with any trailing pages that follow. This continues until all of the pages have been arranged.
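The grouping described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the algorithm Datacap uses: it considers only a per-type maximum, ignores "min", "order" and separators, and the names MAX_PER_DOCUMENT, Document and group_pages are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical per-type limits, loosely modeled on the "max" variable.
# None means "no upper limit".
MAX_PER_DOCUMENT = {"Main_Page": 1, "Trailing_Page": None}


@dataclass
class Document:
    pages: list = field(default_factory=list)

    def count(self, page_type):
        return sum(1 for p in self.pages if p == page_type)


def group_pages(page_types):
    """Group an ordered list of page types into documents."""
    documents = []
    current = None
    for page_type in page_types:
        limit = MAX_PER_DOCUMENT.get(page_type)
        # The current document is "full" for this type once its limit is reached.
        full = (
            current is not None
            and limit is not None
            and current.count(page_type) >= limit
        )
        if current is None or full:
            current = Document()
            documents.append(current)
        current.pages.append(page_type)
    return documents


batch = ["Main_Page", "Trailing_Page", "Trailing_Page",
         "Main_Page", "Main_Page", "Trailing_Page"]
for number, doc in enumerate(group_pages(batch), start=1):
    print(number, doc.pages)
```

With this input the sketch yields three documents: a 3-page document, a single Main_Page, and a Main_Page with one Trailing_Page. Each new Main_Page starts a new document because the assumed limit of 1 has already been reached in the current one.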

When developing an application, the runtime DCO is written to the batch directory for use by the next step in the workflow. The application developer can review this XML to confirm that the application is configured correctly and that the pages are arranged correctly under documents. Although the developer can review the XML file manually, it is intended for direct use only by the systems within Datacap. Direct manipulation of the file by custom actions or other applications is not supported. Custom actions should use the supported DCO objects and Datacap APIs to manipulate the DCO and its nodes.

Batches containing existing document structures will cause this action to return False, with no effect on the existing document structure. If it is necessary to re-create the document structure, the existing structure must be removed before calling this action. This can be done with the RemoveDocumentStructure action.
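The return-value behavior can be pictured with a toy model. This sketch does not use the Datacap DCO API: the Batch class and its methods are hypothetical stand-ins, and the grouping step is a placeholder that puts each page in its own document.

```python
class Batch:
    """Toy stand-in for a batch node; not the Datacap DCO API."""

    def __init__(self, pages):
        self.pages = list(pages)  # pages attached directly to the batch
        self.documents = []       # document nodes, each a list of pages

    def create_documents(self):
        # Mirrors the documented behavior: refuse if a structure already exists.
        if self.documents:
            return False
        # Placeholder grouping: one document per page, for illustration only.
        self.documents = [[page] for page in self.pages]
        self.pages = []
        return True

    def remove_document_structure(self):
        # Flatten the pages back under the batch, preserving their order.
        self.pages = [page for doc in self.documents for page in doc]
        self.documents = []
        return True


batch = Batch(["p1", "p2", "p3"])
print(batch.create_documents())           # first call builds the structure
print(batch.create_documents())           # second call is refused
print(batch.remove_document_structure())  # remove the existing structure
print(batch.create_documents())           # structure can now be re-created
```

In this model the second call leaves the existing documents untouched, matching the behavior described above.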

Note: During document creation, temporary IDs are assigned (with a different format than final Document IDs), and if the action fails, these temporary IDs may remain.

This action is applied at the Batch level, generally in its own ruleset, such as a CreateDocuments ruleset.

Example

CreateDocuments()