NormalizeCCO
The NormalizeCCO action sorts and filters the words and lines in a Fingerprint file (.cco) created by a recognition engine, for use by navigation and pattern match actions. This action is required only after full-page recognition by an OCR or ICR action that does not automatically normalize the CCO.
Member of namespace
Cco2cco
Syntax
bool NormalizeCCO()
Returns
Always True.
Level
Page.
Details
This action sorts the words and lines in a Fingerprint file (.cco) created by a recognition engine, for use by navigation and pattern match actions. The action is called by the full-page recognition actions for ICR/C, OCR/S, and OCR/A. The CCO must always be normalized before the Locate actions or the pat_RecogMatch_ID action are used to find recognized text on a page.
In this context, the fingerprint is the one calculated for a particular image in a batch, not a fingerprint stored in the Fingerprint database, which contains fingerprints for the page types and layout variations that are defined for a particular application.
There are two types of Fingerprint files. One type is based on the image geometry. The second type is based on recognized text. The AnalyzeImage action creates a geometric fingerprint that contains lines and words that are based only on the black pixels in the image. Full-page recognition actions, such as RecognizePageOCR_S, RecognizePageICR_C, and RecognizePageOCR_A, create a fingerprint that is based on the results of recognition; that is, both the geometry and the text of the recognized characters, words, and lines.
In Recognition-based fingerprints, the order of lines and words might appear to be arbitrary, especially if the page contains images, tables, stamps, or blocks of text with varying font sizes. This can cause unpredictable results from Locate actions that navigate geometrically. The word-matching and phrase-matching action pat_RecogMatch_ID also requires well-ordered text to work reliably.
The NormalizeCCO action reorders the words of text in a Recognition-based fingerprint into lines and words in standard reading order, from top to bottom and left to right.
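The idea of standard reading order can be illustrated with a short Python sketch. This is not the algorithm that NormalizeCCO uses internally, which the documentation does not describe; it is a minimal illustration under stated assumptions. The Word class, its bounding-box fields, and the reading_order function are hypothetical names introduced for this example, and the rule for grouping words into lines (vertical center inside the first word of the line) is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Word:
    # Hypothetical word record: text plus a pixel bounding box.
    text: str
    left: int
    top: int
    right: int
    bottom: int


def reading_order(words: List[Word]) -> List[List[Word]]:
    """Group words into lines, then order lines top to bottom
    and the words in each line left to right."""
    lines: List[List[Word]] = []
    # Visit words from top to bottom so each line starts with its topmost word.
    for word in sorted(words, key=lambda w: (w.top, w.left)):
        center = (word.top + word.bottom) / 2
        for line in lines:
            first = line[0]
            # Assumed rule: a word joins a line when its vertical center
            # falls within the vertical extent of the line's first word.
            if first.top <= center <= first.bottom:
                line.append(word)
                break
        else:
            lines.append([word])
    for line in lines:
        line.sort(key=lambda w: w.left)
    lines.sort(key=lambda line: min(w.top for w in line))
    return lines


if __name__ == "__main__":
    # Words in an arbitrary order, as a recognition engine might emit them.
    scrambled = [
        Word("line", 70, 41, 100, 51),
        Word("world", 60, 10, 100, 20),
        Word("Second", 10, 40, 60, 50),
        Word("Hello", 10, 12, 50, 22),
    ]
    for line in reading_order(scrambled):
        print(" ".join(w.text for w in line))
```

With the sample data, the two words whose boxes overlap vertically near the top of the page end up in the first line, ordered by their left edge, and the remaining two form the second line.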
If the AnalyzeImage action is called before full-page recognition, the recognized text is placed into the geometry that is created by AnalyzeImage. This hybrid Fingerprint file is not always suitable for cco2cco. To force creation of a pure recognition-based fingerprint, call SetFingerprintRecogPriority(True) before full-page recognition. This guarantees that any existing geometric fingerprint is ignored. The setting applies to OCR_S and ICR_C only.
The full-page recognition actions from the ICR_C, OCR_A, and OCR_S libraries call NormalizeCCO() automatically unless the action CCONormalization_OFF (from the Recog_Shared library) is called before recognition. The full-page recognition action from the OCR_SR library, however, requires NormalizeCCO() to be called manually after recognition.
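The rule in the preceding paragraph can be summarized as a small decision helper. The following Python sketch only restates that rule for illustration; it is not part of any product API, and the function name is_normalized_after_recognition and its parameters are hypothetical.

```python
# Libraries whose full-page recognition calls NormalizeCCO() automatically.
AUTO_NORMALIZING_LIBRARIES = {"ICR_C", "OCR_A", "OCR_S"}
# Libraries whose full-page recognition never normalizes the CCO.
MANUAL_LIBRARIES = {"OCR_SR"}


def is_normalized_after_recognition(library: str,
                                    normalization_off_called: bool = False) -> bool:
    """Return True if the CCO is already normalized after full-page
    recognition, so that no explicit NormalizeCCO() call is needed."""
    if library in MANUAL_LIBRARIES:
        return False
    if library in AUTO_NORMALIZING_LIBRARIES:
        # CCONormalization_OFF before recognition suppresses the automatic call.
        return not normalization_off_called
    raise ValueError(f"Unknown recognition library: {library}")


if __name__ == "__main__":
    for lib, off in [("OCR_S", False), ("OCR_S", True), ("OCR_SR", False)]:
        needs_call = not is_normalized_after_recognition(lib, off)
        print(lib, "CCONormalization_OFF" if off else "default",
              "-> call NormalizeCCO()" if needs_call else "-> already normalized")
```

When the helper returns False and a later step relies on the Locate actions or pat_RecogMatch_ID, an explicit NormalizeCCO() call belongs between recognition and that step.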
Example:
SetFingerprintRecogPriority(True)
RecognizePageOCR_S()
NormalizeCCO()
pat_RecogMatch_ID()