HandwritingRecognition actions
This action library provides actions that recognize hand printed or cursive text in zones and recognize hand printed United States addresses by using a postal database.
Object Datacap.Libraries.ParascriptFx.Recognition
Default instance name: Datacap.Libraries.ParascriptFx.Recognition
Handwriting recognition
Recognition must be performed in a field, and the field properties must be configured for the type of data being recognized. Capabilities and limitations differ by the type of recognition being performed. For example, cursive text is not supported in languages that do not support hand printed text.
For a text document that is recognized, use a lossless compression such as FAX or LZW. Do not use a lossy compression such as JPEG. JPEG is intended for photographic images. The use of a lossy compression in any step of the process for a textual document causes sharp character edges to become fuzzy, reducing recognition accuracy.
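As a pre-flight check for the compression guidance above, the following minimal Python sketch reads the Compression tag (259) from the first image directory of a TIFF file and reports whether the scheme is lossless. It is a stand-alone illustration and not part of Datacap. The names tiff_compression, is_lossless_tiff, and minimal_tiff are hypothetical, and the tag values are those defined by the TIFF 6.0 specification and its common extensions.

```python
import struct

# Compression tag (259) values that preserve every pixel.
LOSSLESS = {
    1: "uncompressed",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    8: "Deflate",
    32773: "PackBits",
}


def tiff_compression(data: bytes) -> int:
    """Return the Compression tag value of the first image in a TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (entry_count,) = struct.unpack(endian + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(entry_count):
        start = ifd_offset + 2 + 12 * i  # each directory entry is 12 bytes
        (tag,) = struct.unpack(endian + "H", data[start:start + 2])
        if tag == 259:
            # A single SHORT is stored in the first 2 bytes of the value field.
            (value,) = struct.unpack(endian + "H", data[start + 8:start + 10])
            return value
    return 1  # the specification's default when the tag is absent


def is_lossless_tiff(data: bytes) -> bool:
    """True if the first image uses a known lossless compression scheme."""
    return tiff_compression(data) in LOSSLESS


def minimal_tiff(compression: int) -> bytes:
    """Build a tiny little-endian TIFF header with one Compression entry (demo only)."""
    header = b"II" + struct.pack("<HI", 42, 8)
    entry = struct.pack("<HHIHH", 259, 3, 1, compression, 0)
    return header + struct.pack("<H", 1) + entry + struct.pack("<I", 0)


print(is_lossless_tiff(minimal_tiff(5)))  # LZW
print(is_lossless_tiff(minimal_tiff(7)))  # JPEG
```

In practice the bytes would come from the scanned image file. Any compression value that is not in the lossless table, including the JPEG values 6 and 7, is reported as not lossless.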
Review the guide "Best Practices for optimal text recognition in IBM Datacap" for guidance on ingesting and processing images to obtain optimal recognition results.
The Recognition actions cannot be used in the same application as the US check recognition. Check recognition for non-US countries can be used in the same application as the Recognition actions.
Using Handwriting actions
Handwriting recognition is called on a page object. However, recognition occurs only within fields that belong to the page object. To recognize text on a page, a field must be configured for each item on the page that is to be recognized. Several types of text are supported for recognition, such as plain text, names, addresses, and numeric values.
The specific types of text that can be recognized depend on the writing style, such as constrained hand print, unconstrained hand print, or cursive text, and on the language.
For each field to be recognized, the action SetFieldType must be used to configure the type of recognition for that field. Each field needs to be configured individually. Each field can have its own type of recognition with unique settings. The action help for SetFieldType lists additional configuration actions that can be used on each type of field.
The action SetWritingStyle is used to configure the text type for the field. SetLanguage is used to configure the recognized language.
A number of additional actions can be used to further configure each field, such as actions that configure text patterns or line removal. Some of the image enhancement settings, such as line removal, are on by default. If images have been pre-processed with separate image enhancement actions, which is typically done, then call the appropriate actions to disable any unneeded image manipulation.
Typically, a set of rules is set up to configure each field. These rules would be called on a field "open" event to configure the settings. Here is an example ruleset that might be used to recognize a first name.
- Rule Configure First Name
- Function Field Level
- SetNoiseRemoval("False")
- SetLanguage("1")
- SetDeskew("False")
- SetFieldType("2048")
- SetValidLength("3,10")
- SetWritingStyle("2")
- SetLineRemovalMode("2")
The rule that is called "Rule Configure First Name" might be attached to the open event for any field that uses these same settings. If other fields need different settings, then a field might be attached to a rule with different settings. Additional rules can be created in this ruleset, or in separate rulesets.
Once all of the fields have been configured with the proper settings, the Recognize action needs to be called at the page level, where it recognizes each field according to its settings. If you want to have Recognize in the same ruleset as the rules that configure the fields, then the rule that calls Recognize would need to be attached to the page "close" event so that it runs after the fields are configured. Alternatively, the Recognize action can be called on a page "open" event in a follow-on ruleset. The ruleset might look something like this:
- Rule Recognize
- Function Page Level
- Recognize()
In this ruleset, the rule "Rule Recognize" might be attached to the page "open" event and this ruleset can be run after the field configuration ruleset. Recognition would then run on each field when this rule is performed.
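The ordering constraint described above can be pictured with a small Python sketch. It is a simplified, hypothetical model and not the Datacap rule engine: it assumes, as the text implies, that a ruleset fires the page "open" event, then each field's "open" and "close" events, then the page "close" event. The names walk_events and run_ruleset and the binding dictionaries are invented for illustration.

```python
def walk_events(page_name, field_names):
    """Yield (object, event) pairs in the assumed order:
    page open, each field open and close, page close."""
    yield (page_name, "open")
    for name in field_names:
        yield (name, "open")
        yield (name, "close")
    yield (page_name, "close")


def run_ruleset(page_name, field_names, bindings):
    """Return the names of the bound rules in the order they would run."""
    calls = []
    for key in walk_events(page_name, field_names):
        rule = bindings.get(key)
        if rule is not None:
            calls.append(rule)
    return calls


# Single ruleset: configure on field open, recognize on page close.
same_ruleset = {
    ("FirstName", "open"): "Configure First Name",
    ("Page", "close"): "Recognize",
}
print(run_ruleset("Page", ["FirstName"], same_ruleset))
# ['Configure First Name', 'Recognize']

# Binding Recognize to page open in the SAME ruleset would run it too early.
too_early = {
    ("FirstName", "open"): "Configure First Name",
    ("Page", "open"): "Recognize",
}
print(run_ruleset("Page", ["FirstName"], too_early))
# ['Recognize', 'Configure First Name']
```

Under this model, binding Recognize to the page "open" event is safe only when it lives in a separate ruleset that runs after the configuration ruleset has finished.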
When working with fields, be sure that the correct page type is set, that the correct fingerprint template is associated with the page, and that the zones on the page have been loaded. If the zones do not have loaded positions, recognition cannot be performed. If there is a field on the page that should not be recognized when Recognize is called, the field is skipped if its DCO variable "p_sr" is set to "1". This might be done by calling rrSet("1","@X.p_sr") on the field level object to set the variable. Alternatively, it might be set by calling rrSet("1","@P\MyField.p_sr"), where "MyField" is the name of the field, on the page object before calling Recognize. See the help for each of the available field settings to determine how they should be set. It is highly recommended that recognition be tested with various settings to determine the setting combinations that work best for the actual documents that are processed by the application.
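The effect of the "p_sr" variable can be sketched in Python as a simple filter. This is an illustrative model only, not Datacap code: the function fields_to_recognize and the dictionary layout are hypothetical, and the only behavior taken from the text is that a field whose "p_sr" variable is "1" is skipped.

```python
def fields_to_recognize(fields):
    """Return the names of fields that are not flagged to be skipped.

    `fields` maps a field name to a dict of its variables (a hypothetical
    stand-in for DCO field variables). A field is skipped only when its
    "p_sr" variable equals the string "1".
    """
    return [
        name
        for name, variables in fields.items()
        if variables.get("p_sr") != "1"
    ]


page_fields = {
    "FirstName": {},             # no p_sr set, so it is recognized
    "Signature": {"p_sr": "1"},  # flagged, so it is skipped
    "LastName": {},
}
print(fields_to_recognize(page_fields))  # ['FirstName', 'LastName']
```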