You can specify a list of expected languages that the OCR_A
Recognize action will need to detect. After you set this list, the
recognition engine detects the language automatically while OCR runs
in the Recognize action.
Automatic language detection is enabled by setting the y_lg variable
on the page and specifying at least three languages in the variable.
If the variable is not set, then the language that is recognized is
based on the current locale.
Tip: Limit the list to the languages that the application is
expected to process. The more languages you specify, the slower the
processing.
Although the FineReader engine supports recognition of a vast number of languages, the automatic
language detection feature works only for languages that have full dictionary support. These
include most popular languages, but not Simplified Chinese or Traditional Chinese.
- Use the rrSet action or a similar action
to set the y_lg variable.
- Set the y_lg variable to a comma-separated
list of at least three languages from the following list:
Important: When you set the comma-separated list of languages,
be sure that each language name is spelled exactly as shown in the
following list. An invalid language name causes the action to abort.
- Languages supported in automatic language detection
Remember: This is a list of language names that set the scope of automatic language
detection in the OCR engine. This is not a list of languages supported by
IBM® Datacap. For information on language support, search the
IBM Support Portal for the language support techdoc applicable to your
version of Datacap.
- ArmenianEastern
- ArmenianGrabar
- ArmenianWestern
- AzeriLatin
- Bashkir
- Bulgarian
- Catalan
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- German
- GermanNewSpelling
- Greek
- Hungarian
- Indonesian
- Italian
- Japanese
- Korean
- KoreanHangul
- Latin
- Latvian
- Lithuanian
- Norwegian
- NorwegianBokmal
- NorwegianNynorsk
- OldEnglish
- OldFrench
- OldGerman
- OldItalian
- OldSpanish
- Polish
- PortugueseBrazilian
- PortugueseStandard
- Romanian
- RussianOldSpelling
- Russian
- RussianWithAccent
- Slovak
- Slovenian
- Spanish
- Swedish
- Tatar
- Turkish
- Ukrainian
- Call the Recognize action.
rrSet("English,French,Japanese", "@P.y_lg")
Recognize()
CreateCcoFromLayout()
This sequence creates a layout
XML file and then a CCO file for the current page. Automatic language
detection is enabled for English, French, and Japanese documents. The
CCO file that is produced is ready for use by navigation and pattern
matching actions.
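Because an invalid language name causes the action to abort, it can be useful to check a candidate list before it is used as the y_lg value. The following Python sketch is not part of Datacap; validate_y_lg, SUPPORTED_LANGUAGES, and MIN_LANGUAGES are hypothetical names introduced for illustration. It assumes that language names are case-sensitive, that the minimum of three languages stated in this topic applies, and that the value should contain no spaces around the commas, as in the example above.

```python
# Hypothetical pre-check for a y_lg value. This is NOT a Datacap API;
# it only mirrors the rules described in this topic.

# Language names copied from the list in this topic (assumed case-sensitive).
SUPPORTED_LANGUAGES = frozenset({
    "ArmenianEastern", "ArmenianGrabar", "ArmenianWestern", "AzeriLatin",
    "Bashkir", "Bulgarian", "Catalan", "Croatian", "Czech", "Danish",
    "Dutch", "English", "Estonian", "Finnish", "French", "German",
    "GermanNewSpelling", "Greek", "Hungarian", "Indonesian", "Italian",
    "Japanese", "Korean", "KoreanHangul", "Latin", "Latvian", "Lithuanian",
    "Norwegian", "NorwegianBokmal", "NorwegianNynorsk", "OldEnglish",
    "OldFrench", "OldGerman", "OldItalian", "OldSpanish", "Polish",
    "PortugueseBrazilian", "PortugueseStandard", "Romanian",
    "RussianOldSpelling", "Russian", "RussianWithAccent", "Slovak",
    "Slovenian", "Spanish", "Swedish", "Tatar", "Turkish", "Ukrainian",
})

# Minimum number of languages, as stated in this topic.
MIN_LANGUAGES = 3


def validate_y_lg(value: str) -> str:
    """Return a normalized comma-separated list, or raise ValueError."""
    # Split on commas and trim surrounding whitespace from each name.
    names = [name.strip() for name in value.split(",")]

    # Reject empty entries such as "English,,French".
    if any(not name for name in names):
        raise ValueError("The list contains an empty language name.")

    # Reject any name that does not match the supported list exactly.
    unknown = [name for name in names if name not in SUPPORTED_LANGUAGES]
    if unknown:
        raise ValueError("Unsupported language name(s): " + ", ".join(unknown))

    # Require the minimum number of distinct languages.
    if len(set(names)) < MIN_LANGUAGES:
        raise ValueError(
            f"Specify at least {MIN_LANGUAGES} distinct languages."
        )

    # Join without spaces, matching the form used in the rrSet example.
    return ",".join(names)
```

For instance, validate_y_lg("English, French ,Japanese") returns the normalized string "English,French,Japanese", while a misspelled or lowercase name such as "english" raises a ValueError before the value ever reaches the rrSet action.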