Locate the field data
After you locate the label, you must locate the adjacent field data.
The field data is usually to the right of the label, but it might also be above or below the label. Additionally, you might need to group words together if the data you are searching for includes spaces.
The full page recognition engine organizes the recognition results in the CCO file as a coordinate-based grid of lines and words. Each word occupies its own position in the grid. In the following example, each row is a line and each column is a word position within that line.
| Line | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Car | Rental | #4 | | | | | | | |
| 2 | Pickup | Details | Return | Details | | | | | | |
| 3 | Date: | Mon, | Jan | 10, | 2011 | Date: | Fri, | Jan | 14, | 2011 |
| 4 | Time: | 11:00AM | Time: | 04:00PM | | | | | | |
| 5 | Location: | Orlando | (MCO) | Location: | Orlando | (MCO) | | | | |
This structure allows you to move around the recognition results by using the Locate library's navigation actions. The library also includes actions for grouping words together.
| Library | Action | Description |
|---|---|---|
| Locate | GoRightWord | Moves the specified number of words to the right of the previously found word or phrase. |
| Locate | GoDown(Up)Line | Moves down (up) the specified number of lines from the previously found word or phrase and selects the first word. |
| Locate | GroupWordsRIGHT(LEFT) | Groups words to the right (left) of the previously found word if they are no more than the specified number of character widths apart. |
| Locate | GroupWords | Groups words to the left and right of the previously found word if they are no more than the specified number of character widths apart. |
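The navigation actions can be pictured as moving a cursor over the grid. The following Python sketch is only an illustrative model of that idea: the `GRID` list and the helpers `find_word`, `go_right_word`, and `go_down_line` are hypothetical names chosen for this example and are not part of the Locate library. It also uses zero-based indexes, whereas the grid above is numbered from 1.

```python
# Illustrative model only: hypothetical helpers, not the Locate library API.
# Each inner list is one line of recognized words (zero-based indexes).
GRID = [
    ["Car", "Rental", "#4"],
    ["Pickup", "Details", "Return", "Details"],
    ["Date:", "Mon,", "Jan", "10,", "2011", "Date:", "Fri,", "Jan", "14,", "2011"],
    ["Time:", "11:00AM", "Time:", "04:00PM"],
    ["Location:", "Orlando", "(MCO)", "Location:", "Orlando", "(MCO)"],
]


def find_word(grid, text):
    """Return the (line, word) position of the first exact match, or None."""
    for line_index, line in enumerate(grid):
        for word_index, word in enumerate(line):
            if word == text:
                return (line_index, word_index)
    return None


def go_right_word(grid, pos, count=1):
    """Move `count` words to the right on the same line; None if out of range."""
    line_index, word_index = pos
    target = word_index + count
    if target >= len(grid[line_index]):
        return None
    return (line_index, target)


def go_down_line(grid, pos, count=1):
    """Move down `count` lines and select the first word; None if out of range."""
    line_index, _ = pos
    target = line_index + count
    if target >= len(grid) or not grid[target]:
        return None
    return (target, 0)


label = find_word(GRID, "Date:")    # (2, 0): the first "Date:" label
right = go_right_word(GRID, label)  # (2, 1)
below = go_down_line(GRID, label)   # (3, 0)
print(GRID[right[0]][right[1]])     # Mon,
print(GRID[below[0]][below[1]])     # Time:
```

Moving one word to the right of the label reaches only `Mon,`, which is why a grouping step is needed when the field data spans several words.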
The TravelDocs Recognize ruleset contains a rule that searches for the word Date and then moves one word to the right to obtain the data. The GroupWordsRIGHT action is required because, without it, the rule would return only the first word (Mon, in the Car Rental #4 example). The parameter 2 instructs the action to group words that are no more than two character widths apart.
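The effect of the grouping threshold can be sketched as follows. This is a minimal model under stated assumptions, not the product's implementation: the `Word` type, the pixel coordinates, and the `group_words_right` helper are hypothetical, and the character width is estimated here as the average width of the characters in the preceding word, which might differ from how the recognition engine measures it.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """A recognized word with hypothetical left and right pixel coordinates."""
    text: str
    left: int
    right: int


def char_width(word: Word) -> float:
    # Assumption: estimate one character width as the word's average.
    return (word.right - word.left) / len(word.text)


def group_words_right(line: List[Word], start: int, max_gap_chars: float) -> str:
    """Join words to the right of line[start] while each gap stays within
    max_gap_chars character widths of the preceding word."""
    group = [line[start]]
    for candidate in line[start + 1:]:
        previous = group[-1]
        gap = candidate.left - previous.right
        if gap > max_gap_chars * char_width(previous):
            break  # gap too wide: the candidate belongs to another field
        group.append(candidate)
    return " ".join(word.text for word in group)


# Line 3 of the Car Rental #4 example with made-up coordinates:
# 10 pixels per character, 10-pixel gaps, and a wide gap between the two dates.
LINE_3 = [
    Word("Date:", 0, 50), Word("Mon,", 60, 100), Word("Jan", 110, 140),
    Word("10,", 150, 180), Word("2011", 190, 230),
    Word("Date:", 400, 450), Word("Fri,", 460, 500), Word("Jan", 510, 540),
    Word("14,", 550, 580), Word("2011", 590, 630),
]

print(LINE_3[1].text)                       # Mon,
print(group_words_right(LINE_3, 1, 2))      # Mon, Jan 10, 2011
```

With a threshold of 2, the 10-pixel gaps inside the pickup date are accepted, while the 170-pixel gap before the second `Date:` label ends the group.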