Copying information
The COPY action copies information from a source to a target.
The format is:
COPY source target
The source can be any of the following:
Type | Description |
---|---|
operand | A pattern operand ([1], [2], ...) |
substring operand | A substring of the pattern operand |
mixed operand | Leading numeric or character subset |
user variable | A user-defined variable |
field name | A key reference ({StreetName}, ...) |
literal | A string literal in quotes ("SAINT") |
constant | A numeric value |
The target can be:
Type | Description |
---|---|
field name | A dictionary field ({StreetName}, ...) |
user variable | A user-defined variable |
For example, a United States address pattern that matches 123 N MAPLE AVE is:
^ | D | + | T
The following pattern-action set copies each matched operand to a dictionary field:
^ | D | + | T
COPY [1] {HouseNumber}
COPY [2] {StreetPrefixDirectional}
COPY [3] {StreetName}
COPY [4] {StreetSuffixType}
EXIT
The following operations occur:
- The numeric operand [1] is copied to the house number field {HouseNumber}.
- The class D operand [2] (direction) is copied to the prefix direction field {StreetPrefixDirectional}.
- The unknown alphabetic operand [3] is copied to the street name field {StreetName}.
- The class T operand [4] (street type) is copied to the street suffix type field {StreetSuffixType}.
Copying substrings
A substring of an operand is copied by using the substring operand form.
The simplest form of the COPY action is copying an operand to a dictionary field value. For example:
COPY [2] {StreetName}
The substring operand only operates on standard operands or user variables. The form is:
COPY source(b:e) target
The b is the beginning column of the string and the e is the ending column. The following example copies the first character (1:1) of operand 2 to the street name field:
COPY [2](1:1) {StreetName}
The following example copies the second through fourth characters of the contents of the variable temp to the {StreetName} field:
COPY temp(2:4) {StreetName}
You can use a negative one (-1) to indicate the last character. A negative two (-2) indicates the next-to-last character, and so on. The following example copies the last three characters of operand 2 to the street name field:
COPY [2](-3:-1) {StreetName}
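The column arithmetic behind (b:e) can be illustrated outside the rule language. The following Python function is only a sketch of the 1-based, inclusive column convention described above. The name copy_substring is hypothetical, rejecting column 0 is an assumption (this section does not say how the rule engine treats it), and columns that fall outside the string are not modeled.

```python
def copy_substring(value: str, b: int, e: int) -> str:
    """Return columns b through e of value, 1-based and inclusive.

    Negative columns count from the end: -1 is the last character.
    """
    if b == 0 or e == 0:
        # Assumption: column 0 is rejected; the source text does not cover it.
        raise ValueError("columns are 1-based; 0 is not a valid column")

    def to_index(column: int) -> int:
        # Convert a 1-based (or negative) column to a 0-based Python index.
        return column - 1 if column > 0 else len(value) + column

    return value[to_index(b):to_index(e) + 1]


print(copy_substring("MAPLE", 1, 1))    # like COPY [2](1:1): first character
print(copy_substring("MAPLE", -3, -1))  # like COPY [2](-3:-1): last three characters
```

With MAPLE as the operand value, the two calls correspond to the (1:1) and (-3:-1) examples.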
Copying leading and trailing characters
When handling leading alpha, leading numeric, or mixed class tokens, you can isolate the substrings based on the character type.
The four possible mixed operand specifiers are:
Specifiers | Description |
---|---|
(n) | All leading numeric characters |
(-n) | All trailing numeric characters |
(c) | All leading alphabetic characters |
(-c) | All trailing alphabetic characters |
These specifiers can be used for standard operands or for user variables.
For example, with the address 123A MAPLE AVE, you want the numbers 123 to be recognized as the house number and the letter A to be recognized as a house number suffix. You can accomplish this with the following pattern:
> | ? | T
COPY [1](n) {HouseNumber}
COPY [1](-c) {HouseNumberSuffix}
COPY [2] {StreetName}
COPY_A [3] {StreetSuffixType}
EXIT
Note that the first operand, >, is the appropriate class (leading numeric). These leading and trailing specifiers are mostly used with the > or < operands. However, they can also be used to split the contents of user variables. In the following example, XYZ is copied to {StreetName}:
COPY "456XYZ" tmp2
COPY tmp2(-c) {StreetName}
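As a rough illustration of what the four specifiers select, the following Python sketch expresses each one as a regular expression. The names copy_mixed and _SPECIFIERS are hypothetical. Treating "numeric" as the digits 0-9 and "alphabetic" as the ASCII letters, and returning an empty string when nothing qualifies, are assumptions rather than documented behavior.

```python
import re

# One regular expression per mixed operand specifier.
_SPECIFIERS = {
    "n": re.compile(r"\A[0-9]+"),      # all leading numeric characters
    "-n": re.compile(r"[0-9]+\Z"),     # all trailing numeric characters
    "c": re.compile(r"\A[A-Za-z]+"),   # all leading alphabetic characters
    "-c": re.compile(r"[A-Za-z]+\Z"),  # all trailing alphabetic characters
}


def copy_mixed(value: str, specifier: str) -> str:
    """Return the part of value selected by a mixed operand specifier."""
    match = _SPECIFIERS[specifier].search(value)
    # Assumption: nothing is copied when no characters qualify.
    return match.group(0) if match else ""


print(copy_mixed("123A", "n"))     # house number part of 123A
print(copy_mixed("123A", "-c"))    # house number suffix part of 123A
print(copy_mixed("456XYZ", "-c"))  # trailing letters of the tmp2 example
```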
Copying user variables
The type of a target user variable is determined by the type of the source.
Copy samples | Description |
---|---|
COPY [1] temp | Operand 1 is copied to a variable named temp. |
COPY "SAINT" temp | The literal "SAINT" is copied to the variable temp. |
COPY temp1 temp2 | The contents of variable temp1 are copied to temp2. |
COPY temp1(1:3) temp2 | The first three characters of temp1 are copied to temp2. |
User variable names can consist of 1 to 32 characters, where the first character is alphabetic and the remaining characters are alphabetic, numeric, or an underscore (_) character.
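The naming rule can be restated as a single regular expression. The following Python check is a sketch, not part of the product. The name is_valid_variable_name is hypothetical, and reading "alphabetic" as the ASCII letters A-Z and a-z is an assumption.

```python
import re

# First character alphabetic, then up to 31 letters, digits, or underscores,
# for a total length of 1 to 32 characters.
_VARIABLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,31}")


def is_valid_variable_name(name: str) -> bool:
    """Return True if name satisfies the user variable naming rule."""
    return _VARIABLE_NAME.fullmatch(name) is not None


print(is_valid_variable_name("tmp2"))   # letter first, then letters and digits
print(is_valid_variable_name("2tmp"))   # starts with a digit
```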
Copying dictionary columns
Dictionary columns can be copied to other dictionary columns or to user variables.
The following example copies {HouseNumber} first to another dictionary column, {HC}, and then to the user variable temp:
COPY {HouseNumber} {HC}
COPY {HouseNumber} temp
Copying standardized abbreviations
The COPY_A action copies the standardized abbreviation for an operand to a target, whereas the COPY action copies the input value.
Standardized abbreviations are coded for entries in the classifications for the rule set. They are not available for default classes, such as a number, an alphabetic unknown, and so on.
You can use the COPY_A action to copy the abbreviation of an operand to either the dictionary column or a user variable. The following shows an example:
^ | ? | T
COPY [1] {HouseNumber}
COPY [2] {StreetName}
COPY_A [3] {StreetSuffixType}
The third action copies the abbreviation of operand three to the street type column. Similarly, the following example copies the standard abbreviation of operand three to the variable named temp:
COPY_A [3] temp
Abbreviations are limited to a maximum of 25 characters.
COPY_A copies the standardized abbreviation to the target rather than the original token. COPY_A can also include a substring range. In that case, the substring is taken from the standardized abbreviation, not from the original token as it is with COPY.
Copying with spaces
You can preserve spaces between words by using the COPY_S action.
When you use COPY to copy an alphabetic operand (?) or a range of tokens (**) to a dictionary column or to a user variable, the individual words are concatenated together.
COPY_S requires an operand as the source and either a dictionary column or a user variable as a target. For example, with the following input string:
123 OLD CHERRY HILL RD
A standard COPY produces OLDCHERRYHILL. To preserve the spaces, use COPY_S as shown in the following pattern:
^ | ? | T
COPY [1] {HouseNumber}
COPY_S [2] {StreetName}
COPY_A [3] {StreetSuffixType}
The {StreetName} column contains: OLD CHERRY HILL.
If you use the universal matching operand, all tokens in the specified range are copied. For example, consider moving parenthetical comments to a column named {AdditionalInformation}. If you have the following input address:
123 MAIN ST (CORNER OF 5TH ST) APARTMENT 6
You can use the following pattern to copy CORNER OF 5TH ST to the column {AdditionalInformation}. The second action copies the same information to the user variable temp:
\( | ** | \)
COPY_S [2] {AdditionalInformation}
COPY_S [2] temp
COPY removes spaces only when it copies the contents of an operand, which is why the source for a COPY_S action must be an operand. For all other sources (literals, formatted columns, and user variables), COPY preserves spaces.
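The difference between the two actions comes down to how the words matched by one operand are joined. The following Python sketch models an operand as a list of words. The function names copy_operand and copy_s_operand are hypothetical, and the list representation is an assumption made for illustration only.

```python
from typing import List


def copy_operand(words: List[str]) -> str:
    """Model COPY on an operand: the words are concatenated without spaces."""
    return "".join(words)


def copy_s_operand(words: List[str]) -> str:
    """Model COPY_S on an operand: single spaces between words are kept."""
    return " ".join(words)


street_name_words = ["OLD", "CHERRY", "HILL"]
print(copy_operand(street_name_words))    # concatenated form
print(copy_s_operand(street_name_words))  # spaces preserved
```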
Copying the closest token
The COPY_C action copies corrected input values as matched to entries in the classifications (.CLS). When a token has an uncertainty threshold and is in the classifications, you can copy a corrected version of the input rather than the abbreviation value.
When matching under uncertainty to entries in the classifications, you might want to use the COPY_C action so that the complete token is spelled correctly rather than copying an abbreviation.
For example, suppose that you have a state name table with an entry such as:
MASSACHUSETTS MA S 800.0
If Massachusetts is misspelled on an input record (for example, Masssachusettts), you want to copy the correct spelling to the dictionary column. The following action places the full, correctly spelled token MASSACHUSETTS in the proper column:
COPY_C [operand-number] {column name}
For COPY_C, the source can only be an operand because COPY_C uses the closest token from the classifications.
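To make the contrast between the corrected token and the abbreviation concrete, the following Python sketch looks up the closest classification entry for a misspelled token. It does not reproduce the product's uncertainty matching: it substitutes the similarity ratio from Python's standard difflib module, and the cutoff of 0.8 is an arbitrary choice that is unrelated to the 800.0 threshold in the table entry. The names STATE_CLASSIFICATIONS and closest_token are hypothetical, as is the second table entry.

```python
from difflib import get_close_matches
from typing import Optional

# Illustrative classification data: full token -> standardized abbreviation.
# MASSACHUSETTS/MA comes from the example entry; MISSISSIPPI/MS is hypothetical.
STATE_CLASSIFICATIONS = {"MASSACHUSETTS": "MA", "MISSISSIPPI": "MS"}


def closest_token(token: str, cutoff: float = 0.8) -> Optional[str]:
    """Return the classification entry closest to token, or None.

    Stand-in technique: difflib similarity, not the product's algorithm.
    """
    matches = get_close_matches(
        token.upper(), list(STATE_CLASSIFICATIONS), n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


corrected = closest_token("Masssachusettts")
print(corrected)                         # what COPY_C would place in the column
print(STATE_CLASSIFICATIONS[corrected])  # what COPY_A would place instead
```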
Copying initials
The COPY_I action copies the initial character (the first character of the relevant tokens) from a source to a target rather than the entire value.
The target can be a dictionary column or a user variable. You can use COPY_I within a pattern-action set or as a POST action.
For the value MAPLE, the following pattern-action set puts M into {NameAcronym}:
?
COPY_I [1] {NameAcronym}
For a multi-token alphabetic string such as John Henry Smith, the output value depends on the target. If the target is a user variable, the output value from the COPY_I action is JHS.
If you use COPY_I as a POST action, the source must be a dictionary column and the target must be a dictionary column. You generally use COPY_I to facilitate array matching.
For example, the company name INTERNATIONAL BUSINESS MACHINES is distributed into dictionary columns C1 through C5 (such that C1 contains INTERNATIONAL, C2 contains BUSINESS, C3 contains MACHINES, and C4 and C5 are blank). The following set of POST actions puts the value IBM into the company initials dictionary column CI:
\POST_START
COPY_I {C1} {CI}
CONCAT_I {C2} {CI}
CONCAT_I {C3} {CI}
CONCAT_I {C4} {CI}
CONCAT_I {C5} {CI}
\POST_END
When used as a POST action, COPY_I takes only the first character from the source column. For example, if C1 in the previous sample contains INTERNATIONAL DIVISION, the result is still IBM.
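The POST action sequence above can be traced with a short Python sketch. It models only the behavior described in this section: the first character of each source column is taken, and a blank column contributes nothing. The function names first_character and company_initials are hypothetical, and stripping surrounding blanks before taking the first character is an assumption.

```python
from typing import List


def first_character(column_value: str) -> str:
    """Return the first character of a column value, or '' if it is blank."""
    # Assumption: surrounding blanks are ignored before the character is taken.
    return column_value.strip()[:1]


def company_initials(columns: List[str]) -> str:
    """Model COPY_I on the first column followed by CONCAT_I on the rest."""
    initials = ""
    for value in columns:
        initials += first_character(value)
    return initials


print(company_initials(["INTERNATIONAL", "BUSINESS", "MACHINES", "", ""]))
print(company_initials(["INTERNATIONAL DIVISION", "BUSINESS", "MACHINES", "", ""]))
```

Both calls produce the same initials, because only the first character of each column is used.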