Using regular expressions
- Account
  - Matches the characters Account. By default, searches are case-sensitive.
- [A-Z]
  - Matches one uppercase letter.
- [A-Z]{3}
  - Matches three consecutive uppercase letters.
- [0-9]{5}
  - Matches five consecutive digits.
- [0-9]+
  - Matches one or more digits.
- [^a-z]
  - Matches any one character except lowercase a through z.
- \s
  - Matches one whitespace character, such as a space or a tab.
- \S
  - Matches one character that is not whitespace.
\d = [0-9]
\p(L) = [A-z]
\r = 0x0D
For additional information on regular expression syntax, see the regcomp() (compile regular expression) topic at https://www.ibm.com/docs/en/zos/2.5.0?topic=functions-regcomp-compile-regular-expression.
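The patterns in the list above can be tried outside ACIF. The following is a minimal sketch that uses the Python re module, which is not the regcomp() implementation that ACIF uses but is assumed to behave the same way for these simple patterns. The sample strings and the first_match helper are invented for illustration.

```python
import re

# Each entry: (pattern from the list above, sample text, expected first match).
# None means that no match is expected. The sample texts are hypothetical.
SAMPLES = [
    ("Account", "Account 1234", "Account"),
    ("Account", "account 1234", None),   # searches are case-sensitive
    ("[A-Z]", "abcDef", "D"),
    ("[A-Z]{3}", "abXYZ12", "XYZ"),
    ("[0-9]{5}", "ZIP 80301", "80301"),
    ("[0-9]+", "Page 42 of 100", "42"),
    ("[^a-z]", "abc-def", "-"),
    (r"\s", "a\tb", "\t"),
    (r"\S", "  x", "x"),
]


def first_match(pattern, text):
    """Return the first substring of text that matches pattern, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


for pattern, text, expected in SAMPLES:
    assert first_match(pattern, text) == expected
```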
ACIF can use a regular expression in the TRIGGER and FIELD parameters. In the TRIGGER parameter, the regular expression specifies the pattern to search for. In the FIELD parameter, the regular expression is applied to the characters that are extracted from the field in a way that is similar to using a mask. The regular expression must be specified in the code page indicated by the CPGID parameter.
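Because the regular expression must be in the code page that CPGID indicates, the same pattern has different byte values under different code pages. The following sketch shows this with Python. It assumes that the Python codecs cp037 and cp850 correspond to CPGID=037 and CPGID=850; the regex_as_hex helper is hypothetical and is not part of ACIF.

```python
REGEX = "[0-9]{3}"


def regex_as_hex(regex, codec):
    """Encode the pattern in the given code page and return uppercase hex."""
    return regex.encode(codec).hex().upper()


# Assumption: Python's cp850 and cp037 codecs stand in for CPGID 850 and 037.
hex_850 = regex_as_hex(REGEX, "cp850")
hex_037 = regex_as_hex(REGEX, "cp037")

print("code page 850:", hex_850)
print("code page 037:", hex_037)
```

The code page 850 value is the hexadecimal string that appears in the CPGID=850 example later in this topic. The code page 037 value differs because it is an EBCDIC encoding.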
CPGID=037
TRIGGER1=*,*,'PAGE',(TYPE=GROUP)
TRIGGER2=*,25,REGEX='[A-Z]{3}-[A-Z]{6}',(TYPE=FLOAT)
FIELD1=0,9,2,(TRIGGER=1,BASE=TRIGGER)
FIELD2=0,38,10,(TRIGGER=2,BASE=0,REGEX='[A-Z] [0-9]{3}-\S+')
INDEX1='Page',FIELD1,(TYPE=GROUP,BREAK=YES)
INDEX2='Source-ID',FIELD2
In the example, TRIGGER2 uses a regular expression that specifies a pattern of three uppercase letters, a hyphen, and six uppercase letters. The text "SUB-SOURCE" matches the pattern. FIELD2 uses a regular expression that specifies one uppercase letter, a space, three digits, a hyphen, and one or more non-whitespace characters. The character strings Q 010-1, I 000-RS, and L 133-1B match the regular expression pattern.
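The matches that are described for TRIGGER2 and FIELD2 can be checked with a short Python sketch. The Python re module is used here as a stand-in for the ACIF regular expression engine, and the matches_fully helper and the lowercase counterexample are added for illustration only.

```python
import re

TRIGGER2_REGEX = r"[A-Z]{3}-[A-Z]{6}"
FIELD2_REGEX = r"[A-Z] [0-9]{3}-\S+"


def matches_fully(regex, text):
    """Return True if the whole text matches the regular expression."""
    return re.fullmatch(regex, text) is not None


# Text and field values taken from the example above.
assert matches_fully(TRIGGER2_REGEX, "SUB-SOURCE")
for value in ("Q 010-1", "I 000-RS", "L 133-1B"):
    assert matches_fully(FIELD2_REGEX, value)

# Hypothetical counterexample: a lowercase first letter does not match.
assert not matches_fully(FIELD2_REGEX, "q 010-1")
```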
CPGID=850
TRIGGER1 = *,1,REGEX = X'5B302D395D7B337D' /* [0-9]{3} */
Using a regular expression on the TRIGGER parameter
On the TRIGGER parameter, use the regular expression instead of a text string. A regular expression can be used on both a group trigger and a floating trigger. The maximum length of the regular expression is 250 bytes.
If an asterisk is specified for the column, ACIF searches the entire record for the string that matches the regular expression. If a column is specified, ACIF searches the text starting in that column for the string that matches the regular expression. The regular expression must match text that begins in that column. If a column range is specified, ACIF searches only the text within the column range for the string that matches the regular expression. The regular expression must match text that begins in one of the columns specified by the column range.
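The three column rules can be modeled in a few lines of Python. This is a sketch of the behavior as described, not the ACIF implementation. The trigger_matches function is hypothetical, columns are assumed to be 1-based, and for a column range the match is assumed to be confined to the text within the range.

```python
import re


def trigger_matches(record, regex, column="*"):
    """Model the TRIGGER column rules.

    column is "*" (search the entire record), an integer column
    (the match must begin in that column), or a (first, last) tuple
    (the match must begin in one of those columns).
    """
    pattern = re.compile(regex)
    if column == "*":
        return pattern.search(record) is not None
    if isinstance(column, int):
        # pattern.match anchors the match at the given 0-based position.
        return pattern.match(record, column - 1) is not None
    first, last = column
    # Assumption: only the text within the column range is examined.
    return any(
        pattern.match(record, start - 1, last) is not None
        for start in range(first, last + 1)
    )


print(trigger_matches("PAGE 1 OF 3", "P[A-Z]{3} ", 1))
print(trigger_matches("  PAGE 1", "P[A-Z]{3} ", 1))
print(trigger_matches("  PAGE 1", "P[A-Z]{3} ", "*"))
print(trigger_matches("  PAGE 1", "P[A-Z]{3} ", (1, 10)))
```

In this sketch, the second call returns False because the text PAGE begins in column 3, not in column 1. The asterisk and the column range both find it.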
The maximum record length to which the regular expression can be applied is 2 KB (2048 bytes). If longer records are in the file, use a trigger column range to specify a subset of the record. When the regular expression matches the text in a record, ACIF looks for the next trigger, or, if all the group triggers are found, ACIF collects the fields.
Using a regular expression on the FIELD parameter
On the FIELD parameter, use the regular expression instead of a mask. A mask and a regular expression cannot both be specified on the same FIELD parameter. The maximum length of the regular expression is 250 bytes.
The regular expression can be specified on a field based on a group trigger, a field based on a floating trigger, or a transaction field. Masks can be specified only on fields based on floating triggers and transaction fields. The maximum length of a field that can be specified in the FIELD parameter is 250 bytes.
- For a field based on a group trigger, if the regular expression does not match any text in the field, the default value that is specified on the FIELD parameter is used. If no default value is specified, ACIF ends with error message APK488S.
- If the record is only long enough to contain part of the field, the regular expression is applied only to the portion of the record that is present.
Using default values when regular expressions do not match
- GROUP field
  - If a regular expression does not match any text in the GROUP field, the default value that is specified on the FIELD parameter is used. If no default value is specified, ACIF ends processing with error message APK488S.
  - If the record is only long enough to contain part of the field, the regular expression is applied only to the portion of the record that is present.
  - If the record is not long enough to contain even the first byte of the field, the default value that is specified on the FIELD parameter is used. If no default value is specified, ACIF ends processing with error message APK449S.
- FLOAT field
  - If a regular expression does not match any text in the FLOAT field, no error occurs, and the default value that is specified on the FIELD parameter is not used.
  - If the record is only long enough to contain part of the field, the regular expression is applied only to the portion of the record that is present.
  - If the record is not long enough to contain even the first byte of the field, the default value that is specified on the FIELD parameter is used. If no default value is specified, ACIF ends processing with error message APK449S.
- Transaction fields (GROUPRANGE and PAGERANGE)
  - If the regular expression does not match any text in the transaction field, no error occurs, and processing continues. A default value cannot be specified for a transaction field.
  - If the record is not long enough to contain the entire field, no error occurs, and processing continues.
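The rules above can be summarized as a small decision procedure. The following Python sketch is a model of the described behavior under stated assumptions, not ACIF code: the extract_field function and the AcifError class are hypothetical, columns are 1-based, the regular expression is assumed to be searched for within the extracted characters, and a transaction field is located by a simple column instead of an OFFSET range.

```python
import re


class AcifError(Exception):
    """Stand-in for ACIF ending with an error message."""


def extract_field(record, column, length, regex, kind, default=None):
    """Return the matched text, the default value, or None.

    kind is "GROUP", "FLOAT", or "TRANSACTION". None means that no
    value is produced and processing continues without an error.
    """
    start = column - 1
    end = start + length

    if kind == "TRANSACTION":
        # No default value can be specified for a transaction field.
        if len(record) < end:
            return None  # record too short for the entire field: no error
        m = re.search(regex, record[start:end])
        return m.group(0) if m else None

    if start >= len(record):
        # The record does not contain even the first byte of the field.
        if default is None:
            raise AcifError("APK449S")
        return default

    # The slice holds only part of the field when the record is short.
    m = re.search(regex, record[start:end])
    if m:
        return m.group(0)
    if kind == "GROUP":
        if default is None:
            raise AcifError("APK488S")
        return default
    return None  # FLOAT: no error, and the default value is not used
```

For example, with the hypothetical record "ABC xyz", a GROUP field at column 5 with the regular expression [0-9]{3} returns the default value if one is given, while the same request for a FLOAT field returns nothing and raises no error.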
Other considerations for using regular expressions
- Performance might be slower when you are using a regular expression than when you are using a text string.
- If the CPGID value is incorrect, the conversion might fail with error message APK2080I.
- If the regular expression is not valid, ACIF fails with error message APK484S.
Examples of using regular expressions
Using a regular expression for a trigger
TRIGGER1=*,1,REGEX='P[A-Z]{3} ',(TYPE=GROUP)
In this example, the regular expression matches text that begins in column 1 with the letter P, three uppercase letters, and a space. For example, PAGE.
Using a regular expression to extract a date
TRIGGER1=*,1,'1'
FIELD1=0,13,18,(REGEX='[A-Z][a-z]+ [0-9]+, [0-9]{4}',DEFAULT='January 1, 1970')
INDEX1='Date',FIELD1
In this example, the regular expression matches a date that consists of an uppercase letter, one or more lowercase letters, a space, one or more digits, a comma, a space, and four digits. For example, July 4, 1956. If no date is found that matches the regular expression pattern, a default of January 1, 1970 is used.
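The date extraction can be approximated in Python to test the pattern against sample records before it is used in ACIF. This is a sketch under assumptions: the extract_date function and the sample record are hypothetical, the field is taken from column 13 for 18 bytes as in FIELD1, and the regular expression is assumed to be searched for within those characters.

```python
import re

DATE_REGEX = r"[A-Z][a-z]+ [0-9]+, [0-9]{4}"
DEFAULT = "January 1, 1970"


def extract_date(record, column=13, length=18):
    """Return the date found in the field, or the default value."""
    start = column - 1
    field = record[start:start + length]
    m = re.search(DATE_REGEX, field)
    return m.group(0) if m else DEFAULT


# Hypothetical record: "1" in column 1 and a date that starts in column 13.
record = "1".ljust(12) + "July 4, 1956   Statement"
print(extract_date(record))
print(extract_date("1".ljust(12) + "no date here"))
```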
Using a regular expression with a transaction field
TRIGGER1=*,1,'1'
FIELD1=0,30,3
FIELD2=*,*,12,(OFFSET=(59:70),ORDER=BYROW,REGEX='[0-9]{3}-[0-9]{2}-[0-9]{4}')
INDEX1='DEPT',FIELD1,(TYPE=GROUP)
INDEX2='SOCIAL SECURITY NUMBER',FIELD2,(TYPE=GROUPRANGE)
In this example, the regular expression is used to extract social security numbers that consist of three digits, a hyphen, two digits, a hyphen, and four digits.
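The row-by-row collection of transaction field values can be sketched in Python. The collect_ssns function and the sample records are hypothetical. The sketch assumes that the OFFSET values 59 and 70 are 0-based and inclusive; if they are 1-based in your environment, adjust the slice accordingly. Records that are too short or that do not match are skipped without an error, as described for transaction fields.

```python
import re

SSN_REGEX = r"[0-9]{3}-[0-9]{2}-[0-9]{4}"


def collect_ssns(records, first=59, last=70):
    """Collect the values that match in positions first..last of each record."""
    values = []
    for record in records:
        if len(record) <= last:
            continue  # record does not contain the entire field: no error
        m = re.search(SSN_REGEX, record[first:last + 1])
        if m:
            values.append(m.group(0))
    return values


# Hypothetical records: two with a matching value, two without.
records = [
    " " * 59 + "123-45-6789 ",
    " " * 59 + "NOT-A-NUMBER",
    "short record",
    " " * 59 + "987-65-4321 ",
]
print(collect_ssns(records))
```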