String extraction
The MatchPattern() rule can also extract text from the target string by creating variables based on sub-patterns found in the target string. These variables take the name REGEXN, where N is a number which identifies the variable.
Example
The following example shows how variables are assigned.
[1]text target="Testing Testing One 2 Three";
[2]text pattern="([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+([[:digit:]]+)[[:space:]]+([^[:space:]]+).*";
[3]int count=MatchPattern( target,pattern );
[4]while( count > 0 )
[5] {
[6] Print("Count ", count);
[7] Print("Match ", REGEX0);
[8] Print("SubMatch1 ", REGEX1);
[9] Print("SubMatch2 ", REGEX2);
[10] Print("SubMatch3 ", REGEX3);
[11] Print("SubMatch4 ", REGEX4);
[12] Print("SubMatch5 ", REGEX5);
[13]
[14]
[15] count = count - 1;
}
Each pair of round brackets in the pattern string defined in line 2 identifies a sub-pattern to be extracted and placed in a REGEX variable.
In this example there is only one match, so there is one variable (REGEX0) that holds the text matching the full pattern, and five variables (REGEX1 to REGEX5) that hold the sub-matches, set as shown in the following table.
Variable | Value | Type |
---|---|---|
REGEX0 | Testing Testing One 2 Three | Match |
REGEX1 | Testing | Sub-match |
REGEX2 | Testing | Sub-match |
REGEX3 | One | Sub-match |
REGEX4 | 2 | Sub-match |
REGEX5 | Three | Sub-match |
If more than one match is found, more REGEX variables are created.
For example, if this pattern produced three matches, then 18 REGEX variables (REGEX0 to REGEX17) would be created. In this case, the variables REGEX0, REGEX6, and REGEX12 would correspond to the full matching pattern, while variables REGEX1-5, REGEX7-11, and REGEX13-17 would correspond to the sub-patterns within each full match.
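The numbering scheme described above can be sketched in Python, used here only as a stand-in for illustration. This is not the MatchPattern() implementation. The helper name regex_variables is hypothetical, the use of re.finditer to locate successive matches is an assumption, and because Python's re module has no POSIX character classes, [^[:space:]], [[:space:]] and [[:digit:]] are written as \S, \s and \d.

```python
import re


def regex_variables(pattern, target):
    """Hypothetical helper: number each full match and its sub-matches
    consecutively, in the order described in the text."""
    variables = {}
    index = 0
    for match in re.finditer(pattern, target):
        # The full match takes the next free number...
        variables[f"REGEX{index}"] = match.group(0)
        index += 1
        # ...followed by one variable per bracketed sub-pattern.
        for sub in match.groups():
            variables[f"REGEX{index}"] = sub
            index += 1
    return variables


# The example pattern, with POSIX classes translated for Python's re module.
pattern = r"(\S+)\s+(\S+)\s+(\S+)\s+(\d+)\s+(\S+).*"
single = regex_variables(pattern, "Testing Testing One 2 Three")

# An invented three-line target, so that a five-sub-pattern
# expression matches three times.
multi_pattern = r"(\S+) (\S+) (\S+) (\d+) (\S+)"
multi = regex_variables(multi_pattern, "a b c 1 d\ne f g 2 h\ni j k 3 l")

for name, value in multi.items():
    print(name, value)
```

With five sub-patterns, match number k (counting from zero) has its full text in REGEX(6k) and its sub-matches in the five variables that follow, which is why the full matches land on 0, 6 and 12.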
If you run the MatchPattern() rule a second time, it overwrites the values previously held in the REGEX variables. Store any extracted data that you still need before running the MatchPattern() rule a second time.