String affiliation

String affiliation is a language construct used to extract variables from plain text by using pattern-based templates.

Syntax

A string affiliation starts with a dollar sign followed by a backtick character ($`) or a double dollar sign followed by a backtick character ($$`), and ends at the closing backtick. These prefixes control how the literal parts of the pattern are matched:

  • Single dollar ($`): Ignores insignificant whitespace in the text during matching.
  • Double dollar ($$`): Matches text exactly, including all whitespace and characters.
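
As a rough illustration of the two prefixes, the following Python sketch approximates them with regular expressions. It assumes that "insignificant whitespace" means that any run of whitespace in the pattern can match any run of whitespace in the text. That reading, and the helper names loose_regex and exact_regex, are assumptions made for illustration; they are not the actual implementation.

```python
import re

def exact_regex(literal: str) -> re.Pattern:
    # Analogue of $$`...`: every character, including whitespace,
    # must appear in the text exactly as written.
    return re.compile(re.escape(literal))

def loose_regex(literal: str) -> re.Pattern:
    # Analogue of $`...` under the ASSUMED semantics: each run of
    # whitespace in the pattern matches any run of whitespace in the text.
    parts = [re.escape(part) for part in literal.split()]
    return re.compile(r"\s+".join(parts))

text = "price   is\n34 euros"
print(bool(loose_regex("price is 34 euros").search(text)))  # True
print(bool(exact_regex("price is 34 euros").search(text)))  # False
```

Under this reading, the single-dollar form suits text whose spacing or line breaks vary, and the double-dollar form suits input whose layout is fixed.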

The general pattern format is:

$`<text>{<placeholder>}[text]`

Placeholders must be enclosed in braces. Table 1 lists the placeholder syntax formats and what each one uses to extract a variable.

Table 1. Placeholder syntax

Syntax                            Description
{name:type}                       Uses a built-in type (for example number, string)
{name:"regex"}                    Uses a custom regular expression
{name:type:format description}    Uses formatting for date, number, and other types.

Note: Built-in types include Boolean, number, string, date and time, and others. Any type that supports JSON serialization can also be used in a placeholder.
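
To make the first two placeholder forms concrete, here is a minimal Python sketch that translates a pattern into a regular expression with named groups. The function compile_pattern and the TYPE_REGEX table are hypothetical, and the regular expressions chosen for number and string are assumptions. The sketch does not convert captured text to typed values and does not handle the {name:type:format description} form.

```python
import re

# ASSUMED regexes for two built-in types; the real definitions may differ.
TYPE_REGEX = {"number": r"-?\d+(?:\.\d+)?", "string": r"\S+"}

# Matches {name:type} or {name:"regex"}.
PLACEHOLDER = re.compile(r'\{(\w+):(?:"([^"]*)"|(\w+))\}')

def compile_pattern(template: str) -> re.Pattern:
    out, pos = [], 0
    for m in PLACEHOLDER.finditer(template):
        # Literal text between placeholders is matched verbatim.
        out.append(re.escape(template[pos:m.start()]))
        name, custom, type_name = m.groups()
        body = custom if custom is not None else TYPE_REGEX[type_name]
        out.append(f"(?P<{name}>{body})")
        pos = m.end()
    out.append(re.escape(template[pos:]))
    return re.compile("".join(out))

m = compile_pattern("price is {value:number} euros").search(
    "the price is 34 euros and this is cheap")
print(m.group("value"))  # 34 (as a string; no type conversion here)

m = compile_pattern('Name: {name:"[A-Z][a-z]+"}').search("Name: Smith")
print(m.group("name"))   # Smith
```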

Examples

String affiliations are used together with the find first and find all constructs to extract variables from plain text.

  • find first matches the first occurrence of a pattern and declares variables from it.
  • find all performs multiple matches. It extracts variables from every matching part of the input. The rule is triggered once for each successful match.

If any required variable is missing in a match, the rule is not triggered.
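
The difference between the two constructs resembles the difference between a single regular-expression search and an iteration over all matches. The Python sketch below is only an analogy; the sample text is made up, and the print calls stand in for a rule being triggered.

```python
import re

NAME = re.compile(r"Name: (?P<name>[A-Z][a-z]+)")
text = "Name: Smith; Name: Johnson; Name: Lee"  # hypothetical input

# Analogue of find first: only the first occurrence is used.
first = NAME.search(text)
if first:  # with no match, the rule would not be triggered
    print("first:", first.group("name"))

# Analogue of find all: one "rule firing" per successful match.
for m in NAME.finditer(text):
    print("all:", m.group("name"))
```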

Basic pattern matches
The following example extracts a number from a sentence:

$`price is {value:number} euros`

From the text "the price is 34 euros and this is cheap", value is 34.

This example matches a name by using a custom regular expression:

$`Name: {name:"[A-Z][a-z]+"}`

Match a single name by using find first
Extracts a single name from input:

definitions
    find first $`Name: {name:"[A-Z][a-z]+"}` in input;
if
   { "Smith", "Johnson" } contains name
then
   set decision to name;
  
Match two numbers and check their sum
Extracts n1 and n2, both of type number, and checks if they add up to 96:

definitions
    find first $`first: {n1:number}, second: {n2:number}` in "first: 32, second: 64";
if n1 + n2 equals 96
then print "OK";
  
Match multiple formatted variables from structured text
Extracts a date and two numbers by using format descriptions:

definitions
    find first $`
        "StartDate":"{startDate: date: the pattern for a date using "MM/dd/yy"}",
        "TotalElectricityUse":"{totalElectricityUse: number: the pattern for a number using locale "en-US"}",
        "BillCost":"{billCost: number: the pattern for a number using "¤#,##0.00"}"
                ` in Text;
then
    print `StartDate: {startDate} TotalElectricityUse: {totalElectricityUse} BillCost: {billCost}`;
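
For readers unfamiliar with format descriptions, the following Python sketch shows roughly what each of the three formats in this example accepts. It is an analogy under stated assumptions: the helper names are hypothetical, the sample values are invented, and the currency symbol is assumed to be "$".

```python
from datetime import datetime
from decimal import Decimal

def parse_date(text: str):
    # "%m/%d/%y" is the strptime counterpart of the "MM/dd/yy" pattern.
    return datetime.strptime(text, "%m/%d/%y").date()

def parse_en_us_number(text: str) -> Decimal:
    # The en-US locale uses "," for grouping and "." for decimals.
    return Decimal(text.replace(",", ""))

def parse_currency(text: str, symbol: str = "$") -> Decimal:
    # Rough analogue of "¤#,##0.00"; the "$" symbol is an ASSUMPTION.
    if not text.startswith(symbol):
        raise ValueError(f"expected currency symbol {symbol!r}")
    return Decimal(text[len(symbol):].replace(",", ""))

print(parse_date("03/01/24"))         # 2024-03-01
print(parse_en_us_number("1,234.5"))  # 1234.5
print(parse_currency("$1,234.50"))    # 1234.50
```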
  
Use multiple find first patterns on the same input
Extracts values from three different patterns:

definitions
    find first $`Company: {company:"\\w+"}` in input;
    find first $`Quantities: {quantities:"\\d+(,\\s*\\d+)*"}` in input;
    find first $`ProductNames: {products:"\\w+(,\\s*\\w+)*"}` in input;
then
   set decision to a new order where
       company is 'company',
       quantities is 'quantities' splitted by ",\\s*",
       products is 'products' splitted by ",\\s*";
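
The same extract-then-split flow can be sketched in Python. The input string below is invented for illustration, and the dictionary stands in for the order object. Note that the rule writes \\s and \\d with doubled backslashes, whereas a Python raw string needs only one.

```python
import re

# Hypothetical input text.
text = "Company: Acme Quantities: 10, 20,30 ProductNames: bolt, nut,washer"

company = re.search(r"Company: (\w+)", text).group(1)
quantities = re.search(r"Quantities: (\d+(?:,\s*\d+)*)", text).group(1)
products = re.search(r"ProductNames: (\w+(?:,\s*\w+)*)", text).group(1)

order = {
    "company": company,
    # Split on a comma followed by optional whitespace.
    "quantities": re.split(r",\s*", quantities),
    "products": re.split(r",\s*", products),
}
print(order)
```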
  
Match multiple names by using find all
Finds all matches for name in the input:

definitions
    find all $`Name: {name:"[A-Z][a-z]+"}` in input;
    set customer to a customer in 'customers' where the name of this customer is 'name';
if
    the category of customer is gold
then
    add customer to decision;