ESQL field reference overview

You can use ESQL field references to form paths to message body elements.

The full syntax for field references is shown in the following examples:

PathElement

A field reference consists of a correlation name, followed by zero or more Path Elements separated by periods (.). The correlation name identifies a well-known starting point and must be the name of a constant, a declared variable (scalar, row or reference), or one of the predefined start points; for example, InputRoot. The path Fields define a path from the start point to the desired field.

See:

Namespace for the meaning of the different combinations of namespace and name
Target field references for the meaning of the different combinations of field references
Index for the meaning of the different combinations of index clauses
Type for the meaning of the different combinations of types

For example:

InputRoot.XMLNS.Data.Invoice

starts the integration node at the location InputRoot (that is, the root of the input message to a Compute node) and then performs a sequence of navigations. First, it navigates from root to the first child field called XMLNS, then to the first child field of the XMLNS field called Data. Finally, the integration node navigates to the first child field of the Data field called Invoice. Whenever this field reference occurs in an ESQL program, the invoice field is accessed.

This form of field reference is simple, convenient, and is the most commonly used. However, it does have two limitations:

Because the names used must be valid ESQL identifiers, you can use only names that conform to the rules of ESQL. That is, the names can contain only alphanumeric characters including underscore, the first character cannot be numeric, and names must be at least one character long. You can avoid these limitations by enclosing names not conforming to these rules in double quotation marks. For example:
```
InputRoot.XMLNS."Customer Data".Invoice
```
If you need to refer to fields that contain quotation marks, use two pairs of quotation marks around the reference. For example:
```
Body.Message."""hello"""
```
Some identifiers are reserved as keywords but, with the exception of the correlation name, you can use them in field references without the use of double quotation marks
Because the names of the fields appear in the ESQL program, they must be known when the program is written. This limitation can be avoided by using the alternative syntax that uses braces ( { ... } ). This syntax allows you to use any expression that returns a non-null value of type character.
For example:
```
InputRoot.XMLNS."Customer Data".{'Customer-' || 
	CurrentCustomer}.Invoice
```
in which the invoices are contained in a folder with a name is formed by concatenating the character literal Customer- with the value in CurrentCustomer (which in this example must be a declared variable of type character).

You can use the asterisk (*) wildcard character in a path element to match any name. For example:

   InputRoot.XMLNS.*.Invoice.Value

matches any path element in which the invoices are contained.

Note that enclosing anything in double quotation marks in ESQL makes it an identifier; enclosing anything in single quotation marks makes it a character literal. You must enclose all character strings in single quotation marks.

Namespace

Field names can belong to namespaces. Field references provide support for namespaces as follows:

Each field of each field reference that contains a name clause can also contain a namespace clause defining the namespace to which the specified name belongs.
Each namespace name can be defined by either a simple identifier or by an expression (enclosed in curly braces). If an identifier is the name of a declared namespace constant, the value of the constant is used. If an expression is used, it must return a non-null value of type character.
A namespace clause of * explicitly states that namespace information is to be ignored when locating Fields in a tree.
A namespace clause consisting of only : explicitly targets the notarget namespace. The clause has no identifier, expression or wildcard (*).

For example:

DECLARE sp1 NAMESPACE 'http://www.ibm.com/space1';

/* Namespace declaration to associate prefix 'space1' with the namespace */

SET OutputRoot.XMLNS.TestCase.(XML.NamespaceDecl)xmlns:space1 = 'http://www.ibm.com/space1'; 
SET OutputRoot.XMLNS.TestCase.sp1:data1 = 'Hello!';

generates:

<TestCase xmlns:space1="http://www.ibm.com/space1">
<space1:data1>Hello!</space1:data1>
</TestCase>

Index

Each field of a field reference can contain an index clause. This clause is denoted by brackets ( [ ... ] ) and accepts any expression that returns a non-null value of type integer. This clause identifies which of several fields with the same name is to be selected. Fields are numbered, starting at one. If this clause is not present, it is assumed that the first field is required. Therefore, the following two examples have exactly the same meaning:

InputRoot.XMLNS.Data[1].Invoice
InputRoot.XMLNS.Data.Invoice[1]

This construct is most commonly used with an index variable, so that a loop steps though all such fields in sequence. For example:

WHILE count < 32 DO
     SET TOTAL = TOTAL + InputRoot.XMLNS.Data.Invoice[count].Amount;
     SET COUNT = COUNT + 1 
END WHILE;

Use this kind of construct with care, because it implies that the integration node must count the fields from the beginning each time round the loop. If the repeat count is large, performance will be poor. In such cases, a better alternative is to use a field reference variable.

Index expressions can optionally be preceded by a less-than sign ( < ), indicating that the required field is to be indexed from the last field, not the first. In this case, the index 1 refers to the last field and the index 2 refers to the penultimate field. For completeness, you can use a greater-than sign to indicate counting from the first field. The following example shows ESQL code that handles indexes where there are four fields called Invoice.

InputRoot.XMLNS.Data.Invoice       -- Selects the first
InputRoot.XMLNS.Data.Invoice[1]    -- Selects the first
InputRoot.XMLNS.Data.Invoice[>]    -- Selects the first
InputRoot.XMLNS.Data.Invoice[>1]   -- Selects the first
InputRoot.XMLNS.Data.Invoice[>2]   -- Selects the second
InputRoot.XMLNS.Data.Invoice[<]    -- Selects the fourth
InputRoot.XMLNS.Data.Invoice[<1]   -- Selects the fourth
InputRoot.XMLNS.Data.Invoice[<2]   -- Selects the third
InputRoot.XMLNS.Data.Invoice[<3]   -- Selects the second

An index clause can also consist of an empty pair of brackets ( [] ). This selects all fields with matching names. Use this construct with functions and statements that expect lists (for example, the SELECT, CARDINALITY, SINGULAR, and EXISTS functions, or the SET statement) .

Type

Each field of a field reference can contain a type clause. These are denoted by parentheses ( ( ) ), and accept any expression that returns a non-null value of type integer. The presence of a type expression restricts the fields that are selected to those of the matching type. This construct is most commonly used with generic XML, where there are many field types and it is possible for one XML field to contain both attributes and further XML Fields with the same name.

For example:

<Item Value = '1234'>
     <Value>5678</Value>
</Item>

Here, the XML field Item has two child Fields, both called Value. The child Fields can be distinguished by using type clauses: Item.(<Domain>.Attribute)Value to select the attribute, and Item.(XML.Element)Value to select the field, where <Domain> is one of XML, XMLNS, or XMLNSC, as determined by the message domain of the source.

Type constraints

A type constraint checks the data type returned by a field reference.

Notes:

¹ ScalarDataTypeName can be any one of BOOLEAN, INTEGER, INT, FLOAT, DECIMAL, DEC, DATE, TIME, TIMESTAMP, GMTTIME, GMTTIMESTAMP, INTERVAL, CHARACTER, CHAR, BLOB, BIT.

Typically, a type constraint causes the scalar value of the reference to be extracted (in a similar way to the FIELDVALUE function) and an exception to be thrown if the reference is not of the correct type. By definition, an exception will be thrown for all nonexistent fields, because these evaluate to NULL. This provides a convenient and fast way of causing exceptions if essential fields are missing from messages.

However, when type constraints occur in expressions that are candidates for being passed to a database (for example, they are in a WHERE clause), the information is used to determine whether the expression can be given to the database. This can be important if a WHERE clause contains a CAST operating on a database table column. In the absence of a type constraint, such expressions cannot be given to the database because the integration node cannot tell whether the database is capable of performing the required conversion. Note, however, that you should always exercise caution when using casts operating on column values, because some databases have exceedingly limited data conversion capabilities.

Field references summary

*, *[..], (..)*, (..)*[..]

None of these forms specifies a name or namespace. The target field can have any name, in any namespace or in no namespace. It is located solely by its type, its index, or its type and index, as appropriate.

Name, Name[..], (..)Name, (..)Name[..]

All these forms specify a name but no namespace. The target field is located by namespace and name, and also by type and index where appropriate.

The namespace is taken to be the only namespace in the namespace path containing this name. The only namespace that can be in the path is the notarget namespace.

These forms all existed before namespaces were introduced. Although their behavior has changed in that they now compare both name and namespace, existing transforms should see no change in their behavior because all existing transforms create their Fields in the notarget namespace.

:*, :*[..], (..):*, (..):*[..]

All these forms specify the notarget namespace but no name. The target field is located by its namespace and also by type and index where appropriate.

:Name, :Name[..], (..):Name, (..):Name[..]

All these forms specify a name and the notarget namespace. The target field is located by namespace and name and also by type and index where appropriate.

*:*, *:*[..], (..)*:*, (..)*:*[..]

None of these forms specifies a name or a namespace. Note that *:* is equivalent to *, and matches no namespace as well as any namespace. The target field can have any name, in any namespace or in no namespace. It is located solely by its type, its index, or its type and index, as appropriate.

*:Name, *:Name[..], (..)*:Name, (..)*:Name[..]

All these forms specify a name but no namespace. The target field is located by name and also by type and index where appropriate.

Namespace:*, Namespace:*[..], (..)Namespace:*, (..)Namespace:*[..]

All these forms specify a namespace but no name. The target field is located by namespace and also by type and index where appropriate.

Namespace:Name, Namespace:Name[..], (..)Namespace:Name, (..)Namespace:Name[..]

All these forms specify a namespace and name. The target field is located by namespace and name and also by type and index where appropriate.

In all the preceding cases a name, or namespace, provided by an expression contained in braces ({}) is equivalent to a name provided as an identifier.

By definition, the name of the notarget namespace is the empty string. The empty string can be selected by expressions which evaluate to the empty string, the empty identifier "", or by reference to a namespace constant defined as the empty string.

Target field references

The use of field references usually implies searching for an existing field. However, if the required field does not exist, as is usually the case for field references that are the targets of SET statements and those in the AS clauses of SELECT functions, it is created.

In these situations, there are a variety of circumstances in which the integration node cannot tell what the required name or namespace is, and in these situations the following general principles apply :

If the name clause is absent or does not specify a name, and the namespace clause is absent or does not specify or imply a namespace (that is, there is no name or namespace available), one of the following conditions applies:
- If the assignment algorithm does not copy the name from some existing field, the new field has both its name and namespace set to the empty string and its name flag is not set automatically.
  In the absence of a type specification, the field's type is not Name or NameValue, which effectively indicates that the new field is nameless
  .
- Otherwise, if the assignment algorithm copies the name from some existing field, the new field has both its name and namespace copied from the existing field and its Name flag is set automatically
If the name clause is present and specifies a name, but the namespace clause is absent or does not specify or imply a namespace (that is, a name is available but a namespace is not), the new field has its:
- Name set to the given value
- Namespace set to the empty string
- Name flag set automatically
If the name clause is absent or does not specify a name, but the namespace clause is present and specifies or implies a namespace (that is, a namespace is available but a name is not), the new field has its:
- Namespace set to the given value
- Name set to the empty string
- Name flag set automatically
If the name clause is present and specifies a name, and the namespace clause is present and specifies or implies a namespace, the new field has its:
- Name set to the given value
- Namespace set to the given value
- Name flag set automatically

There are also cases where the integration node creates Fields in addition to those referenced by field references:

Tree copy: new Fields are created by an algorithm that uses a source tree as a template. If the algorithm copies the name of a source field to a new field, its namespace is copied as well.
Anonymous select expressions: SELECT clauses are not obliged to have AS clauses; those that do not have them, set the names of the newly created Fields to default values (see SELECT function).
These defaults can be derived from field names, column names or can simply be manufactured sequence names. If the name is an field name, this is effectively a tree copy, and the namespace name is copied as above.

Otherwise, the namespace of the newly-created field is derived by searching the path, that is, the name is be treated as the NameId syntax of a field reference.

The effect of setting a field to NULL

Take care when assigning a null value to a field. For example, the following command deletes the Name field:

 SET OutputRoot.XMLNS.Msg.Data.Name = NULL;  -- this deletes the field

The correct way to assign a null value to a field is as follows:

SET OutputRoot.XMLNS.Msg.Data.Name VALUE = NULL;  
-- this assigns a NULL value to a field without deleting it

Note: to users on compatibility with earlier versions

For compatibility with earlier versions, the LAST keyword is still supported, but its use is deprecated. LAST cannot be used as part of an index expression: [LAST] is valid, and is equivalent to [<], but [LAST3] is not valid.

The LAST keyword has been replaced by the following arrow syntax, which allows both a direction of search and index to be specified:

      Field [> ]                   -- The first field, equivalent to [ 1 ]
      Field [> (a + b) * 2 ]
      Field [ < ]                   -- The last field, equivalent to [ LAST ]
      Field [ < 1 ]	                -- The last field, equivalent to [ LAST ]
      Field [ < 2 ]	                -- The last but one field
      Field [ < (a + b) / 3 ]