Feature paths

A feature path provides a way to access feature values in the common analysis structure (CAS), much as XPath expressions are used to access elements in an XML document.

Feature paths are useful if you want to access a feature structure that has complex features, for example features that are array-valued or that point to another feature structure. Using a feature path, you can associate the value of a nested feature directly with a feature structure, and store this value in the semantic search index or in a database.

For example, consider an annotator that identifies cars and their makes. It creates annotations of type car that have an attribute make. However, make does not contain the actual company (for example, Chevrolet) but contains a feature structure of type Company, which itself has a string-valued attribute companyname. To enable a semantic query that combines car names and company names, a feature path make/companyname is used to attach the value of companyname to the car span that is generated for the car annotation. This enables the query, "Give me documents that contain cars made by Chevrolet", by using '/car[@make="Chevrolet"]'.
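
The car example can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: feature structures are modeled as plain dictionaries, and the resolve function, the begin and end offsets, and the car_span dictionary are hypothetical names introduced for this sketch.

```python
# Feature structures modeled as plain dicts (an assumption for illustration;
# a real CAS holds typed feature structures).
company = {"type": "Company", "companyname": "Chevrolet"}
car = {"type": "car", "begin": 10, "end": 16, "make": company}


def resolve(fs, path):
    """Follow each feature name of a slash-separated path, starting at fs."""
    value = fs
    for feature in path.split("/"):
        value = value[feature]
    return value


# Attach the value of make/companyname to the car span as the attribute "make".
car_span = {
    "begin": car["begin"],
    "end": car["end"],
    "make": resolve(car, "make/companyname"),
}
print(car_span["make"])  # prints Chevrolet
```

With the company name stored directly on the span, a query such as '/car[@make="Chevrolet"]' only has to compare a string attribute and does not need to follow the reference to the Company feature structure.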

A feature path is a sequence of feature names (f1/.../fn) with the following properties:

The value of a feature path can be String, Integer, Float, or an array of one of those types.
All features within the path from f1 to fn-1 must be of a complex type, that is, of type uima.cas.TOP, uima.cas.FSArray, uima.cas.FSList, or one of their subtypes.
The last feature fn in the path can be of a complex type. It can also be of type uima.cas.Float, uima.cas.Integer, uima.cas.String, uima.cas.FloatArray, uima.cas.IntegerArray, uima.cas.StringArray, uima.cas.FloatList, uima.cas.IntegerList, or uima.cas.StringList, or of a subtype of one of these types.
Optionally, a feature can be typed. To type a feature, prepend the fully qualified type name to the feature name, separated by a colon. For example, f1/com.ibm.es.SomeType:f2/…/fn.
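
The path syntax described above can be illustrated with a small parser. This is a minimal sketch under stated assumptions: Step and parse_feature_path are hypothetical names, the three-element path in the usage line is made up for the example, and bracket selectors for arrays are not handled here.

```python
from typing import List, NamedTuple, Optional


class Step(NamedTuple):
    type_name: Optional[str]  # fully qualified type name, or None if untyped
    feature: str              # feature name


def parse_feature_path(path: str) -> List[Step]:
    """Split a feature path into steps with an optional type prefix each."""
    steps = []
    for part in path.split("/"):
        # rpartition splits at the last colon; the type part is "" if absent.
        type_name, _, feature = part.rpartition(":")
        steps.append(Step(type_name or None, feature))
    return steps


for step in parse_feature_path("f1/com.ibm.es.SomeType:f2/f3"):
    print(step)
```

Because type names contain dots but no slashes or colons, splitting first on the slash and then on the colon is enough to separate the type from the feature name.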

You can narrow the type scope of a particular feature. For example, consider a feature additionalInfo of type uima.cas.TOP. If you know that the value of your feature additionalInfo is actually of type EmployeeInfo, which has the feature salary, you can access this feature using additionalInfo/EmployeeInfo:salary. In this example, the feature path additionalInfo/salary would result in an error, because salary is not defined for the type uima.cas.TOP.
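
The following sketch shows why the typed path succeeds and the untyped one fails. It is an illustration only: the TYPE_FEATURES table, the resolve function, and the owning type Employee are hypothetical, feature structures are modeled as dictionaries, and subtype relationships are ignored (a type prefix must match the value's type exactly).

```python
# Hypothetical type system: type name -> {feature name: declared range type}.
TYPE_FEATURES = {
    "uima.cas.TOP": {},
    "Employee": {"additionalInfo": "uima.cas.TOP"},
    "EmployeeInfo": {"salary": "uima.cas.Integer"},
}


def resolve(fs, fs_type, path):
    """Resolve a path, checking each feature against the declared type."""
    value, static_type = fs, fs_type
    for part in path.split("/"):
        type_name, _, feature = part.rpartition(":")
        if type_name:
            # Narrowing: the value must really be of the named type.
            if value["type"] != type_name:
                raise TypeError(f"value is not of type {type_name}")
            static_type = type_name
        if feature not in TYPE_FEATURES[static_type]:
            raise KeyError(f"{feature} is not defined for type {static_type}")
        static_type = TYPE_FEATURES[static_type][feature]
        value = value[feature]
    return value


employee = {
    "type": "Employee",
    "additionalInfo": {"type": "EmployeeInfo", "salary": 50000},
}

print(resolve(employee, "Employee", "additionalInfo/EmployeeInfo:salary"))  # prints 50000

try:
    resolve(employee, "Employee", "additionalInfo/salary")
except KeyError as error:
    print("error:", error)  # salary is not defined for type uima.cas.TOP
```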

Features that are array- or list-valued have the following additional properties:

Use brackets ([<number>]) to select a certain element in the array or list. Indexes start at zero (0). For example, to select the first element in the companies array, use companies[0]. The special marker [last] selects the last entry in an array, irrespective of its size, for example, companies[last].
Use empty brackets ([]) to denote all elements. Only one pair of empty brackets ([]) is allowed in a feature path. For example, if there is an array of suspects, the feature path knownSuspects[]/com.ibm.omnifind.types.Suspect:surName collects the last names of all suspects into a String array.
When a feature path that returns an array is used during indexing, the array elements are concatenated (separated by white space) and written to the index as a single, multi-term attribute or field.
The element that follows an array- or list-valued feature in the feature path must be typed. The type name is the type of the elements within the array. For example, consider a feature structure of type Info. This type has a feature named companies, whose range is an FSArray. The elements of the array are of type Company. Company, in turn, has a feature named profit. To obtain the profit of the third company, write (using fully qualified type names) companies[2]/Company:profit.
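
The bracket selectors can be sketched as follows. This is a rough illustration and not the product's implementation: the resolve function and the STEP pattern are hypothetical, arrays are modeled as Python lists of dictionaries, the sample profits and surnames are made up, and the type prefix is parsed but not checked.

```python
import re

# One path step: optional "Type:" prefix, feature name, optional [n], [last] or [].
STEP = re.compile(
    r"^(?:(?P<type>[\w.]+):)?(?P<feature>\w+)(?:\[(?P<index>\d+|last|)\])?$"
)


def resolve(fs, path):
    """Resolve a feature path that may contain array selectors."""
    values, all_elements = [fs], False
    for part in path.split("/"):
        match = STEP.match(part)
        if match is None:
            raise ValueError(f"malformed path step: {part}")
        feature, index = match["feature"], match["index"]
        values = [value[feature] for value in values]
        if index is None:      # no brackets on this step
            continue
        if index == "":        # [] selects every element
            if all_elements:
                raise ValueError("only one [] is allowed in a feature path")
            all_elements = True
            values = [element for array in values for element in array]
        elif index == "last":  # [last] selects the final element
            values = [array[-1] for array in values]
        else:                  # [n] selects by zero-based index
            values = [array[int(index)] for array in values]
    return values if all_elements else values[0]


info = {"companies": [{"profit": 10}, {"profit": 20}, {"profit": 30}]}
print(resolve(info, "companies[0]/Company:profit"))     # prints 10
print(resolve(info, "companies[2]/Company:profit"))     # prints 30
print(resolve(info, "companies[last]/Company:profit"))  # prints 30

case = {"knownSuspects": [{"surName": "Moriarty"}, {"surName": "Moran"}]}
names = resolve(case, "knownSuspects[]/com.ibm.omnifind.types.Suspect:surName")
print(names)            # prints ['Moriarty', 'Moran']
print(" ".join(names))  # prints Moriarty Moran
```

The last line mirrors the indexing behavior described above, where the elements of an array result are joined with white space into a single multi-term value.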