The custom global analysis logic is implemented by creating a Jaql (Query Language for JSON) script.
The inputs for the script are the fields, facets, and text that are extracted from the content during the document processing stage. Use the readGAInput(GAOptions) function to get the document fields, facets, and text content in JSON format. Use the writeGAOutput(GAOptions) function to store the output from the script as document fields or facets in the Watson Explorer Content Analytics index.
GAOptions is a JSON record that contains the necessary parameters. Obtain it by calling the getGAOptions($MetaTrackerJaqlVars) function, which always requires $MetaTrackerJaqlVars as its argument. To call these functions, import the modules in the ica::ga namespace. The following example shows a sample custom global analysis Jaql script:
```
import ica::ga(*);
options:=getGAOptions($MetaTrackerJaqlVars);
readGAInput(options)
  -> someOperation()
  -> anotherOperation()
  -> writeGAOutput(options);
```
The function readGAInput() returns an array of JSON records. Each record represents a separate document that was processed by Watson Explorer Content Analytics. Each record can contain field values, facet values, and textual content, as configured in the administration console. The following table lists the fields that can be included in the JSON records.
Field name | Required or Optional | Remarks |
---|---|---|
uri | Required | The document ID, such as the URL. |
content | Required | Contains the text that the parser extracted from the document content. |
metadata | Required | Contains information about the document metadata. |
fields | Optional | An array of records that contain information about the metadata fields. Each record contains the name and value fields. |
name | Optional | The name of a document field. |
value | Optional | The value of a document field. |
docfacets | Optional | An array of records that contain information about the metadata facets. Each record contains the path and keyword fields. |
path | Optional | The facet path. |
keyword | Optional | A value associated with this facet. |
textfacets | Optional | An array of records that contain information about the facets that come from an annotation. Each record contains the begin, end, path, and keyword fields. |
begin | Optional | For facets that come from an annotation, the character position that marks the beginning of the annotation. |
end | Optional | For facets that come from an annotation, the character position that marks the end of the annotation. |
The following example shows an array of two such records:

```json
[
  {
    "uri" : "jdbc://ICA/APP.CLAIM/ID/0",
    "content" : "[Pack] The straw was peeled off from the juice pack.",
    "metadata" : {
      "fields" : [ {
        "name" : "date",
        "value" : "1199113200000"
      }, {
        "name" : "title",
        "value" : "lemon tea - Package / container"
      } ],
      "docfacets" : [ {
        "path" : [ "date", "2008", "1", "1", "0" ],
        "keyword" : ""
      }, {
        "path" : [ "product" ],
        "keyword" : "lemon tea"
      } ],
      "textfacets" : [ {
        "begin" : 1,
        "end" : 5,
        "path" : [ "_word", "noun", "general" ],
        "keyword" : "pack"
      }, {
        "begin" : 11,
        "end" : 16,
        "path" : [ "_word", "adj" ],
        "keyword" : "straw"
      } ]
    }
  }, {
    "uri" : "jdbc://ICA/APP.CLAIM/ID/1",
    "content" : "I got some ice cream for my children, but there was something like a piece of thread inside the cup.",
    "metadata" : {
      "fields" : [ {
        "name" : "date",
        "value" : "1199199600000"
      }, {
        "name" : "title",
        "value" : "vanilla ice cream - Contamination / tampering"
      } ],
      "docfacets" : [ {
        "path" : [ "date", "2008", "1", "2", "0" ],
        "keyword" : ""
      }, {
        "path" : [ "product" ],
        "keyword" : "vanilla ice cream"
      } ],
      "textfacets" : [ {
        "begin" : 2,
        "end" : 5,
        "path" : [ "_word", "verb" ],
        "keyword" : "get"
      }, {
        "begin" : 11,
        "end" : 14,
        "path" : [ "_word", "noun", "general" ],
        "keyword" : "ice"
      } ]
    }
  }
]
```
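To make the record layout concrete, the following sketch walks one record of this shape. It is written in Python rather than Jaql, purely for illustration, and it is not part of the product API. The helper names field_values and annotated_spans are hypothetical, and treating end as an exclusive offset is an assumption inferred from the sample data, where positions 1 to 5 cover the word "Pack".

```python
# Illustrative model of one record returned by readGAInput(); the data is an
# abbreviated copy of the first sample record.
record = {
    "uri": "jdbc://ICA/APP.CLAIM/ID/0",
    "content": "[Pack] The straw was peeled off from the juice pack.",
    "metadata": {
        "fields": [{"name": "title", "value": "lemon tea - Package / container"}],
        "docfacets": [{"path": ["product"], "keyword": "lemon tea"}],
        "textfacets": [
            {"begin": 1, "end": 5, "path": ["_word", "noun", "general"], "keyword": "pack"},
            {"begin": 11, "end": 16, "path": ["_word", "adj"], "keyword": "straw"},
        ],
    },
}


def field_values(rec):
    """Map each document field name to its value (hypothetical helper)."""
    return {f["name"]: f["value"] for f in rec["metadata"].get("fields", [])}


def annotated_spans(rec):
    """Return (covered text, keyword) pairs for each text facet.

    Assumes begin is inclusive and end is exclusive.
    """
    text = rec["content"]
    return [
        (text[t["begin"]:t["end"]], t["keyword"])
        for t in rec["metadata"].get("textfacets", [])
    ]


titles = field_values(record)
spans = annotated_spans(record)
print(titles["title"])
print(spans)
```

The optional arrays are read with a default of an empty list because fields, docfacets, and textfacets are not guaranteed to be present in every record.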
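The writeGAOutput() function takes an array of JSON records. Each record identifies a document by its uri field, and the remaining fields are written to that document as index fields or facets, as in the following example:

```json
[
  { "uri" : "jdbc://ICA/APP.CLAIM/ID/0", "rank" : "1", "$.ranking" : "1" },
  { "uri" : "jdbc://ICA/APP.CLAIM/ID/1", "rank" : "2", "$.ranking" : "2" }
]
```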
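As a rough illustration of how output records of this shape relate to the input documents, the Python sketch below emits one record per document and ranks the documents by their position. The ranking rule and the function name build_output are invented for the example; only the uri, rank, and $.ranking keys come from the sample above.

```python
def build_output(records):
    """Emit one output record per input record, ranked by position.

    Ranking by position is an illustrative rule, not product behavior.
    """
    out = []
    for position, rec in enumerate(records, start=1):
        rank = str(position)  # the sample output stores ranks as strings
        out.append({"uri": rec["uri"], "rank": rank, "$.ranking": rank})
    return out


docs = [
    {"uri": "jdbc://ICA/APP.CLAIM/ID/0"},
    {"uri": "jdbc://ICA/APP.CLAIM/ID/1"},
]
output = build_output(docs)
print(output)
```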
For index fields, the value of the JSON record field is stored in a new index field. For the name of the new index field, the prefix custom_ is added to the name of the index field. In the previous example, if an index field with the name rank is configured for the collection, a new index field with the name custom_rank and the value 1 is added to the document in the index. Some attributes of the custom_ index fields are inherited from the original index field, as described in the following table. For example, if the parametric searchable attribute is selected for the rank index field, the custom_rank index field is also parametric searchable.
Attribute of custom_ index field | How the value is determined |
---|---|
Returnable | Inherited from the original index field. |
Faceted search | To create a facet, directly assign a value to the facet by using notation that starts with $. to indicate the facet path. |
Free text search | Not free text searchable. |
In summary | Not in summary. |
Fielded search | Inherited from the original index field. |
Exact match | Inherited from the original index field. |
Case sensitive | Inherited from the original index field. |
Parametric search | Inherited from the original index field. |
Text sortable | Inherited from the original index field. |
Analyzable | Not analyzable. |
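The table can be read as a simple rule set. The Python sketch below encodes it for illustration only: the attribute keys and the function derive_custom_field are hypothetical labels for the table rows, not product configuration keys.

```python
# Attributes that the custom_ field copies from the original index field.
INHERITED = (
    "returnable",
    "fielded_search",
    "exact_match",
    "case_sensitive",
    "parametric_search",
    "text_sortable",
)
# Attributes that are never enabled on a custom_ field.
ALWAYS_OFF = ("free_text_search", "in_summary", "analyzable")


def derive_custom_field(name, original_attrs):
    """Return the custom_ field name and its attributes, following the table."""
    attrs = {key: bool(original_attrs.get(key, False)) for key in INHERITED}
    attrs.update({key: False for key in ALWAYS_OFF})
    # Faceted search is not derived here: facets are assigned directly
    # through output record keys that start with "$.".
    return "custom_" + name, attrs


custom_name, custom_attrs = derive_custom_field(
    "rank", {"parametric_search": True, "analyzable": True}
)
print(custom_name, custom_attrs)
```

In this model, an original rank field that is parametric searchable and analyzable yields a custom_rank field that is parametric searchable but not analyzable.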
For facets, if the collection includes a facet with the same facet path as the JSON record field name, the value of the JSON record field is stored in that facet. In the previous example, if there is an existing facet with the facet path $.ranking, the value 1 is stored in that facet. When you specify the facet path, ensure that it starts with $ and that the facet path components are joined by periods. For example, the facet path $.ranking corresponds to the root facet with the name ranking.
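The facet path notation can be sketched as a small helper. This Python example is illustrative only: the function names are hypothetical, and the nested path product.category is an invented example of a path with more than one component.

```python
def facet_key(components):
    """Join facet path components into the $-prefixed, period-separated form."""
    if not components:
        raise ValueError("a facet path needs at least one component")
    return "$." + ".".join(components)


def is_facet_key(key):
    """Return True when an output record key uses the facet path notation."""
    return key.startswith("$.")


root = facet_key(["ranking"])
nested = facet_key(["product", "category"])
print(root, nested)
```

With such a helper, the keys of an output record can be separated into facet assignments (keys for which is_facet_key returns True) and index field assignments (all other keys except uri).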
The Jaql script can also save the output to a file, or in some other format, so that another application can use the data. For example, you can write the data to a JSON file on the local computer:
```
readGAInput(options)
  -> someOperation()
  -> write(file('/home/biadmin/ica_out.json'));
```
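A consuming application could then load the file. The Python sketch below assumes that the file contains a single JSON array of records that each carry a uri field, which depends on what someOperation() produces and on how the script serializes its output. The function load_results is a hypothetical name, and a temporary stand-in file is written first so that the example runs without the product.

```python
import json
import os
import tempfile


def load_results(path):
    """Load a JSON array of records and index the records by uri."""
    with open(path, encoding="utf-8") as handle:
        records = json.load(handle)
    return {rec["uri"]: rec for rec in records}


# Stand-in for the file that the Jaql script would write.
sample = [{"uri": "jdbc://ICA/APP.CLAIM/ID/0", "rank": "1"}]
path = os.path.join(tempfile.mkdtemp(), "ica_out.json")
with open(path, "w", encoding="utf-8") as handle:
    json.dump(sample, handle)

by_uri = load_results(path)
print(by_uri["jdbc://ICA/APP.CLAIM/ID/0"]["rank"])
```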