Creating an index mapping
Mapping lets you define each field as text, numeric, date, boolean, an object that groups multiple fields, or an array of any of these, using the data types that the search engine supports. Along with the data type, mapping lets you set field characteristics such as whether the field is searchable, returnable, or sortable, how it is stored, its date format, and whether strings are treated as full-text fields.
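Order Service uses Elasticsearch as the search engine. In Elasticsearch, a document in the index contains fields of different data types. You can define explicit mapping at the index level. If you do not, Elasticsearch detects the data type of each new field from the first value that it indexes and applies default indexing properties.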
This kind of auto-detection of data type and defaulting of indexing properties can result in indexing failures, inaccurate data, or unexpected index search results as illustrated in the following scenarios:
- Suppose that you do not specify explicit mapping for a field, and the value in the first document to be indexed is 2016-01-26, which resembles a date but is not intended to be one. Elasticsearch interprets the value and sets the field type to date. If the next document specifies the value of this field as 2016 01 27 (spaces instead of dashes) or 20160127 (no dashes), Elasticsearch returns an exception indicating that the input text does not conform to the date format, and indexing fails. If, however, the field was specified as a string type in a mapping file, Elasticsearch would not interpret 2016-01-26 as a date and the indexing failure would not occur.
- Suppose that no explicit mapping is assigned for a string field. Elasticsearch interprets it as a text field and treats it as analyzed. An analyzed field is parsed by the built-in standard analyzer, which splits the data into tokens at word boundaries such as spaces, converts each token to lowercase, and stores the tokens. This might not be how you want to parse your data, and it can produce unexpected or inaccurate search results, or no results at all. The ‘text’ data type is intended to support case-insensitive search and search by a part of the actual data. The Sterling™ Order Management System Software APIs do not support search by a part of the field value. So, to get the Sterling Order Management System Software API behavior, you must define the field data type as ‘keyword’.
- Elasticsearch results have two parts: Hits and Aggregations. Hits return the original documents that were indexed. In the case of term aggregations, however, results are derived from searchable terms. For a keyword field, a searchable term is the same as the original value, whereas for a text field it is any one of the parts that the analyzer derives from the original value. Each part becomes a candidate for aggregation, which leads to incorrect figures. So, any string field on which you plan to aggregate in the future, when Order Search supports it, must have the keyword type.
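The effect described in the last two scenarios can be sketched in a few lines of Python. This is only an approximation for illustration: the tokenizer below is a rough stand-in for the standard analyzer (it splits on non-alphanumeric characters and lowercases), and the function names and city values are hypothetical.

```python
import re
from collections import Counter


def analyze_as_text(value):
    """Rough stand-in for the standard analyzer: split into words, lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", value)]


def analyze_as_keyword(value):
    """A keyword field keeps the whole value as a single searchable term."""
    return [value]


def term_counts(values, analyze):
    """Count documents per searchable term, as a terms aggregation would."""
    counts = Counter()
    for value in values:
        # Each document contributes at most once to each of its terms.
        counts.update(set(analyze(value)))
    return counts


cities = ["New York", "New Delhi", "New York"]  # hypothetical field values

keyword_counts = term_counts(cities, analyze_as_keyword)
text_counts = term_counts(cities, analyze_as_text)

print(dict(keyword_counts))  # one bucket per original value
print(dict(text_counts))     # one bucket per token, e.g. "new", "york", "delhi"
```

With the keyword variant, the buckets correspond to the original city names. With the text variant, the token "new" forms a bucket of its own that spans both cities, which is the kind of incorrect figure that the third scenario warns about.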
For these reasons, Sterling Order Management System Software recommends that you define explicit mapping. To index and search the order documents, a pre-defined schema consisting of mappings and settings must be created. This documentation helps you generate the schema as per your business requirements.
Field types
You can configure the data type of a field, such as string or boolean, and its intent by using the type attribute. For example, "type": "boolean". The common data types supported by Order Search are as follows:
- String - You can save a string field as type text or keyword. The text type is used to index full-text values, such as the description of a product. These fields are analyzed by an analyzer to convert the string into a list of individual terms before being indexed. The text fields are best suited for unstructured but human-readable content. The keyword field type is used for structured content such as IDs, email addresses, hostnames, status codes, or zip codes, and the entire content of the field is indexed and searched as one unit.
- Numeric - You can use the numeric field types to define fields that hold numeric data. The supported numeric field types include long, integer, short, byte, double, and float.
- Date - A field that holds a date can be defined by using the date type. This field can hold formatted date strings.
- Boolean - This field accepts the JSON values true and false, but can also accept strings that are interpreted as either true or false.
- Object - You can use this field type for fields consisting of JSON objects, which can contain subfields.
- Arrays - The nested field type can be used for arrays of objects that must be indexed in a way that lets them be queried independently of each other.
Characteristics
You can group the field types under one or more of the following characteristics:
- Searchable - A searchable field is one that is indexed, so that the document containing the field can be searched and retrieved by the value of the field. The behavior of a searchable field varies based on whether the field is defined as analyzed or non-analyzed.
- Returnable - A returnable field is one that is stored, so that the field value can be returned as part of the search response.
- Sortable - A sortable field is one by which the search results can be sorted in a particular order, either desc or asc. The search results can be ordered by one or more sortable fields.
Default behavior
The following table describes the default behavior of the various field types:
Field Type | Searchable | Analyzed | Returnable | Sortable |
Text | Yes | Yes | No | No |
Keyword | Yes | No | No | Yes |
Numeric | Yes | No | No | Yes |
Boolean | Yes | No | No | Yes |
Date | Yes | No | No | Yes |
An object or nested field can be searchable, analyzed, returnable, or sortable based on the type and mapping parameters of these fields. To make a sub-field of a nested field returnable or sortable, set the parameter include_in_parent=true.
Mapping
The process of defining mappings is made flexible to suit your business needs. Mappings consist of a set of searchable and returnable fields such as shopper's name, phone number, address, email, item information, and other order-related information to feed an e-commerce application.
In Order Search, you must specify the mapping type for each of the fields in the index document, create a JSON document comprising all the fields, and pass it to the createSearchIndex API to create an index in the Elasticsearch-based order-search repository. You can define a particular field as searchable and returnable in the schema as shown in the following example.
"BuyerUserId":{
  "type" : "keyword",
  "index" : true,
  "store" : true
}
Here, the BuyerUserId field is defined as type keyword, made searchable by specifying index:true, and made returnable by specifying store:true. You can search by the BuyerUserId field and retrieve it as part of the response.
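As a minimal sketch, field entries of this shape can also be generated programmatically before they are assembled into the JSON that is passed to the API. The helper name field_mapping and the CustomerNote field are hypothetical and used only for illustration; only the type, index, and store attributes come from the example above.

```python
import json


def field_mapping(data_type, searchable=True, returnable=False):
    """Build one field entry from the type, index, and store attributes."""
    return {"type": data_type, "index": searchable, "store": returnable}


mappings = {
    # Searchable and returnable, as in the BuyerUserId example.
    "BuyerUserId": field_mapping("keyword", searchable=True, returnable=True),
    # Hypothetical field: returned in responses but not searchable.
    "CustomerNote": field_mapping("text", searchable=False, returnable=True),
}

print(json.dumps(mappings, indent=2))
```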
The following sample explains the mapping:
{
  "index":{
    "id":"order",
    "mappings":{
      "BillToID":{
        "type":"keyword",
        "index":true,
        "store":true
      },
      "OrderDate":{
        "type":"date",
        "index":true,
        "store":true
      },
      "OrderName":{
        "type":"text",
        "index":true,
        "store":true,
        "analyzer":"whitespace"
      },
      "OriginalTotalAmount":{
        "type":"double",
        "index":true,
        "store":true
      },
      "PersonInfoBillTo":{
        "type":"object",
        "properties":{
          "AddressLine1":{
            "type":"text",
            "index":true
          },
          "City":{
            "type":"keyword",
            "index":true,
            "store":true,
            "fields":{
              "asText":{
                "type":"text",
                "index":true
              }
            }
          },
          "EMailID":{
            "type":"keyword",
            "index":true,
            "store":true
          }
        }
      },
      "OrderLine":{
        "type":"nested",
        "properties":{
          "ShipNode":{
            "type":"keyword",
            "index":true,
            "store":true
          },
          "OrderedQty":{
            "type":"double",
            "index":false,
            "store":true
          },
          "ItemId":{
            "type":"keyword",
            "index":true,
            "store":true
          }
        }
      }
    }
  }
}
In this sample, most of the fields are mapped with index:true and store:true. This indicates that you can search such fields and also retrieve them in the search result. The field City is defined with a keyword type but duplicated to City.asText, which is defined as text type. This helps to perform an exact match of the city name or a case-insensitive match of any part of a city name. The OrderLine.OrderedQty field is mapped as index:false and store:true. This means that you do not want to search orders based on the order line level quantity, although the quantity can still be returned in the search response.
- orderId : { orderNo, documentType, enterpriseCode, id } - All the sub-fields of orderId are searchable and returnable. These fields help you identify an order document uniquely.
- isHistory - A returnable field.
To map a field as non-searchable, set index:false in the mapping specification. Similarly, you can make a field returnable by setting store:true.
By default, the fields with text type are not sortable. Overriding this impacts performance and is therefore not recommended. Instead, use the keyword data type.
The fields with other data types are sortable by default. They can be made non-sortable by setting doc_values:false.
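The rules above, together with the defaults in the earlier table, can be summarized in a small Python sketch. The function characteristics is a hypothetical helper written for this illustration, not part of Order Search or Elasticsearch, and it covers only simple field types (not object or nested fields).

```python
def characteristics(field):
    """Derive the characteristics of a simple field from its mapping entry."""
    data_type = field["type"]
    return {
        # Fields are searchable unless index is set to false.
        "searchable": field.get("index", True),
        # Only text fields are analyzed.
        "analyzed": data_type == "text",
        # Fields are returnable only when store is set to true.
        "returnable": field.get("store", False),
        # Text fields are not sortable; others are, unless doc_values is false.
        "sortable": data_type != "text" and field.get("doc_values", True),
    }


order_name = characteristics({"type": "text", "index": True, "store": True})
ordered_qty = characteristics({"type": "double", "index": False, "store": True})
unsortable_id = characteristics({"type": "keyword", "doc_values": False})

print(order_name)
print(ordered_qty)
print(unsortable_id)
```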
For more information about the various field data types that are available in Elasticsearch, see the Elasticsearch documentation.
Settings
To customize the index behavior, Order Search supports several configurable settings that are provided by Elasticsearch. The settings are classified as dynamic and static based on whether they can be modified after the index is created. The static settings can be set only at index creation time and cannot be modified after that. The dynamic settings can be modified on a live index as well.
For the list of settings provided by Elasticsearch, see the Elasticsearch documentation.
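To make the static and dynamic distinction concrete, the following Python sketch classifies a requested settings change. The three setting names are standard Elasticsearch index settings; the function name and the idea of checking a change before submitting it are assumptions made for illustration, and the way Order Search accepts settings is not shown here.

```python
# Static in Elasticsearch: can be set only when the index is created.
STATIC_SETTINGS = {"index.number_of_shards"}
# Dynamic in Elasticsearch: can be changed on a live index.
DYNAMIC_SETTINGS = {"index.number_of_replicas", "index.refresh_interval"}


def classify_update(requested):
    """Sort a requested settings change into allowed, rejected, and unknown."""
    result = {"allowed": {}, "rejected": {}, "unknown": {}}
    for name, value in requested.items():
        if name in DYNAMIC_SETTINGS:
            result["allowed"][name] = value
        elif name in STATIC_SETTINGS:
            result["rejected"][name] = value
        else:
            # Not covered by this sketch; check the Elasticsearch documentation.
            result["unknown"][name] = value
    return result


update = classify_update({
    "index.number_of_replicas": 2,
    "index.number_of_shards": 5,
})
print(update)
```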
Best practices
The field mappings and the static settings of an index cannot be modified once created. Hence, it is recommended to assess the search requirements and identify the data type of the order outline data available from Sterling Order Management System Software while mapping the index fields. Before creating the index, you must design the mappings and settings with these considerations.
The common mistakes are around string and object handling. As mentioned, string data has two variants, text and keyword. You must select the right mapping based on your search requirements. The order-related data from the subordinate entities in Sterling Order Management System Software appears in the order outline document as JSON objects. Some of these entities have a one-to-one relationship with the order, whereas others have a one-to-many relationship. Elasticsearch supports two variants for JSON objects: object and nested. Use the object type for entities that have a one-to-one relationship, such as PersonInfoBillTo provided in the sample mapping. But for orderLine, the type must be ‘nested’.
A mapping specification is irrevocable, so an incorrect mapping cannot be rectified once it is applied. You must contact your Elasticsearch administrator to delete the existing index, repeat the process of mapping and index creation in Order Search, and then migrate your order data again from Sterling Order Management System Software. For history orders in Sterling Order Management System Software, however, it might not be possible to run the migration process again.
Adding fields to an existing index mapping is not a problem because Elasticsearch allows it. You can plan the additions required to meet your search or business requirements and make changes to the index by using the updateSearchIndex API. All new orders, and the existing orders in the index that are subsequently modified, are indexed with the new fields. If you also need the new fields for the old orders, run the migration process from Sterling Order Management System Software again.