Extended features of the TERMS aggregator

You can extend the capability of the TERMS aggregator to refine your search results.

resultSize and fetchSize

When order documents are saved to an index that is configured with multiple shards, the documents are distributed across those shards. When a TERMS aggregation runs on such an index, each shard provides its own view of what the ordered list of terms should be. The results from each shard are combined into the final view, and only the top 10 results (the default size for aggregation) from the combined results are returned. Therefore, more aggregation results are often available when the final view is calculated than are returned under the default size limit. To fetch more of the available results, pass a higher resultSize in a subsequent aggregation query. In addition, the document count (and the results of any subaggregation) in the TERMS aggregation is sometimes not accurate. In that case, pass a higher fetchSize in a subsequent aggregation query.

Furthermore, each shard view is restricted by the shard-size factor. Therefore, when the final view is calculated, it includes only as many results from each shard as the shard-size factor allows, and not all available results. More results are likely to be available on each shard than are returned for the calculation of the final view. This might result in a slightly incorrect document count for the TERMS aggregation. To refine the aggregation query results and improve the accuracy of the document count, pass a higher fetchSize in a subsequent aggregation query.

Refer to the following sample query.

{
        "query": {
          "match": [
            {
              "condition": "MUST",
              "field": "OriginalTotalAmount",
              "value":"1",
              "operator":"gt"
            }
          ]
        },
        "aggregate":[
          {
            "field":"CustomerContactID",
            "name":"contact",
            "type":"TERMS",
            "resultSize": 12,
            "fetchSize":14
          }
        ]
      }
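
The effect of the shard-level limit on document counts can be illustrated with a small simulation. The following Python sketch is not the product's implementation. The function name terms_aggregate, the shard contents, and the way fetch_size is applied per shard are assumptions made only to show how truncated shard views can lead to undercounted buckets.

```python
from collections import Counter


def terms_aggregate(shards, result_size=10, fetch_size=None):
    """Merge per-shard term counts into a final view.

    shards      : list of dicts mapping term -> document count on that shard
    result_size : number of buckets returned in the final view
    fetch_size  : number of top buckets each shard contributes
                  (hypothetical stand-in for the shard-size factor);
                  None means every bucket on the shard is contributed
    """
    merged = Counter()
    for shard in shards:
        # Each shard ranks its own terms by count (ties broken by term).
        ranked = sorted(shard.items(), key=lambda kv: (-kv[1], kv[0]))
        # Only the top fetch_size buckets reach the final calculation.
        for term, count in ranked[:fetch_size]:
            merged[term] += count
    # The final view is ranked again and cut to result_size buckets.
    final = sorted(merged.items(), key=lambda kv: (-kv[1], kv[0]))
    return final[:result_size]


# Hypothetical data: three customer contacts spread over two shards.
shards = [
    {"C1": 5, "C2": 4, "C3": 3},
    {"C3": 6, "C2": 2, "C1": 1},
]

low = terms_aggregate(shards, result_size=2, fetch_size=1)
high = terms_aggregate(shards, result_size=2, fetch_size=3)

print(low)   # counts are too low because each shard sent one bucket only
print(high)  # counts are exact because each shard sent all its buckets
```

In this sketch, the low fetch size reports C3 with 6 documents and C1 with 5, while the true totals are 9 and 6. Raising the fetch size lets each shard contribute enough buckets for the merged counts to be exact.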

Filter TERMS bucket

You can filter the values for which buckets are created in the TERMS aggregator results. To do so, use the include and exclude parameters.

Include

You can use the include parameter to provide an array of exact values so that only the buckets that match any of those values are included in the TERMS aggregator results.

For example, the following query fetches all the orders that are shipped from Node2, that is, orders whose OrderLine.ShipNode is Node2, and displays the count of only those orders that have OriginalTotalAmount set to either 100.48 or 30.
{
    "query": {
      "match": [
        {
          "condition": "MUST",
          "field": "OrderLine.ShipNode",
          "value":"Node2",
          "operator":"eq"
        }
      ]
    },
    "aggregate":[
      {
        "field":"OriginalTotalAmount",
        "name":"OrderPrice",
        "include":[100.48,30]
      }
    ]
  }

Refer to the following response.

"aggregation": [
    {
      "field": "OriginalTotalAmount",
      "type": "TERMS",
      "name": "OrderPrice",
      "result": [
        {
          "value": "100.48",
          "count": 1
        }
      ]
    }
  ]
Exclude

When you provide the exclude parameter with an array of exact values, the buckets that match any of those values are excluded from the TERMS aggregator results.

For example, the following query fetches all the orders that are shipped from Node2, that is, orders whose OrderLine.ShipNode is Node2, and displays the count of only those orders that do not have OriginalTotalAmount set to either 100.48 or 30.
{
    "query": {
      "match": [
        {
          "condition": "MUST",
          "field": "OrderLine.ShipNode",
          "value":"Node2",
          "operator":"eq"
        }
      ]
    },
    "aggregate":[
      {
        "field":"OriginalTotalAmount",
        "name":"OrderPrice",
        "exclude":[100.48,30]
      }
    ]
  }
Refer to the following response.
 "aggregation": [
    {
      "field": "OriginalTotalAmount",
      "type": "TERMS",
      "name": "OrderPrice",
      "result": [
        {
          "value": "400.0",
          "count": 1
        }
      ]
    }
  ]
Important: You can use the include and exclude parameters together. In such a case, the exclude parameter takes precedence.
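
The filtering rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the aggregator's actual code: the function name filter_buckets and the (value, count) bucket representation are hypothetical, and the bucket data mirrors the two sample responses in this section.

```python
def filter_buckets(buckets, include=None, exclude=None):
    """Keep only the buckets allowed by the include and exclude lists.

    buckets : list of (value, count) pairs
    include : if given, only buckets with these values are kept
    exclude : if given, buckets with these values are dropped,
              even when the same value also appears in include
    """
    kept = []
    for value, count in buckets:
        if include is not None and value not in include:
            continue  # not in the include list
        if exclude is not None and value in exclude:
            continue  # exclude takes precedence over include
        kept.append((value, count))
    return kept


# Hypothetical buckets for the orders shipped from Node2.
buckets = [(100.48, 1), (400.0, 1)]

only_included = filter_buckets(buckets, include=[100.48, 30])
without_excluded = filter_buckets(buckets, exclude=[100.48, 30])
both = filter_buckets(buckets, include=[100.48, 400.0], exclude=[100.48])

print(only_included)
print(without_excluded)
print(both)
```

The last call passes 100.48 in both lists. Because the exclude check also runs for included values, that bucket is dropped, which is the precedence described in the note.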

Sorting TERMS aggregator

You can customize the order of the TERMS aggregator results by setting the sort parameter. You can do this in three different ways.

Arranging the results in the ascending order
By default, the TERMS aggregator displays the results in descending order of the document count. To arrange the results in ascending order of the document count, set the sort field to _count and the sort type to ASC, as shown in the following sample query.
{
        "query": {
          "match": [
            {
              "condition": "MUST",
              "field": "OriginalTotalAmount",
              "value":"10",
              "operator":"gt"
            }
          ]
        },
        "aggregate":[
          {
            "field":"OrderLine.PersonInfoShipTo.City",
            "type":"TERMS",
            "name":"ShipToCity",
            "sort": {
              "field": "_count",
              "type": "ASC"
            }
          }
        ]
      }
Refer to the following response.
"aggregation": [
        {
          "field": "OrderLine.PersonInfoShipTo.City",
          "type": "TERMS",
          "name": "ShipToCity",
          "result": [
            {
              "value": "London",
              "count": 1
            },
            {
              "value": "Mumbai",
              "count": 1
            },
            {
              "value": "Pune",
              "count": 1
            },
            {
              "value": "Littleton",
              "count": 3
            }
          ]
        }
      ]
CAUTION:
To order the results by document count, set the sort field to _count. Otherwise, the query results in an error.
Arranging the results alphabetically by their value in the ascending order
Refer to the following sample query to fetch the results in the ascending order of value in the value-count bucket.
{
         "query": {
           "match": [
             {
               "condition": "MUST",
               "field": "OriginalTotalAmount",
               "value":"10",
               "operator":"gt"
             }
           ]
         },
         "aggregate":[
            {
             "field":"OrderLine.PersonInfoShipTo.City",
             "type":"TERMS",
             "name":"ShipToCity",
              "sort": {
                "field": "_key",
                "type": "ASC"
              }
           }
         ]
        }
Refer to the following response.
"aggregation": [
            {
              "field": "OrderLine.PersonInfoShipTo.City",
              "type": "TERMS",
              "name": "ShipToCity",
              "result": [
                {
                  "value": "Littleton",
                  "count": 3
                },
                {
                  "value": "London",
                  "count": 1
                },
                {
                  "value": "Mumbai",
                  "count": 1
                },
                {
                  "value": "Pune",
                  "count": 1
                }
              ]
            }
          ]
CAUTION:
To order the results by value, set the sort field to _key. Otherwise, the query results in an error.
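
The two built-in sort fields can be sketched in Python as follows. The function name sort_buckets and the input bucket list are assumptions for illustration only. How the aggregator itself breaks ties between buckets with equal counts is not stated here, so this sketch simply keeps the input order for ties.

```python
def sort_buckets(buckets, field, order="DESC"):
    """Sort result buckets by document count (_count) or by value (_key).

    buckets : list of dicts with "value" and "count" entries
    field   : "_count" or "_key"; any other name is rejected
    order   : "ASC" or "DESC"
    """
    if field == "_count":
        key = lambda bucket: bucket["count"]
    elif field == "_key":
        key = lambda bucket: bucket["value"]
    else:
        # Mirrors the caution above: other field names are an error here.
        raise ValueError("sort field must be _count or _key")
    if order not in ("ASC", "DESC"):
        raise ValueError("sort type must be ASC or DESC")
    # sorted() is stable, so buckets with equal keys keep their input order.
    return sorted(buckets, key=key, reverse=(order == "DESC"))


# Bucket data taken from the sample responses in this section.
buckets = [
    {"value": "Littleton", "count": 3},
    {"value": "London", "count": 1},
    {"value": "Mumbai", "count": 1},
    {"value": "Pune", "count": 1},
]

by_count = [b["value"] for b in sort_buckets(buckets, "_count", "ASC")]
by_key = [b["value"] for b in sort_buckets(buckets, "_key", "ASC")]

print(by_count)
print(by_key)
```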
Arranging the results by subaggregation
Restriction: When you order the TERMS aggregator results by subaggregation, the subaggregation must belong to the Metric category.
The following query sorts the TERMS buckets by the result of the subaggregation that is defined within the aggregation. Because the subaggregation is MAX on OriginalTotalAmount and the sort type is DESC, the buckets whose orders have the highest OriginalTotalAmount rank higher and, consequently, are displayed first.
{
        "query": {
          "match": [
            {
              "condition": "MUST",
              "field": "OriginalTotalAmount",
              "value":"10",
              "operator":"gt"
            }
          ]
        },
        "aggregate":[
          {
            "field":"OrderLine.PersonInfoShipTo.City",
            "type":"TERMS",
            "name":"ShipToCity",
            "sort": {
              "field": "OriginalTotalAmount",
              "type": "DESC"
            },
            "aggregate": {
              "field": "OriginalTotalAmount",
              "type": "MAX"
            }
          }
        ]
      }
Refer to the following response.
"aggregation": [
        {
          "field": "OrderLine.PersonInfoShipTo.City",
          "type": "TERMS",
          "name": "ShipToCity",
          "result": [
            {
              "value": "Pune",
              "count": 1,
              "aggregation": {
                "field": "OriginalTotalAmount",
                "numAgg": 500
              }
            },
            {
              "value": "Mumbai",
              "count": 1,
              "aggregation": {
                "field": "OriginalTotalAmount",
                "numAgg": 400
              }
            },
            {
              "value": "Littleton",
              "count": 3,
              "aggregation": {
                "field": "OriginalTotalAmount",
                "numAgg": 100.48
              }
            },
            {
              "value": "London",
              "count": 1,
              "aggregation": {
                "field": "OriginalTotalAmount",
                "numAgg": 100.48
              }
            }
          ]
        }
      ]
Note: Ensure that the sort field name matches the name of the field for which subaggregation is applied.
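
The following Python sketch shows one way to check the sort field against the subaggregation field before ordering buckets by the metric result. The function name sort_by_subaggregation is hypothetical, and raising an error on a mismatch is an assumption, because the behavior for a mismatched field name is not described here. The aggregate definition and bucket values are taken from the sample query and response above.

```python
def sort_by_subaggregation(aggregate, buckets):
    """Order buckets by the numAgg value of their subaggregation.

    aggregate : one entry of the query's "aggregate" array, with "sort"
                and a nested "aggregate" (the subaggregation)
    buckets   : result buckets, each carrying "aggregation": {"numAgg": ...}
    """
    sort = aggregate["sort"]
    sub = aggregate["aggregate"]
    # The sort field name must match the subaggregation field name.
    if sort["field"] != sub["field"]:
        raise ValueError("sort field does not match the subaggregation field")
    descending = sort["type"] == "DESC"
    # sorted() is stable, so buckets with equal metrics keep their input order.
    return sorted(
        buckets,
        key=lambda bucket: bucket["aggregation"]["numAgg"],
        reverse=descending,
    )


aggregate = {
    "field": "OrderLine.PersonInfoShipTo.City",
    "type": "TERMS",
    "name": "ShipToCity",
    "sort": {"field": "OriginalTotalAmount", "type": "DESC"},
    "aggregate": {"field": "OriginalTotalAmount", "type": "MAX"},
}

buckets = [
    {"value": "Littleton", "count": 3, "aggregation": {"numAgg": 100.48}},
    {"value": "London", "count": 1, "aggregation": {"numAgg": 100.48}},
    {"value": "Mumbai", "count": 1, "aggregation": {"numAgg": 400}},
    {"value": "Pune", "count": 1, "aggregation": {"numAgg": 500}},
]

ordered = [b["value"] for b in sort_by_subaggregation(aggregate, buckets)]
print(ordered)
```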