Configuring query body for Elasticsearch

Use a query body for complex queries that combine multiple search criteria. The query clauses determine whether a document matches a search, which helps you return more precise results.

The integration with Elasticsearch uses keyword search, but you can configure the query body as an advanced Elasticsearch setting in your agents to enable more advanced search techniques, such as:

Semantic search with ELSER.
k-nearest neighbor (kNN) dense vector search.
Nested query to search nested documents.
Hybrid search.
Search on a semantic text field.

Semantic search with ELSER

Use semantic search with ELSER to enhance search accuracy and relevance by understanding the context and intent behind user queries. For more information, see Semantic search with ELSER in the Elasticsearch documentation.

The following code snippet shows an example of semantic search with ELSER. The example uses a Boolean (bool) query, which matches documents based on Boolean combinations of other queries.

{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "$QUERY"
            }
          }
        }
      ],
      "filter": "$FILTER"
    }
  }
}

Where:

ml.tokens

It refers to the field that stores the ELSER tokens. You might need to update it if you use a different field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
.elser_model_2_linux-x86_64

It is the model ID for the optimized version of ELSER v2. Use it if it is available in your Elasticsearch deployment. Otherwise, use .elser_model_2 for the regular ELSER v2 model, or .elser_model_1 for ELSER v1.
$QUERY

It is the variable for accessing the user query. Include it so that the user query is passed to the query body.
$FILTER

It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. Include it so that the custom filters are applied in the query body.
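
To make the role of the two variables concrete, the following Python sketch mimics the substitution on a copy of the ELSER query body. It is an illustration only: the integration performs the substitution itself, and how it does so is not described here. The function name fill_placeholders, the sample filter, and its product field are hypothetical, and the sketch assumes that the custom filters are a JSON array of filter clauses.

```python
import copy
import json


def fill_placeholders(body, user_query, filters):
    """Return a copy of body with "$QUERY" and "$FILTER" string values replaced."""
    if isinstance(body, dict):
        return {key: fill_placeholders(value, user_query, filters)
                for key, value in body.items()}
    if isinstance(body, list):
        return [fill_placeholders(item, user_query, filters) for item in body]
    if body == "$QUERY":
        return user_query
    if body == "$FILTER":
        # Copy so that later changes to the result do not alter the input filters.
        return copy.deepcopy(filters)
    return body


query_body = {
    "query": {
        "bool": {
            "should": [
                {
                    "text_expansion": {
                        "ml.tokens": {
                            "model_id": ".elser_model_2_linux-x86_64",
                            "model_text": "$QUERY",
                        }
                    }
                }
            ],
            "filter": "$FILTER",
        }
    }
}

# Hypothetical custom filter; the field name "product" is only an example.
custom_filters = [{"term": {"product": "router"}}]

resolved = fill_placeholders(query_body, "how do I reset my password", custom_filters)
print(json.dumps(resolved, indent=2))
```

The original query_body is left unchanged, so the same template can be reused for every user query.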

k-nearest neighbor (kNN) dense vector search

Use kNN search to efficiently find similar items based on vector embeddings, for example in text search. For more information, see kNN search in the Elasticsearch documentation.

The following code snippet shows an example of kNN dense vector search:

{
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "intfloat__multilingual-e5-small",
        "model_text": "$QUERY"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "filter" : "$FILTER"
  }
}

Where:

text_embedding.predicted_value

It refers to the field that stores the dense vectors. You might need to update it if you use a different field in your index. For more information, see Setting text embeddings for dense vector search in Elasticsearch.
text_embedding under query_vector_builder

It is the natural language processing task to run. It must be text_embedding for kNN search.
intfloat__multilingual-e5-small

It is the embedding model ID. You might need to update it if you want to use a different embedding model.
$QUERY

It is the variable for accessing the user query. Include it so that the user query is passed to the query body.
$FILTER

It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. Include it so that the custom filters are applied in the query body.
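
As a rough intuition for what a kNN search returns, the following Python sketch ranks a few toy vectors by similarity to a query vector and keeps the closest k. It is a brute-force illustration, not how Elasticsearch runs the search: Elasticsearch uses an approximate index and considers num_candidates candidates per shard before returning the top k. The cosine metric, the three-dimensional vectors, and the document IDs are assumptions chosen for the example; the actual similarity metric depends on your index mapping.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vector, doc_vectors, k):
    """Score every document against the query and return the k best (id, score) pairs."""
    scored = [(doc_id, cosine_similarity(query_vector, vector))
              for doc_id, vector in doc_vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embedding models produce hundreds of dimensions.
doc_vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

nearest = top_k([1.0, 0.0, 0.0], doc_vectors, k=2)
print([doc_id for doc_id, _ in nearest])  # prints ['doc-a', 'doc-b']
```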

Using a nested query to search over nested documents with ELSER

The nested query wraps another query to search nested fields within your Elasticsearch index when it contains nested documents. If a match is found, the query returns the root parent document along with the matching nested documents. When applying filters to the search results, you can choose to filter either the parent documents or the nested documents.

Querying outer documents

The following code snippet shows an example for querying the outer documents within an Elasticsearch index.
In the example, the outer document is the main document that contains the passages nested documents. The nested query searches within the passages nested documents.
If a match is found within the nested documents, the query returns the outer document along with the matching nested documents.

{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "passages",
            "query": {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            },
            "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "_source": false
}

Querying inner documents

The following code snippet shows an example for querying the inner documents within nested documents.
The example runs a text_expansion query against the nested documents, applies the filters to them, and returns the parent documents together with the matching nested documents, excluding certain fields from the source.

{
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "bool": {
          "should": [
            {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            }
          ],
          "filter": "$FILTER"
        }
      },
      "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
    }
  },
  "_source": false
}

Where:

passages

It is the nested field that stores inner documents within a parent document. You might need to update it if you use a different nested field in your index.
passages.sparse.tokens

It refers to the field that stores the ELSER tokens or raw text for the inner documents. You might need to update it if you use a different nested field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
"inner_hits": {"_source": {"excludes": ["passages.sparse"]}}

It excludes the ELSER tokens from the inner documents in the search results.
"_source": false

It excludes all the top-level fields in the search results because only the inner documents in the search results are used.
$QUERY

It is the variable for accessing the user query. Include it so that the user query is passed to the query body.
$FILTER

It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. Include it so that the custom filters are applied in the query body.
When the filters are applied to the outer documents, only outer fields can be used in the filters. When they are applied to the inner documents, only inner fields can be used in the filters.
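
Because "_source": false removes the top-level fields, the useful content of a response to these nested queries sits under inner_hits. The following Python sketch shows one way to collect the matching inner documents from a response. The sample response is hand-written for illustration and heavily trimmed, the text field inside each passage is hypothetical, and the sketch assumes that the inner hits are keyed by the nested path (passages), which is the Elasticsearch default when no inner_hits name is set.

```python
def extract_passages(response, nested_path="passages"):
    """Collect the matching inner documents from a nested-query search response."""
    passages = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(nested_path, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            passages.append({
                "parent_id": hit.get("_id"),
                "score": inner_hit.get("_score"),
                "source": inner_hit.get("_source", {}),
            })
    return passages


# Trimmed, hand-written sample; a real response carries more metadata.
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "doc-1",
                "_score": 12.3,
                "inner_hits": {
                    "passages": {
                        "hits": {
                            "hits": [
                                {"_score": 12.3, "_source": {"text": "First matching passage."}},
                                {"_score": 9.8, "_source": {"text": "Second matching passage."}},
                            ]
                        }
                    }
                },
            }
        ]
    }
}

for passage in extract_passages(sample_response):
    print(passage["parent_id"], passage["score"], passage["source"]["text"])
```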

Hybrid search with combined keyword search and dense vector search

The following code snippet shows an example of a hybrid search query on an Elasticsearch index. It combines traditional keyword search with k-nearest neighbor (kNN) search and ranks the combined results:

{
    "query": {
        "bool": {
          "should": [
            {
              "query_string": {
                "query": "$QUERY",
                "fields": ["$BODY_FIELD_NAME", "$TITLE_FIELD_NAME"]
              }
            }
          ],
          "filter" : "$FILTER"
        }
    },
    "knn": {
        "field": "text_embedding.predicted_value",
        "query_vector_builder": {
            "text_embedding": {
                "model_id": "intfloat__multilingual-e5-small",
                "model_text": "$QUERY"
            }
        },
        "k": 10,
        "num_candidates": 100,
        "filter" : "$FILTER"
    },
    "rank": {
        "rrf": {}
    },
    "size": 10,
    "_source": {"excludes": ["text_embedding.predicted_value"]}
}

Where:

text_embedding.predicted_value

It refers to the field that stores the dense vectors. You might need to update it if you use a different field in your index.
text_embedding under query_vector_builder

It is the natural language processing task to perform. It must be text_embedding for kNN search.
intfloat__multilingual-e5-small

It is the embedding model ID. You might need to update it if you want to use a different embedding model.
$QUERY

It is the variable for accessing the user query. Include it so that the user query is passed to the query body.
$BODY_FIELD_NAME and $TITLE_FIELD_NAME

They are the variables for accessing the Body field and Title field configured in the Search integration.
$FILTER

It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. Include it so that the custom filters are applied in the query body.
rank.rrf

It is the reciprocal rank fusion (RRF) method, which combines the search results from keyword search and dense vector search.
"_source": {"excludes": ["text_embedding.predicted_value"]}

It excludes the unnecessary dense vector field from the search results.
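
To show how reciprocal rank fusion merges the two result lists, the following Python sketch computes RRF scores by hand. Each document receives 1 / (rank_constant + rank) from every list it appears in, and the contributions are summed. The document IDs and both rankings are made up for the example, and the rank constant of 60 is an assumption based on the commonly documented default; check the Elasticsearch documentation for the value that applies to your version.

```python
def reciprocal_rank_fusion(rankings, rank_constant=60):
    """Fuse several ranked lists of document IDs into one list of (id, score) pairs."""
    scores = {}
    for ranking in rankings:
        # Ranks start at 1 for the best hit in each list.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rankings from the keyword query and the kNN search.
keyword_hits = ["doc-a", "doc-b", "doc-c"]
vector_hits = ["doc-b", "doc-d", "doc-a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print([doc_id for doc_id, _ in fused])  # prints ['doc-b', 'doc-a', 'doc-d', 'doc-c']
```

Documents that rank well in both lists (doc-b and doc-a here) rise above documents that appear in only one list, which is why RRF needs no score normalization between the keyword and vector results.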

Using a nested query to search on a semantic text field

Use the following structure for querying nested documents on a semantic text field. This structure ensures that the search works properly with the Elasticsearch integration.

{
  "query": {
    "nested": {
      "path": "semtext.inference.chunks",
      "query": {
        "sparse_vector": {
          "field": "semtext.inference.chunks.embeddings",
          "inference_id": ".elser_model_2_linux-x86_64",
          "query": "$QUERY"
        }
      },
      "inner_hits": {"_source": {"excludes": ["semtext.inference.chunks.embeddings"]}}
    }
  },
  "_source": false
}

Where:

semtext

It is the name of the semantic text field. You might need to update it if your semantic text field has a different name. For more information, see Semantic text field type in the Elasticsearch documentation.
semtext.inference.chunks

It refers to the field that stores the chunked texts and embeddings.
sparse_vector

It specifies the type of the query, in this case a sparse_vector query. It is similar to the text_expansion query, but newer.
semtext.inference.chunks.embeddings

It refers to the field that stores the embeddings for the chunked texts.
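
If your semantic text field is not named semtext, three places in the query body change together: the nested path, the queried field, and the excluded field. The following Python sketch derives all three from one field name. The helper name build_semantic_text_query and the example field name body_content are hypothetical; the structure of the returned body mirrors the example above.

```python
import json


def build_semantic_text_query(semantic_field="semtext",
                              inference_id=".elser_model_2_linux-x86_64"):
    """Build the nested sparse_vector query body for a given semantic text field."""
    chunks_path = f"{semantic_field}.inference.chunks"
    embeddings_field = f"{chunks_path}.embeddings"
    return {
        "query": {
            "nested": {
                "path": chunks_path,
                "query": {
                    "sparse_vector": {
                        "field": embeddings_field,
                        "inference_id": inference_id,
                        # Left as a placeholder for the integration to fill in.
                        "query": "$QUERY",
                    }
                },
                "inner_hits": {"_source": {"excludes": [embeddings_field]}},
            }
        },
        "_source": False,
    }


# "body_content" is a made-up field name used only for illustration.
body = build_semantic_text_query("body_content")
print(json.dumps(body, indent=2))
```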