Configuring query body for Elasticsearch

Use a query body for complex queries that combine multiple search criteria. The query clause determines whether a document matches the search, which helps return more precise results.

The integration with Elasticsearch uses keyword search, but you can configure the query body as an advanced Elasticsearch setting in your agents to enable more advanced search techniques, such as:

  • Semantic search with ELSER.
  • k-nearest neighbor (kNN) dense vector search.
  • Nested query to search nested documents.
  • Hybrid search.
  • Search on a semantic text field.

Semantic search with ELSER

Use semantic search with ELSER to enhance search accuracy and relevance by understanding the context and intent behind user queries. For more information, see Semantic search with ELSER in the Elasticsearch documentation.

The following code snippet shows an example of semantic search with ELSER. The example uses a Boolean query, which matches documents that satisfy Boolean combinations of other queries.

{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "$QUERY"
            }
          }
        }
      ],
      "filter": "$FILTER"
    }
  }
}

Where:

  • ml.tokens

    It refers to the field that stores the ELSER tokens. You might need to update it if you use a different field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.

  • .elser_model_2_linux-x86_64

    It is the model ID for the optimized version of ELSER v2. Use it if it is available in your Elasticsearch deployment. Otherwise, use .elser_model_2 for the regular ELSER v2 model, or .elser_model_1 for ELSER v1.

  • $QUERY

    It is the variable for accessing the user query. It ensures that the user query is passed to the query body.

  • $FILTER

    It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
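
As an illustration of the two variables, the following snippet sketches what the query body might look like after they are resolved. It assumes that the user asks "how do I reset my password" and that the custom filter is an array with a single term clause on a hypothetical product field. Neither value comes from the integration itself, and the exact substitution that the integration performs might differ.

```json
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "how do I reset my password"
            }
          }
        }
      ],
      "filter": [
        {
          "term": {
            "product": "router"
          }
        }
      ]
    }
  }
}
```

Keeping $QUERY and $FILTER in the configured query body, rather than hard-coding values like these, lets the same body serve every user query and any change to the custom filters.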

Using a nested query to search over nested documents with ELSER

The nested query wraps another query to search nested fields within your Elasticsearch index when it contains nested documents. If a match is found, the query returns the root parent document along with the matching nested documents. When applying filters to the search results, you can choose to filter either the parent documents or the nested documents.

Querying outer documents

The following code snippet shows an example for querying the outer documents within an Elasticsearch index.
In the example, the outer document is the main document that contains the passages nested documents. The nested query searches within the passages nested documents.
If a match is found within the nested documents, the query returns the outer document along with the matching nested documents.

{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "passages",
            "query": {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            },
            "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "_source": false
}

Querying inner documents

The following code snippet shows an example for querying the inner documents within nested documents.
The example runs the text expansion query against the nested documents, applies the filters to those nested documents, and returns the matching nested documents with the ELSER tokens excluded from their source.

{
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "bool": {
          "should": [
            {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            }
          ],
          "filter": "$FILTER"
        }
      },
      "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
    }
  },
  "_source": false
}

Where:

  • passages

    It is the nested field that stores inner documents within a parent document. You might need to update it if you use a different nested field in your index.

  • passages.sparse.tokens

    It refers to the field that stores the ELSER tokens or raw text for the inner documents. You might need to update it if you use a different nested field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.

  • "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}

    It excludes the ELSER tokens from the inner documents in the search results.

  • "_source": false

    It excludes all the top-level fields in the search results because only the inner documents in the search results are used.

  • $QUERY

    It is the variable for accessing the user query. It ensures that the user query is passed to the query body.

  • $FILTER

    It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
    If the filters are applied to the outer documents, only outer fields can be used in the filters. If the filters are applied to the inner documents, only inner fields can be used in the filters.
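
To illustrate the difference between filtering outer and inner documents, the following custom filter values are a sketch only. They assume that the custom filter is supplied as an array of query clauses, and they use a hypothetical outer field named category and a hypothetical inner field named passages.language. Replace both with fields from your own index. A filter for the outer documents refers to a top-level field:

```json
[
  { "term": { "category": "manual" } }
]
```

A filter for the inner documents refers to a field inside the nested path, written with its full path:

```json
[
  { "term": { "passages.language": "en" } }
]
```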

Using a nested query to search on a semantic text field

Use the following structure for querying nested documents on a semantic text field. This structure ensures that the search works properly with the Elasticsearch integration.

{
  "query": {
    "nested": {
      "path": "semtext.inference.chunks",
      "query": {
        "sparse_vector": {
          "field": "semtext.inference.chunks.embeddings",
          "inference_id": ".elser_model_2_linux-x86_64",
          "query": "$QUERY"
        }
      },
      "inner_hits": {"_source": {"excludes": ["semtext.inference.chunks.embeddings"]}}
    }
  },
  "_source": false
}

Where:

  • semtext

    It is the name of the semantic text field. You might need to update it if your semantic text field has a different name. For more information, see Semantic text field type in the Elasticsearch documentation.

  • semtext.inference.chunks

    It refers to the field that stores the chunked texts and embeddings.

  • sparse_vector

    It specifies the query type. The sparse_vector query is similar to the text_expansion query, but newer.

  • semtext.inference.chunks.embeddings

    It refers to the field that stores the embeddings for the chunked texts.
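
For context, a semantic text field is typically declared in the index mapping with the semantic_text type. The following mapping is a minimal sketch, not the setup that the integration requires. The inference endpoint ID my-elser-endpoint is hypothetical, and whether the inference.chunks subfields are exposed in the form used by the preceding query depends on your Elasticsearch version.

```json
{
  "mappings": {
    "properties": {
      "semtext": {
        "type": "semantic_text",
        "inference_id": "my-elser-endpoint"
      }
    }
  }
}
```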