Configuring query body for Elasticsearch
Use a query body for complex queries where you need to combine multiple search criteria. Query clauses determine whether a document matches a search, which helps return more precise results.
The integration with Elasticsearch uses keyword search, but you can configure the query body as an advanced Elasticsearch setting in your agents to enable more advanced search techniques, such as:
- Semantic search with ELSER.
- k-nearest neighbor (kNN) dense vector search.
- Nested query to search nested documents.
- Hybrid search.
- Search on a semantic text field.
Semantic search with ELSER
Use semantic search with ELSER to enhance search accuracy and relevance by understanding the context and intent behind user queries. For more information, see Semantic search with ELSER in the Elasticsearch documentation.
The following code snippet shows an example of semantic search with ELSER. The example uses a Boolean query, which matches documents that satisfy Boolean combinations of other queries.
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "$QUERY"
            }
          }
        }
      ],
      "filter": "$FILTER"
    }
  }
}
Where:
- ml.tokens
  The field that stores the ELSER tokens. You might need to update it if you use a different field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
- .elser_model_2_linux-x86_64
  The model ID for the optimized version of ELSER v2. Use it if it is available in your Elasticsearch deployment. Otherwise, use .elser_model_2 for the regular ELSER v2 model, or .elser_model_1 for ELSER v1.
- $QUERY
  The variable for accessing the user query. It ensures that the user query is passed to the query body.
- $FILTER
  The variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
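To make the role of $QUERY and $FILTER more concrete, the following Python sketch shows one way such placeholders could be resolved before a request is sent to Elasticsearch. The render_query_body helper, the sample user query, and the sample filter are hypothetical and exist only for illustration. The integration performs its own substitution, and its exact behavior might differ.

```python
import json


def render_query_body(template, user_query, filters):
    """Return a copy of `template` with $QUERY and $FILTER resolved.

    Hypothetical helper for illustration; not the integration's actual code.
    """
    if isinstance(template, dict):
        return {key: render_query_body(value, user_query, filters)
                for key, value in template.items()}
    if isinstance(template, list):
        return [render_query_body(item, user_query, filters) for item in template]
    if template == "$FILTER":
        # The whole placeholder string is replaced by the filter clauses.
        return filters
    if isinstance(template, str):
        return template.replace("$QUERY", user_query)
    return template


template = {
    "query": {
        "bool": {
            "should": [
                {
                    "text_expansion": {
                        "ml.tokens": {
                            "model_id": ".elser_model_2_linux-x86_64",
                            "model_text": "$QUERY",
                        }
                    }
                }
            ],
            "filter": "$FILTER",
        }
    }
}

# Assumed example values: a user question and one custom term filter.
sample_filters = [{"term": {"language": "en"}}]
rendered = render_query_body(template, "How do I reset my password?", sample_filters)
print(json.dumps(rendered, indent=2))
```

The sketch builds a new structure rather than editing the template in place, so the same template can be reused for every user query.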
k-nearest neighbor (kNN) dense vector search
Use kNN search to efficiently find similar items, such as semantically similar text, based on vector embeddings. For more information, see kNN search in the Elasticsearch documentation.
The following code snippet shows an example of kNN dense vector search:
{
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "intfloat__multilingual-e5-small",
        "model_text": "$QUERY"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "filter": "$FILTER"
  }
}
Where:
- text_embedding.predicted_value
  The field that stores the dense vectors. You might need to update it if you use a different field in your index. For more information, see Setting text embeddings for dense vector search in Elasticsearch.
- text_embedding under query_vector_builder
  The natural language processing task to run. It must be text_embedding for kNN search.
- intfloat__multilingual-e5-small
  The embedding model ID. You might need to update it if you want to use a different embedding model.
- $QUERY
  The variable for accessing the user query. It ensures that the user query is passed to the query body.
- $FILTER
  The variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
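As a rough illustration of what a kNN search computes, the following Python sketch ranks a few toy vectors by cosine similarity to a query vector and keeps the top k. It is an exact, brute-force version for intuition only. Elasticsearch uses an approximate search that considers num_candidates vectors per shard, and the similarity metric depends on the field mapping. The document IDs, the two-dimensional vectors, and the choice of cosine similarity are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query_vector, documents, k):
    """Return the k (doc_id, score) pairs most similar to query_vector."""
    scored = [(doc_id, cosine_similarity(query_vector, vector))
              for doc_id, vector in documents.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; real embedding models produce hundreds of dimensions.
documents = {
    "doc-a": [1.0, 0.0],
    "doc-b": [0.8, 0.6],
    "doc-c": [0.0, 1.0],
}

top = knn_search([1.0, 0.0], documents, k=2)
for doc_id, score in top:
    print(doc_id, round(score, 3))
```

In the query body, k plays the same role as the k argument here, while num_candidates trades accuracy for speed in the approximate search and has no counterpart in this exact version.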
Using a nested query to search over nested documents with ELSER
A nested query wraps another query to search nested fields when your Elasticsearch index contains nested documents. If a match is found, the query returns the root parent document along with the matching nested documents. When you apply filters to the search results, you can filter either the parent documents or the nested documents.
Querying outer documents
The following code snippet shows an example of querying the outer documents within an Elasticsearch index.
In the example, the outer document is the main document that contains the nested passages documents. The nested query searches within those nested documents.
If a match is found within the nested documents, the query returns the outer document along with the matching nested documents.
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "passages",
            "query": {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            },
            "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "_source": false
}
Querying inner documents
The following code snippet shows an example of querying the inner documents within nested documents.
The example runs a text expansion query within the nested documents, applies the filters to the nested documents, and returns the parent documents with the matching nested documents, excluding certain fields from the source.
{
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "bool": {
          "should": [
            {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            }
          ],
          "filter": "$FILTER"
        }
      },
      "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
    }
  },
  "_source": false
}
Where:
- passages
  The nested field that stores inner documents within a parent document. You might need to update it if you use a different nested field in your index.
- passages.sparse.tokens
  The field that stores the ELSER tokens or raw text for the inner documents. You might need to update it if you use a different nested field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
- "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
  Excludes the ELSER tokens from the inner documents in the search results.
- "_source": false
  Excludes all the top-level fields from the search results because only the inner documents in the search results are used.
- $QUERY
  The variable for accessing the user query. It ensures that the user query is passed to the query body.
- $FILTER
  The variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
  If the filters are applied to the outer documents, only outer fields are available to use in the filters. If the filters are applied to the inner documents, only inner fields are available to use in the filters.
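The only structural difference between the two nested examples is where the $FILTER placeholder sits. The following Python sketch builds both variants from one function so that the difference is easy to compare. The build_nested_query function and its filter_on parameter are hypothetical names introduced for illustration. The field names and model ID are the ones used in the examples above.

```python
import json


def build_nested_query(filter_on, path="passages",
                       tokens_field="passages.sparse.tokens",
                       model_id=".elser_model_2_linux-x86_64"):
    """Build a nested ELSER query body with $FILTER on outer or inner documents.

    Hypothetical helper; it only reproduces the two structures shown above.
    """
    expansion = {
        "text_expansion": {
            tokens_field: {"model_id": model_id, "model_text": "$QUERY"}
        }
    }
    inner_hits = {"_source": {"excludes": [path + ".sparse"]}}

    if filter_on == "outer":
        # The filter sits beside the nested clause, so it sees outer fields only.
        query = {
            "bool": {
                "must": [
                    {"nested": {"path": path, "query": expansion,
                                "inner_hits": inner_hits}}
                ],
                "filter": "$FILTER",
            }
        }
    elif filter_on == "inner":
        # The filter sits inside the nested clause, so it sees inner fields only.
        query = {
            "nested": {
                "path": path,
                "query": {"bool": {"should": [expansion], "filter": "$FILTER"}},
                "inner_hits": inner_hits,
            }
        }
    else:
        raise ValueError("filter_on must be 'outer' or 'inner'")

    return {"query": query, "_source": False}


outer_body = build_nested_query("outer")
inner_body = build_nested_query("inner")
print(json.dumps(inner_body, indent=2))
```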
Hybrid search with combined keyword search and dense vector search
The following code snippet shows an example of a hybrid search query on an Elasticsearch index. The query combines traditional keyword search with k-nearest neighbor (kNN) search and ranks the combined results:
{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": "$QUERY",
            "fields": ["$BODY_FIELD_NAME", "$TITLE_FIELD_NAME"]
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "intfloat__multilingual-e5-small",
        "model_text": "$QUERY"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "filter": "$FILTER"
  },
  "rank": {
    "rrf": {}
  },
  "size": 10,
  "_source": {"excludes": ["text_embedding.predicted_value"]}
}
Where:
- text_embedding.predicted_value
  The field that stores the dense vectors. You might need to update it if you use a different field in your index.
- text_embedding under query_vector_builder
  The natural language processing task to perform. It must be text_embedding for kNN search.
- intfloat__multilingual-e5-small
  The embedding model ID. You might need to update it if you want to use a different embedding model.
- $QUERY
  The variable for accessing the user query. It ensures that the user query is passed to the query body.
- $BODY_FIELD_NAME and $TITLE_FIELD_NAME
  The variables for accessing the Body field and Title field that are configured in the Search integration.
- $FILTER
  The variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
- rank.rrf
  The reciprocal rank fusion (RRF) method, which combines the search results from keyword search and dense vector search.
- "_source": {"excludes": ["text_embedding.predicted_value"]}
  Excludes the unnecessary dense vector field from the search results.
Using a nested query to search on a semantic text field
Use the following structure for querying nested documents on a semantic text field. This structure ensures that the search works properly with the Elasticsearch integration.
{
  "query": {
    "nested": {
      "path": "semtext.inference.chunks",
      "query": {
        "sparse_vector": {
          "field": "semtext.inference.chunks.embeddings",
          "inference_id": ".elser_model_2_linux-x86_64",
          "query": "$QUERY"
        }
      },
      "inner_hits": {"_source": {"excludes": ["semtext.inference.chunks.embeddings"]}}
    }
  },
  "_source": false
}
Where:
- semtext
  The name of the semantic field. You might need to update it if your semantic field has a different name. For more information, see Semantic text field type in the Elasticsearch documentation.
- semtext.inference.chunks
  The field that stores the chunked texts and embeddings.
- sparse_vector
  The type of the query, in this case a sparse_vector query. It is similar to the text_expansion query, but newer.
- semtext.inference.chunks.embeddings
  The field that stores the embeddings for the chunked texts.
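If your semantic field is not named semtext, the field name must change in three places in the query body. The following Python sketch derives all three from a single name to keep them consistent. The semantic_text_query_body function and the body_semantic field name are hypothetical. The structure follows the example above and assumes that the index stores chunks under the inference.chunks path, as that example does.

```python
import json


def semantic_text_query_body(field_name, inference_id=".elser_model_2_linux-x86_64"):
    """Build the nested sparse_vector query body for a semantic text field.

    Hypothetical helper; it mirrors the structure of the example above.
    """
    chunks_path = f"{field_name}.inference.chunks"
    embeddings_field = f"{chunks_path}.embeddings"
    return {
        "query": {
            "nested": {
                "path": chunks_path,
                "query": {
                    "sparse_vector": {
                        "field": embeddings_field,
                        "inference_id": inference_id,
                        "query": "$QUERY",
                    }
                },
                # Keep the large embeddings out of the returned chunks.
                "inner_hits": {"_source": {"excludes": [embeddings_field]}},
            }
        },
        "_source": False,
    }


body = semantic_text_query_body("body_semantic")
print(json.dumps(body, indent=2))
```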