Configuring query body for Elasticsearch
Use a query body for complex queries that combine multiple search criteria. Each query clause determines whether a document matches the search, which helps return more precise results.
The integration with Elasticsearch uses keyword search, but you can configure the query body as an advanced Elasticsearch setting in your agents to enable more advanced search techniques, such as:
Semantic search with ELSER.
k-nearest neighbor (kNN) dense vector search.
Nested query to search nested documents.
Hybrid search.
Search on a semantic text field.
Semantic search with ELSER
Use semantic search with ELSER to enhance search accuracy and relevance by understanding the context and intent behind user queries. For more information, see Semantic search with ELSER in the Elasticsearch documentation.
The following code snippet shows an example of semantic search with ELSER. The example uses a Boolean query, which matches documents against Boolean combinations of other queries.
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "$QUERY"
            }
          }
        }
      ],
      "filter": "$FILTER"
    }
  }
}
Where:
ml.tokens
It refers to the field that stores the ELSER tokens. You might need to update it if you use a different field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
.elser_model_2_linux-x86_64
It is the model ID for the optimized version of ELSER v2. Use it if it is available in your Elasticsearch deployment. Otherwise, use .elser_model_2 for the regular ELSER v2 model, or .elser_model_1 for ELSER v1.
$QUERY
It is the variable for accessing the user query. It ensures that the user query is passed to the query body.
$FILTER
It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
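To make the role of the two variables concrete, the following Python sketch mimics the substitution on a parsed query body. It is an illustration only: the integration performs the actual substitution internally, and the helper name render_query_body, the sample user query, the sample filter, and the replacement rules are assumptions made for this example, not the product's implementation.

```python
import json


def render_query_body(template: str, user_query: str, filters: list) -> dict:
    """Hypothetical helper: fill $QUERY and $FILTER in a query body template."""

    def fill(node):
        if isinstance(node, dict):
            return {key: fill(value) for key, value in node.items()}
        if isinstance(node, list):
            return [fill(item) for item in node]
        if node == "$FILTER":
            # The quoted placeholder is replaced by the whole filter structure.
            return filters
        if isinstance(node, str):
            return node.replace("$QUERY", user_query)
        return node

    # json.loads raises a ValueError if the template is not well-formed JSON.
    return fill(json.loads(template))


TEMPLATE = """
{
  "query": {
    "bool": {
      "should": [
        {
          "text_expansion": {
            "ml.tokens": {
              "model_id": ".elser_model_2_linux-x86_64",
              "model_text": "$QUERY"
            }
          }
        }
      ],
      "filter": "$FILTER"
    }
  }
}
"""

body = render_query_body(
    TEMPLATE,
    "how do I reset my password",      # assumed sample user query
    [{"term": {"language": "en"}}],    # assumed sample custom filter
)
print(json.dumps(body, indent=2))
```

Parsing the template first also doubles as a quick check that the query body is valid JSON before you paste it into the advanced settings.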
k-nearest neighbor (kNN) dense vector search
Use kNN search to efficiently find similar items based on vector embeddings, for example in text search. For more information, see kNN search in the Elasticsearch documentation.
The following code snippet shows an example of kNN dense vector search:
{
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "intfloat__multilingual-e5-small",
        "model_text": "$QUERY"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "filter": "$FILTER"
  }
}
Where:
text_embedding.predicted_value
It refers to the field that stores the dense vectors. You might need to update it if you use a different field in your index. For more information, see Setting text embeddings for dense vector search in Elasticsearch.
text_embedding under query_vector_builder
It is the natural language processing task to run. It must be text_embedding for kNN search.
intfloat__multilingual-e5-small
It is the embedding model ID. You might need to update it if you want to use a different embedding model.
$QUERY
It is the variable for accessing the user query. It ensures that the user query is passed to the query body.
$FILTER
It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
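As a rough intuition for what a kNN search returns, the following Python sketch ranks a few toy vectors by similarity to a query vector and keeps the top k. It is a simplification under stated assumptions: Elasticsearch runs an approximate search over num_candidates candidates per shard rather than this exact brute-force scan, the similarity metric depends on your field mapping (cosine similarity is assumed here), and the vectors and document IDs are invented for the example. In the real query, the query vector comes from the embedding model named in model_id.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(query_vector, documents, k):
    """Return the IDs of the k documents most similar to query_vector."""
    scored = [
        (cosine_similarity(query_vector, vector), doc_id)
        for doc_id, vector in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy 3-dimensional embeddings (hypothetical values).
documents = {
    "doc-a": [0.9, 0.1, 0.0],
    "doc-b": [0.0, 1.0, 0.0],
    "doc-c": [0.7, 0.3, 0.1],
}
query_vector = [1.0, 0.0, 0.0]

top_hits = knn_search(query_vector, documents, k=2)
print(top_hits)
```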
Using a nested query to search over nested documents with ELSER
The nested query wraps another query to search nested fields within your Elasticsearch index when it contains nested documents. If a match is found, the query returns the root parent document along with the matching nested documents. When applying filters to the search results, you can choose to filter either the parent documents or the nested documents.
Querying outer documents
The following code snippet shows an example of querying the outer documents within an Elasticsearch index. In the example, the outer document is the main document, and passages is the nested field that holds its nested documents. The nested query searches within the passages nested documents. If a match is found, the query returns the outer document along with the matching nested documents.
{
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "path": "passages",
            "query": {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            },
            "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "_source": false
}
Querying inner documents
The following code snippet shows an example of querying the inner documents within nested documents. The example runs a text_expansion query against the nested documents, applies the filters to the nested documents, and returns the parent documents together with the matching nested documents. The ELSER tokens are excluded from the returned inner documents.
{
  "query": {
    "nested": {
      "path": "passages",
      "query": {
        "bool": {
          "should": [
            {
              "text_expansion": {
                "passages.sparse.tokens": {
                  "model_id": ".elser_model_2_linux-x86_64",
                  "model_text": "$QUERY"
                }
              }
            }
          ],
          "filter": "$FILTER"
        }
      },
      "inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
    }
  },
  "_source": false
}
Where:
passages
It is the nested field that stores inner documents within a parent document. You might need to update it if you use a different nested field in your index.
passages.sparse.tokens
It refers to the field that stores the ELSER tokens or raw text for the inner documents. You might need to update it if you use a different nested field in your index. If the ELSER tokens are not available, you can also use the field that contains the raw text, but the search quality might degrade.
"inner_hits": {"_source": {"excludes": ["passages.sparse"]}}
It excludes the ELSER tokens from the inner documents in the search results.
"_source": false
It excludes all the top-level fields in the search results because only the inner documents in the search results are used.
$QUERY
It is the variable for accessing the user query. It ensures that the user query is passed to the query body.
$FILTER
It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body. If the filters are applied to the outer documents, only outer fields are available to use in the filters. If the filters are applied to the inner documents, only inner fields are available to use in the filters.
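Because "_source": false removes the top-level fields, the useful content of each hit sits under inner_hits, keyed by the nested path. The following Python sketch shows one way to read that part of a response. The helper name extract_passages, the sample response, and the text field inside each passage are hypothetical and only illustrate the response shape; your index might use different field names.

```python
def extract_passages(response, nested_path="passages"):
    """Collect the matching nested documents from a search response."""
    passages = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(nested_path, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            passages.append(
                {
                    "parent_id": hit["_id"],
                    "score": inner_hit["_score"],
                    "source": inner_hit["_source"],
                }
            )
    return passages


# Hand-written sample response (hypothetical data, trimmed to the relevant keys).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "doc-1",
                "_score": 12.3,
                "inner_hits": {
                    "passages": {
                        "hits": {
                            "hits": [
                                {
                                    "_score": 12.3,
                                    "_source": {"text": "Reset your password from the login page."},
                                },
                                {
                                    "_score": 8.1,
                                    "_source": {"text": "Passwords expire after 90 days."},
                                },
                            ]
                        }
                    }
                },
            }
        ]
    }
}

passages = extract_passages(sample_response)
for passage in passages:
    print(passage["parent_id"], passage["score"], passage["source"]["text"])
```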
Hybrid search with combined keyword search and dense vector search
The following code snippet shows an example of a hybrid search query on an Elasticsearch index. It combines traditional keyword search with k-nearest neighbor (kNN) search and then ranks the combined results:
{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": "$QUERY",
            "fields": ["$BODY_FIELD_NAME", "$TITLE_FIELD_NAME"]
          }
        }
      ],
      "filter": "$FILTER"
    }
  },
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector_builder": {
      "text_embedding": {
        "model_id": "intfloat__multilingual-e5-small",
        "model_text": "$QUERY"
      }
    },
    "k": 10,
    "num_candidates": 100,
    "filter": "$FILTER"
  },
  "rank": {
    "rrf": {}
  },
  "size": 10,
  "_source": {"excludes": ["text_embedding.predicted_value"]}
}
Where:
text_embedding.predicted_value
It refers to the field that stores the dense vectors. You might need to update it if you use a different field in your index.
text_embedding under query_vector_builder
It is the natural language processing task to run. It must be text_embedding for kNN search.
intfloat__multilingual-e5-small
It is the embedding model ID. You might need to update it if you want to use a different embedding model.
$QUERY
It is the variable for accessing the user query. It ensures that the user query is passed to the query body.
$BODY_FIELD_NAME and $TITLE_FIELD_NAME
They are the variables for accessing the Body field and Title field configured in the Search integration.
$FILTER
It is the variable for accessing the custom filters that you configure in the advanced settings for Elasticsearch. It ensures that the custom filters are used in the query body.
rank.rrf
It is the reciprocal rank fusion (RRF) method, which combines the search results from keyword search and dense vector search.
"_source": {"excludes": ["text_embedding.predicted_value"]}
It excludes the unnecessary dense vector field in the search results.
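To show how reciprocal rank fusion merges the two result lists, the following Python sketch scores each document as the sum of 1 / (rank_constant + rank) over the lists in which it appears. It is a minimal sketch, not the Elasticsearch implementation: the function name and document IDs are invented for the example, and the rank constant of 60 is assumed to match the Elasticsearch default, which applies when rrf is left empty as in the snippet above.

```python
def reciprocal_rank_fusion(ranked_lists, rank_constant=60):
    """Fuse several ranked lists of document IDs into one ranking."""
    scores = {}
    for ranking in ranked_lists:
        # Ranks start at 1 for the best hit in each list.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (rank_constant + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_hits = ["doc-a", "doc-b", "doc-c"]  # hypothetical keyword search ranking
vector_hits = ["doc-c", "doc-a", "doc-d"]   # hypothetical kNN search ranking

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

A document that ranks well in both lists, such as doc-a here, ends up ahead of documents that appear in only one list, without the two searches needing comparable score scales.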
Using a nested query to search on a semantic text field
Use the following structure for querying nested documents on a semantic text field. This structure ensures that the search works properly with the Elasticsearch integration.
{
  "query": {
    "nested": {
      "path": "semtext.inference.chunks",
      "query": {
        "sparse_vector": {
          "field": "semtext.inference.chunks.embeddings",
          "inference_id": ".elser_model_2_linux-x86_64",
          "query": "$QUERY"
        }
      },
      "inner_hits": {"_source": {"excludes": ["semtext.inference.chunks.embeddings"]}}
    }
  },
  "_source": false
}
Where:
semtext
It is the name of the semantic field. You might need to update it if your semantic field has a different name. For more information, see Semantic text field type in the Elasticsearch documentation.
semtext.inference.chunks
It refers to the field that stores the chunked texts and embeddings.
sparse_vector
It specifies the type of the query, in this case a sparse_vector query. It is similar to the text_expansion query, but newer.
semtext.inference.chunks.embeddings
It refers to the field that stores the embeddings for the chunked texts.
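If your semantic field has a different name, three paths in the query body change together. The following Python sketch generates the body from the field name so that the paths stay consistent. The helper name build_semantic_text_query and the field name content_semantic are hypothetical, and the sketch assumes that your index uses the same inference.chunks layout as the example above.

```python
import json


def build_semantic_text_query(field_name, inference_id):
    """Hypothetical helper: build the nested query body for a semantic text field."""
    chunks_path = f"{field_name}.inference.chunks"
    embeddings_field = f"{chunks_path}.embeddings"
    return {
        "query": {
            "nested": {
                "path": chunks_path,
                "query": {
                    "sparse_vector": {
                        "field": embeddings_field,
                        "inference_id": inference_id,
                        # Left as a variable for the integration to fill in.
                        "query": "$QUERY",
                    }
                },
                "inner_hits": {"_source": {"excludes": [embeddings_field]}},
            }
        },
        "_source": False,
    }


body = build_semantic_text_query("content_semantic", ".elser_model_2_linux-x86_64")
print(json.dumps(body, indent=2))
```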