Monitoring Elasticsearch

The Elasticsearch sensor is automatically deployed and installed after you install the Instana agent.

Support information

To make sure that the Elasticsearch sensor is compatible with your current setup, check the following support information sections:

Supported operating systems

The supported operating systems of the Elasticsearch sensor are consistent with host agent requirements, which you can check in the Supported operating systems section of each host agent, such as Supported operating systems for Unix.

Supported versions and support policy

The sensor supports Elasticsearch versions from 0.17.0 to 8.17.1.

The following table shows the latest supported version and support policy:

Table 1. Latest supported version and support policy
Technology	Support policy	Latest technology version	Latest supported version
Elasticsearch	45 days	8.17.1	8.17.1

For more information about the support policy, see Support strategy for sensors.

Supported client-side tracing

For this technology, Instana supports client-side tracing for the following languages and runtimes:

Configuration

Instana automatically monitors up to 1000 indices and collects the 5 most important metrics per index. To enable in-depth index monitoring, which gathers 20 metrics per index for up to 200 indices, specify indicesRegex in the <agent_install_dir>/etc/instana/configuration.yaml agent configuration file as shown:

com.instana.plugin.elasticsearch:
  enabled: true
  indicesRegex: '<INSERT_INDEX_REGEX_HERE>' # e.g. 'env-prod.*'
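
To preview which indices a candidate pattern would select before you put it in the agent configuration, a small check like the following can help. This is only a sketch: the helper name `matching_indices` and the sample index names are hypothetical, and it assumes that the pattern must match the whole index name, which this page does not confirm. Python's `re` syntax might also differ from the regular expression engine that the agent uses, so treat the result as an approximation.

```python
import re


def matching_indices(indices_regex, index_names):
    """Return the index names that the pattern matches in full.

    Assumption: whole-name matching. The agent's actual matching
    semantics are not documented on this page.
    """
    pattern = re.compile(indices_regex)
    return [name for name in index_names if pattern.fullmatch(name)]


# Hypothetical index names, for illustration only.
sample = ["env-prod-orders", "env-prod-users", "env-test-orders", ".kibana"]
print(matching_indices(r"env-prod.*", sample))
```

Because in-depth monitoring covers at most 200 indices, it can be worth checking that the pattern does not select more names than that.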

Metrics collection

To view the metrics, complete the following steps:

From the navigation menu of the Instana UI, select Infrastructure.
Click a specific monitored host where Elasticsearch is installed.

You can see the host dashboard with the following performance metrics, configuration data, and health signatures.

Node-Level

Configuration data

Version
Cluster
Health Status
Node Name
Node Type
Node is Master
Node is Master Eligible
Transport
HTTP
Log Directory
Shards
Indices

Performance metrics

Data point	Description	Granularity
Query Latency	The query latency is collected from `NodeIndicesStats#SearchStats`.	1 second
Number of Queries	The query count per second is collected from `NodeIndicesStats#SearchStats`.	1 second
Overall Documents	The total number of documents is collected from `DocsStats#count`.	1 second
Added Documents	The total number of indexing operations is collected from `IndexingStats#indexCount`.	1 second
Removed Documents	The number of delete operations that are executed is collected from `IndexingStats#deleteCount`.	1 second
Active Shards	The number of active shards is collected from `IndexRoutingTable#ShardRouting`.	1 second
Active Primary Shards	The number of active primary shards is collected from `IndexRoutingTable#ShardRouting`.	1 second
Refresh Count	The number of refreshes that are executed per second is collected from `NodeIndicesStats#RefreshStats`.	1 second
Refresh Time	The total time that is spent on refreshes is collected from `NodeIndicesStats#RefreshStats`.	1 second
Flush Count	The number of flushes that are executed per second is collected from `NodeIndicesStats#FlushStats`.	1 second
Flush Time	The total time that is spent on flushes is collected from `NodeIndicesStats#FlushStats`.	1 second
Indices metrics	The document count, deleted document count, and size per index are collected from `IndexStats#DocsStats`.	1 second
Lucene Segments	The number of segments is collected from `NodeIndicesStats#SegmentsStats#count`.	1 second
Active Threads	The number of active threads in the Search, Index, Bulk, Merge, Flush, Get, Management, and Refresh thread pools is collected from `ThreadPoolStats.Stats#active`.	1 second
Queued Threads	The number of queued threads in the Search, Index, Bulk, Merge, Flush, Get, Management, and Refresh thread pools is collected from `ThreadPoolStats.Stats#queue`.	1 second
Rejected Threads	The number of rejected threads in the Search, Index, Bulk, and Get thread pools is collected from `ThreadPoolStats.Stats#rejected`.	1 second
Sent Data	The size of TX packets that are sent by the node during internal cluster communication is collected from `TransportStats#tx_size`.	1 second
Received Data	The size of RX packets that are received by the node during internal cluster communication is collected from `TransportStats#rx_size`.	1 second
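
Several data points in this table, such as Number of Queries and Query Latency, are per-interval values, whereas search statistics in Elasticsearch are cumulative counters. The following sketch shows one common way to derive a rate and a mean latency from two consecutive counter samples. It is an illustration under assumptions, not a description of how the sensor computes its values: the function name `query_rate_and_latency` is hypothetical, and the `query_total` and `query_time_in_millis` keys follow the field names of the Elasticsearch node stats API.

```python
def query_rate_and_latency(prev, curr, interval_seconds):
    """Derive queries per second and mean latency (ms) from two samples.

    Each sample is a dict with the cumulative counters `query_total`
    and `query_time_in_millis`. Returns None if the counters went
    backwards (for example, after a node restart) or if the interval
    is not positive.
    """
    delta_queries = curr["query_total"] - prev["query_total"]
    delta_time_ms = curr["query_time_in_millis"] - prev["query_time_in_millis"]
    if interval_seconds <= 0 or delta_queries < 0 or delta_time_ms < 0:
        return None
    rate = delta_queries / interval_seconds
    # Avoid dividing by zero when no query ran during the interval.
    latency_ms = delta_time_ms / delta_queries if delta_queries else 0.0
    return rate, latency_ms


# Hypothetical samples taken one second apart.
prev = {"query_total": 1000, "query_time_in_millis": 5000}
curr = {"query_total": 1050, "query_time_in_millis": 5400}
print(query_rate_and_latency(prev, curr, 1.0))
```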

Index metrics

Data point	Description	Granularity
Total Queries	The total number of query operations is collected from `SearchStats.Stats#queryTotal`.	1 second
Queries Current	The number of query operations that are currently running is collected from `SearchStats.Stats#queryCurrent`.	1 second
Fetches Total	The total number of fetch operations is collected from `SearchStats.Stats#fetchCount`.	1 second
Fetches Current	The number of fetch operations that are currently running is collected from `SearchStats.Stats#fetchCurrent`.	1 second
Query Time	The time in milliseconds that is spent on executing query operations is collected from `SearchStats.Stats#queryTimeInMillis`.	1 second
Fetch Time	The time in milliseconds that is spent on executing fetch operations is collected from `SearchStats.Stats#fetchTimeInMillis`.	1 second
Query Cache Evictions	The number of query cache evictions is collected from `QueryCacheStats#evictions`.	1 second
Request Cache Evictions	The number of request cache evictions is collected from `RequestCacheStats#evictions`.	1 second
Get Requests	The total number of Get requests is collected from `GetStats#count`.	1 second
Get Requests Time	The time in milliseconds that is spent on Get requests is collected from `GetStats#timeInMillis`.	1 second
Failed Get Requests	The number of failed Get requests is collected from `GetStats#missingCount`.	1 second
Failed Get Requests Time	The time in milliseconds that is spent on failed Get requests is collected from `GetStats#missingTimeInMillis`.	1 second
Indexing Operations Failed	The number of failed indexing operations is collected from `IndexingStats#indexFailedCount`.	1 second
Active Merges Count	The number of merges that are currently running is collected from `MergeStats#current`.	1 second
Total Merges Size	The total size of merges that are executed is collected from `MergeStats#totalSizeInBytes`.	1 second
Total Merges Time	The total time for merges that are executed is collected from `MergeStats#totalTimeInMillis`.	1 second

The metrics that are listed in the Index metrics section are collected only for indices that match the indicesRegex regular expression in the agent configuration.

Health Signatures

Each sensor has a curated knowledge base of health signatures that are evaluated continuously against the incoming metrics and are used to raise issues or incidents, depending on user impact.

Built-in events trigger issues or incidents based on failing health signatures on entities, and custom events trigger issues or incidents based on the thresholds of an individual metric of any specific entity.

For more information about built-in events for the Elasticsearch node, see the Built-in events reference.

Cluster-Level

Configuration data

Name
Health Status
Nodes, Masters

Performance metrics

Data point	Description	Granularity
Query Latency	The query latency is calculated as the maximum query latency of all nodes.	1 second
Number of Queries	The query count is calculated as the sum of the query count for all nodes.	1 second
Overall Documents	The overall document count is calculated as the sum of the overall documents for all nodes.	1 second
Added Documents	The sum of the documents that are added for all nodes.	1 second
Removed Documents	The sum of the documents that are removed for all nodes.	1 second
Indices	The number of indices in the cluster.	1 second
Shards	Active, Active Primary, Initializing, Relocating, Unassigned are collected from `ClusterHealth`.	1 second
Cluster State size	The size of the `ClusterState`.	1 second
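
The aggregation rules in this table (maximum for latency, sum for counts) can be sketched as follows. This is an illustrative sketch only: the function name `aggregate_cluster` and the dictionary keys are hypothetical and do not correspond to an Instana API.

```python
def aggregate_cluster(nodes):
    """Combine per-node values into cluster-level values.

    `nodes` is a list of dicts with hypothetical keys. Latency uses
    the maximum across nodes; the other values are summed.
    """
    summed_keys = ("query_count", "documents", "added", "removed")
    result = {
        "query_latency": max((n["query_latency"] for n in nodes), default=0.0)
    }
    for key in summed_keys:
        result[key] = sum(n[key] for n in nodes)
    return result


# Hypothetical per-node values, for illustration only.
nodes = [
    {"query_latency": 8.0, "query_count": 50, "documents": 1000, "added": 5, "removed": 1},
    {"query_latency": 12.5, "query_count": 30, "documents": 2000, "added": 7, "removed": 0},
]
print(aggregate_cluster(nodes))
```

Taking the maximum for latency means that a single slow node is visible at the cluster level instead of being averaged away.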

Health Signatures

For more information about built-in events for the Elasticsearch cluster, see the Built-in events reference.