Resolving reports on full disk or watermark reached
When the disk becomes full, the Elasticsearch and Logstash services log TOO_MANY_REQUESTS/12/index read-only errors in their runtime log files. The cluster is forced into read-only mode and cannot accept write operations, because there is not enough disk space to write any more data to the existing hosts.
For more information about the default Elastic Stack log locations, see Elastic Stack troubleshooting.
Elasticsearch and Logstash services report TOO_MANY_REQUESTS/12/index read-only errors
The Elasticsearch and Logstash services generate TOO_MANY_REQUESTS/12/index read-only errors when the Elasticsearch data instances can no longer write data.
As a safeguard, the read_only_allow_delete parameter is set to true when disk space runs low. This setting makes the Elasticsearch cluster read-only so that it does not accept write operations. As a result, documents are not indexed into Elasticsearch. You must clean up the storage and verify that you have sufficient space. For more information about increasing disk space, see Configuring Elasticsearch disk usage. After you free up space, reset the parameter to false with the following command:
curl --cacert $EGO_TOP/wlp/usr/shared/resources/security/cacert.pem -u $CLUSTERADMIN:$CLUSTERADMINPASS -H'Content-Type: application/json' -XPUT $es_protocol://$es_hostname:$es_port/_settings -d '{"index":{ "blocks":{"read_only_allow_delete":"false"}}}'
The variables are defined as follows:
es_protocol
- Specifies the protocol for the URL. Use http if security is not enabled, or use https if security is enabled.
es_hostname
- Specifies the hostname of the Elasticsearch client node.
es_port
- Specifies the port that is used for communication to the Elasticsearch primary node. By default, the port is 9200. For more information, see Summary of ports used by IBM Spectrum Conductor.
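Before you clear the read-only block, it can help to confirm that disk usage on the data hosts has actually dropped. The following is a minimal sketch of such a check, not part of the product: the helper names parse_used_pct and below_threshold, the /path/to/es/data placeholder, and the threshold of 85 are all assumptions for illustration. Use the watermark values that your cluster is configured with.

```shell
# Sketch only. Helper names, the data path, and the threshold are illustrative
# assumptions, not product defaults.

# Read `df -P <path>` output on stdin and print the used percentage as an integer.
parse_used_pct() {
    awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return success when the used percentage ($1) is below the threshold ($2).
below_threshold() {
    [ "$1" -lt "$2" ]
}

# Example use on a data host (replace /path/to/es/data with your data directory):
#   used=$(df -P /path/to/es/data | parse_used_pct)
#   below_threshold "$used" 85 && echo "Disk usage is ${used}%; OK to clear the block"
```

Parsing the portable (-P) output format keeps each file system on one line, which makes the fifth column a reliable place to read the used percentage.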
Elasticsearch service remains in the TENTATIVE state if the cluster is restarted when host disk reaches the high disk watermark
Elasticsearch provides configuration parameters that control disk-based allocation and the reallocation of data from one node to another.
Depending on the values that are configured for these parameters, the Elasticsearch service can hang in the TENTATIVE state when it reaches the configured limits. Specifically, the Elasticsearch service remains in the TENTATIVE state if the cluster is restarted when host disk usage is equal to or higher than the value of the cluster.routing.allocation.disk.watermark.high parameter.
As a best practice, do not restart the cluster when the disk usage reaches or exceeds this high disk watermark. If you restart the cluster and encounter this error, clean up the disk space and verify you have sufficient disk space. For more information about increasing disk space, see Configuring Elasticsearch disk usage.
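To see which watermark values are in effect, you can query the cluster settings. The sketch below assumes the same environment variables as the curl commands in this topic and uses the Elasticsearch cluster settings API with the include_defaults and flat_settings query parameters. The helper names get_cluster_settings and watermark_lines are hypothetical and exist only for this illustration.

```shell
# Sketch only. Assumes EGO_TOP, CLUSTERADMIN, CLUSTERADMINPASS, es_protocol,
# es_hostname, and es_port are set as for the other commands in this topic.

# Fetch the cluster settings, including defaults, as flat keys.
get_cluster_settings() {
    curl --cacert "$EGO_TOP/wlp/usr/shared/resources/security/cacert.pem" \
         -u "$CLUSTERADMIN:$CLUSTERADMINPASS" --tlsv1.2 \
         -XGET "$es_protocol://$es_hostname:$es_port/_cluster/settings?include_defaults=true&flat_settings=true&pretty"
}

# Keep only the disk watermark lines from the settings output on stdin.
watermark_lines() {
    grep 'cluster\.routing\.allocation\.disk\.watermark'
}

# Example use:
#   get_cluster_settings | watermark_lines
```

Comparing the reported high watermark with the disk usage on each host indicates whether a restart is likely to leave the service in the TENTATIVE state.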
Follow this high-level troubleshooting process to isolate hosts with this disk usage error:
- Check the Elasticsearch state by using the following command:
curl --cacert $EGO_TOP/wlp/usr/shared/resources/security/cacert.pem -u $CLUSTERADMIN:$CLUSTERADMINPASS -XGET "$es_protocol://$es_hostname:$es_port/_cluster/health?pretty" --tlsv1.2
If the cluster is in the red state, the Elasticsearch service remains in the TENTATIVE state until all primary shards are active.
- Run the following command to see all shards and resolve any primary shards that are not in the STARTED state:
curl --cacert $EGO_TOP/wlp/usr/shared/resources/security/cacert.pem -u $CLUSTERADMIN:$CLUSTERADMINPASS -XGET $es_protocol://$es_hostname:$es_port/_cat/shards --tlsv1.2
Before a shard can be used, it goes through the INITIALIZING state. If a shard cannot be assigned, it remains in the UNASSIGNED state with a reason code. For a list of the reasons that a primary shard might not be started, see Reasons for unassigned shard.
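The two checks in this process can be scripted by filtering the command output. The following is a minimal sketch under stated assumptions: the helper names health_status and unstarted_primaries are hypothetical, and the shard filter assumes that the first four columns of the _cat/shards output are index, shard, prirep, and state, which is the default column order.

```shell
# Sketch only. Helper names are illustrative and not part of the product.

# Print the value of the "status" field from _cluster/health JSON on stdin.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# From _cat/shards output on stdin (columns: index shard prirep state ...),
# print the primary shards whose state is not STARTED.
unstarted_primaries() {
    awk '$3 == "p" && $4 != "STARTED"'
}

# Example use, piping the output of the two curl commands in this process:
#   <cluster health curl command> | health_status
#   <cat shards curl command> | unstarted_primaries
```

Requesting explicit columns from the cat shards API, for example by appending ?h=index,shard,prirep,state,unassigned.reason to the URL, also shows the reason code next to each unassigned shard.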