Operational data store

IBM® Automation foundation includes an operational data store based on Apache 2.0 OSS Elasticsearch with the addition of a custom security plug-in to enable Basic Authentication and a proxy sidecar for TLS capability.

Warning: Disk usage is not monitored. You are responsible for planning appropriate disk capacity and for monitoring the disks to ensure that they do not fill up. If the disks become full, data loss is likely.

Running in production

When you run Elasticsearch in production, consider the following settings. For more information, see Important System Configuration.

Ensure that sufficient virtual memory exists on the nodes. To configure virtual memory, use the Node Tuning Operator in OpenShift and set the value of vm.max_map_count to at least 262144. For more information, see Using the Node Tuning Operator.
Ensure that the Elasticsearch container JVM is configured with sufficient heap. For more information, see JVM Options.
For production environments, persistent storage is required to ensure resilient operation. Cluster scaling is supported only when persistence is enabled with compatible block storage. A minimum of three Elasticsearch nodes is recommended for resilience and availability. Reserve single-node deployments for proof-of-concept use only.
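
As a quick sanity check of the virtual memory requirement, the following sketch compares a vm.max_map_count value against the 262144 minimum. The helper name check_max_map_count is hypothetical and not part of the product. How you read the live value from a node (for example with sysctl -n vm.max_map_count in a node debug session) depends on your cluster access.

```shell
# Minimum vm.max_map_count stated in this section.
REQUIRED_MAX_MAP_COUNT=262144

# check_max_map_count VALUE
# Hypothetical helper: succeeds if VALUE meets the minimum, fails otherwise.
check_max_map_count() {
  local current="$1"
  if [ "$current" -ge "$REQUIRED_MAX_MAP_COUNT" ]; then
    echo "OK: vm.max_map_count=$current"
  else
    echo "TOO LOW: vm.max_map_count=$current (need >= $REQUIRED_MAX_MAP_COUNT)"
    return 1
  fi
}
```

On a node, the helper could be called as check_max_map_count "$(sysctl -n vm.max_map_count)".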

Node groups

Elasticsearch is configured into node groups inside the Elasticsearch CR:

 spec:
   elasticsearch:
     nodegroupspecs:
       - name: master
         replicas: 3
         storage: {}
         config:
           - key: node.master
             value: "true"
           - key: node.data
             value: "false"
           - key: node.ingest
             value: "false"
       - name: data
         replicas: 3
         storage: {}
         config:
           - key: node.master
             value: "false"
           - key: node.data
             value: "true"
           - key: node.ingest
             value: "false"

These node groups allow the user to define the configuration for different sets of nodes within the Elasticsearch cluster.

The names of these node groups are significant because they define the names of the Kubernetes resources that are created when the Elasticsearch cluster is installed. When a node group is removed, all of its resources are removed except for the PersistentVolumeClaims (used for persistent storage under those nodes). As a result, renaming a group removes the previous group and creates a new one. Consider the node groups carefully before you install or adjust the configuration.

Security

The Elasticsearch cluster included in Automation foundation uses a custom security plug-in that secures all API interactions with Basic Authentication. The status of a CartridgeRequirements custom resource (CR) is updated with a reference to a secret that contains the username and the generated password that are assigned to that CartridgeRequirements request.

A superuser credential is created for administration of the Elasticsearch instance that can access the security APIs.

A warning message in the status section of the Elasticsearch CR prompts the administrator who creates the Elasticsearch instance to update the generated password.
To update the password, the administrator updates the password value in the corresponding secret and removes the elastic.automation.ibm.com/generated-default-credentials annotation from that secret.

All communication with the client port of Elasticsearch is encrypted with TLS. The default TLS configuration can be changed by using one of the following methods:

Configure TLS to use self-signed generated certificates with spec.elasticsearch.tls: {}.
Specify a cert-manager Issuer together with a reference to the public certificate of the certificate authority (CA) that is used to verify the provided issuer, as shown in the following example:

spec:
  elasticsearch:
    tls:
      issuer:
        name: my-issuer
      caSecret:
        secretName: my-ca-secret
        key: ca.crt

You can customize the Elasticsearch user credentials. Currently, the behavior is as follows.

To use a particular combination of username and password for the Elasticsearch instance, create a secret named <cloudpakName>-es-auth with the following key-value pairs:
```
  {"username": "(your-username)", "password": "(your-password)"}
```
When you then apply a CartridgeRequirements CR, the reconcile process uses the credentials that are provided in the secret to create the Elasticsearch user account.
If you do not provide such a secret, the Elasticsearch user credentials are generated automatically: the username is the Cloud Pak name and the password is an automatically generated alphanumeric string.
To update the created credentials, modify the password in the <cloudpakName>-es-auth secret. It takes about 30 seconds for the updated password to take effect in the Elasticsearch instance. Note: After the credentials are created, the username must not be changed. Only the password can be modified.
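
One way to prepare such a secret is to render a manifest and pipe it to oc apply. The following is a minimal sketch under stated assumptions: the helper names b64 and render_es_auth_secret are hypothetical, the namespace field is an assumption, and the secret is assumed to be a generic secret whose data keys are username and password, as the key-value pairs above suggest.

```shell
# b64 VALUE
# Base64-encodes VALUE on a single line (works with GNU and BSD base64).
b64() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# render_es_auth_secret CLOUDPAK_NAME NAMESPACE USERNAME PASSWORD
# Hypothetical helper: prints a Secret manifest named <cloudpakName>-es-auth.
# The namespace field is an assumption; adjust it to your installation.
render_es_auth_secret() {
  local cloudpak="$1" namespace="$2" username="$3" password="$4"
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${cloudpak}-es-auth
  namespace: ${namespace}
data:
  username: $(b64 "$username")
  password: $(b64 "$password")
EOF
}
```

For example, render_es_auth_secret my-cloudpak acme-abp my-user 'my-password' | oc apply -f - would create the secret before the CartridgeRequirements CR is applied.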

Updating and preconfiguring the superuser password

Follow these steps to change the generated password for the elasticsearch-admin superuser:

Ensure that you are logged in to the OpenShift cluster by using the oc login command.

Update and export the following environment variables in your command line.

 export ELASTIC_INSTANCE_NAME="elasticsearch-sample" # <-- Change this variable to your CR name.
 export NAMESPACE="acme-abp" # <-- Change this variable to the namespace that you want.
 export NEW_PASSWORD="your-new-password" # <-- Change this variable to the new password that you want.

Run the following code block in your command line to update the superuser password:

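 # Look up the name of the superuser secret from the Elasticsearch CR status.
 export SECRET_NAME=$(oc get elasticsearch "$ELASTIC_INSTANCE_NAME" -n "$NAMESPACE" -o jsonpath='{.status.adminAuthSecretName}')
 # Store the new password base64-encoded on a single line (-w0 disables line wrapping).
 oc patch secret "$SECRET_NAME" -n "$NAMESPACE" -p '{"data": {"password": "'"$(echo -n "$NEW_PASSWORD" | base64 -w0)"'"}}'
 # Remove the annotation that marks the credentials as generated defaults.
 oc annotate secret "$SECRET_NAME" -n "$NAMESPACE" elastic.automation.ibm.com/generated-default-credentials-
 unset NEW_PASSWORD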

Alternatively, to preconfigure the superuser password, a secret can be created with the appropriate naming and labeling conventions in advance. Creating a secret in advance can be useful in a disaster recovery scenario where dependent services have an existing set of credentials.

Ensure that you are logged in to the OpenShift cluster by using the oc login command.

Update and export the following environment variables in your command line.

 export ELASTIC_INSTANCE_NAME="elasticsearch-sample" # <-- Change this variable to your CR name.
 export NAMESPACE="acme-abp" # <-- Change this variable to the namespace that you want.
 export NEW_PASSWORD="your-new-password" # <-- Change this variable to the new password that you want.

Run the following code block in your command line to preconfigure the superuser password:

 cat <<EOF | oc apply -f -
 kind: Secret
 apiVersion: v1
 metadata:
   name: ${ELASTIC_INSTANCE_NAME}-elasticsearch-es-default-user
   namespace: $NAMESPACE
   labels:
     app.kubernetes.io/component: es
     app.kubernetes.io/instance: $ELASTIC_INSTANCE_NAME
     app.kubernetes.io/name: elasticsearch
     elastic.automation.ibm.com/cr-name: $ELASTIC_INSTANCE_NAME
 data:
   password: $(echo -n "$NEW_PASSWORD" | base64 -w0)
   username: ZWxhc3RpY3NlYXJjaC1hZG1pbg==
 type: kubernetes.io/basic-auth
 EOF
 unset NEW_PASSWORD

Note: This precreated credentials secret is not owned by the Elasticsearch instance and as such is not tethered to the Elasticsearch instance lifecycle. Users are responsible for managing the lifecycle of this secret.

Storage

By default, the Elasticsearch cluster that is provided by Automation foundation has no persistence configured. Cluster administrators must provide either a StorageClass that supports dynamic provisioning or pre-created PersistentVolumes before they configure Elasticsearch. Multiple PersistentVolume storage classes are available, depending on your cluster setup. For more information, see Understanding Persistent Storage.

Each node group requires independent storage configuration. This approach enables different tiers of storage capability to be provided to each node group depending on requirements, for example, fast storage for the data nodes and slower storage for the controller nodes.

The controller and data nodes require storage that can be used with a ReadWriteOnce (RWO) access mode. This mode specifies that the volume can be mounted as read and write by a single node and is available only to that node.

To use the Elasticsearch cluster data storage, you must declare its use in the elasticsearch section of the AutomationBase custom resource by defining a storage element for each nodegroupspecs entry.

For example:

spec:
  elasticsearch:
    nodegroupspecs:
      - name: data
        replicas: 3
        storage:
          size: 50Gi
          class: rook-ceph-block

This example creates a 50 Gi PersistentVolumeClaim for each of the three replicas, where the physical storage for each of the data nodes is provisioned by using the rook-ceph-block StorageClass.
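
The total capacity that such a node group requests follows from simple arithmetic: every replica gets its own PersistentVolumeClaim, so the total is the number of replicas multiplied by the size. A minimal sketch, where total_storage_gi is a hypothetical helper name and sizes are assumed to be whole Gi values:

```shell
# total_storage_gi REPLICAS SIZE_GI
# Hypothetical helper: prints the total capacity, in Gi, that a node group
# requests when every replica gets its own PersistentVolumeClaim.
total_storage_gi() {
  local replicas="$1" size_gi="$2"
  echo $(( replicas * size_gi ))
}

# Values from the example: 3 replicas of 50Gi each.
total_storage_gi 3 50   # prints 150
```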

The following table shows the child elements that can be specified as part of the storage object. All elements are optional. If the element is not provided, and a default value is set for the cluster, then that default value is used. Otherwise, the element is not included.

Element	Default	Description
size	data: 50 Gi, controller: 10 Gi	Size of the storage with scale suffix.
class	default cluster StorageClass	Storage class name.
selector		A Label selector to allow finer-grained PersistentVolume selection. See https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#label-selectors for syntax.
volumeClaimTemplate		A PersistentVolumeClaim that allows a greater detailed specification of the volume. Only used for Snapshot storage. See https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims for syntax.
fsGroup		The group ID for the file system. Might need to be set for some storage providers, such as NFS.
supplementalGroups		An array of group IDs to be added on the security context for the container.

If spec.elasticsearch.nodegroupspecs[].storage.class is not specified, and a default StorageClass is set for the cluster, the default StorageClass is used.

If you don't want a StorageClass to be used because your PersistentVolumes do not define a StorageClass, like NFS, then specify spec.elasticsearch.nodegroupspecs[].storage.class with a value of "", as in class: "".

If StorageClasses support dynamic provisioning
- Some StorageClasses support dynamic provisioning. If the StorageClass in use supports dynamic provisioning, then a PersistentVolume is dynamically created when the PersistentVolumeClaim is created. Otherwise, the PersistentVolume must be created before the Elasticsearch storage is defined.
- On OpenShift, all dynamically provisioned volumes are created with the RECLAIMPOLICY set to Delete by default. Thus, the volume lasts only while the claim still exists in the system. If you delete the claim, the volume is also deleted, and all data on the volume is lost.
If StorageClasses do not support dynamic provisioning
- If the storage class doesn't support dynamic provisioning, for example, NFS, or if the PersistentVolumes are already created for use with Elasticsearch, the PersistentVolumes need to be referenced from the PersistentVolumeClaim that gets created by the Elasticsearch operator. Use the storage.selector or storage.volumeClaimTemplate child elements of the spec.elasticsearch.nodegroupspecs[] element in the AutomationBase custom resource.

Note: The storage.volumeClaimTemplate child element can be used only when you configure the Elasticsearch storage snapshot.

Create the PersistentVolumes with a label so that the PersistentVolumeClaim can select them correctly, as shown in the following example:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: elasticsearch-data1
  labels:
    es-storage-type: data
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  nfs:
    path: /data/es-data1
    server: 192.168.1.10
  persistentVolumeReclaimPolicy: Recycle

See the following example of the spec.elasticsearch.nodegroupspecs[].storage that uses a selector that matches the es-storage-type: data label:

spec:
  elasticsearch:
    license:
      accept: true
    version: "v1.0"
    nodegroupspecs:
      - name: data
        replicas: 3
        storage:
          class: ""
          selector:
            matchLabels:
              es-storage-type: data
          size: 50Gi

Sufficient PersistentVolumes must be available to meet the needs of the replicas that are defined for each node group. In the previous example, three replicas are defined for the data node group. Thus, at least three PersistentVolumes of sufficient capacity must be available to meet the claim needs.
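
Before you create the cluster, you might want to confirm that enough matching PersistentVolumes are free. The following sketch counts volumes in the Available phase and compares the count with the replica count. The helper names are hypothetical. The sketch assumes one PersistentVolume phase per line on standard input, which can be produced with a jsonpath query such as the one in the final comment.

```shell
# count_available_pvs
# Reads one PersistentVolume phase per line on stdin and prints how many
# of them are exactly "Available".
count_available_pvs() {
  grep -c -x 'Available' || true   # grep exits non-zero when the count is 0
}

# has_enough_pvs REPLICAS
# Hypothetical helper: succeeds if at least REPLICAS volumes are Available.
has_enough_pvs() {
  local replicas="$1" available
  available="$(count_available_pvs)"
  [ "$available" -ge "$replicas" ]
}

# Possible usage against a cluster (not run here):
#   oc get pv -l es-storage-type=data \
#     -o jsonpath='{range .items[*]}{.status.phase}{"\n"}{end}' | has_enough_pvs 3
```

The check covers only the phase. Capacity and access mode still need to match the claim.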

Note: If separate data and master nodes are being used with persistent storage, then a spec.elasticsearch.nodegroupspecs[].storage object is required for both node groups. It is not possible to define persistent storage just for the data nodes. However, different tiers of storage might be assigned to each node group.

If no storage is defined, the Elasticsearch cluster uses only the storage that is local to the container. All indices and configurations are lost when an Elasticsearch container restarts.

When an Elasticsearch cluster is deleted, the PersistentVolumeClaims and bound PersistentVolumes remain intact to preserve the data.

Note: If a new Elasticsearch cluster that uses persisted storage is created with the same name as a previous cluster, then any previous PersistentVolumeClaims and hence PersistentVolumes are reused. If the storage object that is defined on the new cluster is changed, for example the StorageClass differs, then the PersistentVolumeClaims and PersistentVolumes need to be manually deleted before the creation of the new Elasticsearch cluster of the same name in order for new PersistentVolumeClaims to be created.

Access modes

The Elasticsearch cluster requires storage that uses the following access modes:

The master and data nodes require storage that uses a ReadWriteOnce (RWO) access mode that specifies that the volume can be mounted as read/write by a single node.
Snapshot storage requires a ReadWriteMany (RWX) access mode that specifies that the volume can be mounted as read/write by many nodes.

Only PersistentVolume mechanisms that support these access modes can be used for the different storage uses.

For more information, see Access modes.

Storage permissions

To access the provided storage, workloads require permissions. These permissions are controlled by the securityContext of the pod, in this case through the fsGroup and supplementalGroups settings. When the securityContext specifies an fsGroup, all processes of the containers in the pod are also part of the supplementary group ID that the fsGroup specifies. The persisted volume and any files that are created in that volume are also owned by the group ID that the fsGroup specifies.

The default settings are valid for most scenarios. However, if your storage configuration requires fsGroup or supplementalGroups, use the Storage object to configure each node group.

If you configure the storage.fsGroup element on the nodegroupspec, this sets the spec.securityContext.fsGroup element on the pods.
If you configure the storage.supplementalGroups element on the nodegroupspec, this sets the spec.securityContext.supplementalGroups element on the pods.

Note: The spec.securityContext.runAsGroup cannot be set as part of the storage object, so the default primary group ID for all containers is 0 (root).

The following example demonstrates setting an fsGroup and supplementalGroups in the elasticsearch field of the AutomationBase custom resource for a node group:

spec:
  elasticsearch:
    nodegroupspecs:
      - name: data
        replicas: 3
        storage:
          class: rook-ceph-block
          size: 50Gi
          fsGroup: 2000
          supplementalGroups: [2001,2002]

The three pods that result from the previous node group definition each have a securityContext as shown:

spec:
  securityContext:
    runAsNonRoot: true
    fsGroup: 2000
    supplementalGroups: [2001,2002]

If you run the id command within the pod's container, you can confirm whether the configurations took effect:

$ id
uid=1000660000(1000660000) gid=0(root) groups=0(root),2000,2001,2002
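
The same check can be scripted. The following sketch tests whether a given group ID appears in a space-separated list of group IDs, which is the format that id -G prints. The helper name has_group is hypothetical.

```shell
# has_group "GROUP_IDS" WANTED_GID
# Hypothetical helper: succeeds if WANTED_GID appears in the
# space-separated GROUP_IDS list (the output format of `id -G`).
has_group() {
  local groups_list="$1" wanted="$2" gid
  for gid in $groups_list; do
    [ "$gid" = "$wanted" ] && return 0
  done
  return 1
}
```

Inside the container, a call such as has_group "$(id -G)" 2000 would succeed when the fsGroup from the example was applied.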

Note: If fsGroup or supplementalGroups are required, you might need to provide a SecurityContextConstraint that is configured to support the specified values.

See the following example of the settings that are required in a SecurityContextConstraint to accommodate these configurations:

fsGroup:
  ranges:
    - max: 3000
      min: 2000
  type: MustRunAs
supplementalGroups:
  type: RunAsAny
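
With a MustRunAs range like the one shown, the fsGroup of the node group has to fall inside the range. A minimal sketch of that comparison, assuming that both bounds are inclusive and using the hypothetical helper name gid_in_range:

```shell
# gid_in_range GID MIN MAX
# Hypothetical helper: succeeds if MIN <= GID <= MAX (bounds assumed inclusive).
gid_in_range() {
  local gid="$1" min="$2" max="$3"
  [ "$gid" -ge "$min" ] && [ "$gid" -le "$max" ]
}
```

For the values in this section, gid_in_range 2000 2000 3000 succeeds, so an fsGroup of 2000 fits the range.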

Additional Allowed APIs

Use the additional allowed APIs (additionalAllowedAPIs) to allow a predefined set of APIs and optionally a broader set of user-specified APIs. Configure this list with the spec.elasticsearch.additionalAllowedAPIs field in the AutomationBase CR.

Note: Allowing more APIs is done at your own risk.

Follow this format: GET:[api, api_two],POST:[api_three]...

Add the method in capital letters followed by a colon (:).
Then, surrounded by brackets, [], add the list of API names.
Separate the list items with commas.
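
To avoid formatting mistakes, the value can be assembled by a small script. The following sketch builds a string in the format described by these rules. The helper names format_method_entry and join_entries are hypothetical. The API names in the usage example are taken from the reference table in this section.

```shell
# format_method_entry METHOD API...
# Hypothetical helper: prints one entry such as GET:[api, api_two].
format_method_entry() {
  local method="$1"; shift
  local list="" api
  for api in "$@"; do
    if [ -z "$list" ]; then list="$api"; else list="$list, $api"; fi
  done
  # The method is written in capital letters, followed by a colon.
  printf '%s:[%s]' "$(printf '%s' "$method" | tr '[:lower:]' '[:upper:]')" "$list"
}

# join_entries ENTRY...
# Joins the per-method entries with commas.
join_entries() {
  local IFS=','
  printf '%s' "$*"
}
```

For example, join_entries "$(format_method_entry get cat_indices_action cat_nodes_action)" "$(format_method_entry post reindex_action)" prints GET:[cat_indices_action, cat_nodes_action],POST:[reindex_action].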

For an example, see the default allowlist:

"GET:[main_action, cat_health_action, nodes_stats_action, get_snapshots_action, snapshot_status_action, recovery_action, cat_recovery_action, get_indices_action, search_action],HEAD:[main_action, get_indices_action],POST:[rest_handler_security_user_add, put_repository_action, restore_snapshot_action, create_snapshot_action, document_create_action_auto_id, document_index_action, document_create_action, search_action, bulk_action],DELETE:[rest_handler_security_user_delete, delete_repository_action, delete_index_action],PUT:[rest_handler_security_user_update, cluster_update_settings_action, put_repository_action, create_index_action, document_index_action, document_create_action]";

To open all APIs to connect OSS Kibana or other third-party applications, use the following wildcard approach:

"GET:[*],PUT:[*],POST:[*],HEAD:[*],DELETE:[*]"

See the following table for the default allowed APIs and their corresponding Elasticsearch documentation names:

Method	Allowed API syntax	API
GET	main_action	info
	cat_health_action	cat.health
	nodes_stats_action	nodes.stats
	get_snapshots_action	snapshot.get
	snapshot_status_action	snapshot.status
	recovery_action	indices.recovery
	cat_recovery_action	cat.recovery
	get_indices_action	indices.get
	search_action	search
	cluster_health_action	cluster.health
HEAD	main_action	ping
	get_indices_action	indices.exists
POST	rest_handler_security_user_add	security.users.create
	put_repository_action	snapshot.create_repository
	restore_snapshot_action	snapshot.restore
	create_snapshot_action	snapshot.create
	document_create_action_auto_id	index
	document_index_action	index
	document_create_action	create
	search_action	search
	bulk_action	bulk
DELETE	rest_handler_security_user_delete	security.users.delete
	delete_repository_action	snapshot.delete_repository
	delete_index_action	indices.delete
PUT	rest_handler_security_user_update	rest_handler_security_user_update
	cluster_update_settings_action	cluster.put_settings
	put_repository_action	snapshot.create_repository
	create_index_action	indices.create
	document_index_action	index
	document_create_action	create

Note: APIs that are deprecated are blocked by the allowlist and cannot be enabled.

Reference table for available APIs to add to the allowlist

Method	Allowed API syntax	API	Paths
DELETE	clear_scroll_action	clear_scroll	[/_search/scroll, /_search/scroll/{scroll_id}]
DELETE	clear_voting_config_exclusions_action	cluster.delete_voting_config_exclusions	[/_cluster/voting_config_exclusions]
DELETE	delete_component_template_action	cluster.delete_component_template	[/_component_template/{name}]
DELETE	delete_composable_index_template_action	indices.delete_index_template	[/_index_template/{name}]
DELETE	delete_index_action	indices.delete	[/, /{index}]
DELETE	delete_index_template_action	indices.delete_template	[/_template/{name}]
DELETE	delete_repository_action	snapshot.delete_repository	[/_snapshot/{repository}]
DELETE	delete_snapshot_action	snapshot.delete	[/_snapshot/{repository}/{snapshot}]
DELETE	delete_stored_script_action	delete_script	[/_scripts/{id}]
DELETE	document_delete_action	delete	[/{index}/_doc/{id}, /{index}/{type}/{id}]
DELETE	index_delete_aliases_action	indices.delete_alias	[/{index}/_alias/{name}]
DELETE	ingest_delete_pipeline_action	ingest.delete_pipeline	[/_ingest/pipeline/{id}]
DELETE	rest_handler_security_user_delete	security.users.delete	[/_security/users/{username}]
GET	_scripts_painless_execute	scripts_painless_execute	[/_scripts/painless/_execute]
GET	analyze_action	indices.analyze	[/_analyze, /{index}/_analyze]
GET	cat_action	cat.help	[/_cat]
GET	cat_alias_action	cat.aliases	[/_cat/aliases, /_cat/aliases/{alias}]
GET	cat_allocation_action	cat.allocation	[/_cat/allocation, /_cat/allocation/{nodes}]
GET	cat_count_action	cat.count	[/_cat/count, /_cat/count/{index}]
GET	cat_fielddata_action	cat.fielddata	[/_cat/fielddata, /_cat/fielddata/{fields}]
GET	cat_health_action	cat.health	[/_cat/health]
GET	cat_indices_action	cat.indices	[/_cat/indices, /_cat/indices/{index}]
GET	cat_master_action	cat.master	[/_cat/master]
GET	cat_node_attrs_action	cat.nodeattrs	[/_cat/nodeattrs]
GET	cat_nodes_action	cat.nodes	[/_cat/nodes]
GET	cat_pending_cluster_tasks_action	cat.pending_tasks	[/_cat/pending_tasks]
GET	cat_plugins_action	cat.plugins	[/_cat/plugins]
GET	cat_recovery_action	cat.recovery	[/_cat/recovery, /_cat/recovery/{index}]
GET	cat_repositories_action	cat.repositories	[/_cat/repositories]
GET	cat_segments_action	cat.segments	[/_cat/segments, /_cat/segments/{index}]
GET	cat_shards_action	cat.shards	[/_cat/shards, /_cat/shards/{index}]
GET	cat_snapshot_action	cat.snapshots	[/_cat/snapshots, /_cat/snapshots/{repository}]
GET	cat_tasks_action	cat.tasks	[/_cat/tasks]
GET	cat_templates_action	cat.templates	[/_cat/templates, /_cat/templates/{name}]
GET	cat_threadpool_action	cat.thread_pool	[/_cat/thread_pool, /_cat/thread_pool/{thread_pool_patterns}]
GET	cluster_allocation_explain_action	cluster.allocation_explain	[/_cluster/allocation/explain]
GET	cluster_get_settings_action	cluster.get_settings	[/_cluster/settings]
GET	cluster_health_action	cluster.health	[/_cluster/health, /_cluster/health/{index}]
GET	cluster_search_shards_action	search_shards	[/_search_shards, /{index}/_search_shards]
GET	cluster_state_action	cluster.state	[/_cluster/state, /_cluster/state/{metric}, /_cluster/state/{metric}/{indices}]
GET	cluster_stats_action	cluster.stats	[/_cluster/stats, /_cluster/stats/nodes/{nodeId}]
GET	count_action	count	[/_count, /{index}/_count, /{index}/{type}/_count]
GET	document_get_action	get	[/{index}/_doc/{id}, /{index}/{type}/{id}]
GET	document_get_source_action	get_source	[/{index}/_source/{id}, /{index}/{type}/{id}/_source]
GET	document_mget_action	mget	[/_mget, /{index}/_mget, /{index}/{type}/_mget]
GET	document_multi_term_vectors_action	mtermvectors	[/_mtermvectors, /{index}/_mtermvectors, /{index}/{type}/_mtermvectors]
GET	document_term_vectors_action	termvectors	[/{index}/_termvectors, /{index}/_termvectors/{id}, /{index}/{type}/_termvectors, /{index}/{type}/{id}/_termvectors]
GET	explain_action	explain	[/{index}/_explain/{id}, /{index}/{type}/{id}/_explain]
GET	field_capabilities_action	field_caps	[/_field_caps, /{index}/_field_caps]
GET	flush_action	indices.flush	[/_flush, /{index}/_flush]
GET	get_aliases_action	indices.get_alias	[/_alias, /{index}/_alias]
GET	get_component_template_action	cluster.get_component_template	[/_component_template, /_component_template/{name}]
GET	get_composable_index_template_action	indices.get_index_template	[/_index_template, /_index_template/{name}]
GET	get_field_mapping_action	indices.get_field_mapping	[/_mapping/field/{fields}, /_mapping/{type}/field/{fields}, /{index}/_mapping/field/{fields}, /{index}/{type}/_mapping/field/{fields}, /{index}/_mapping/{type}/field/{fields}]
GET	get_index_template_action	indices.get_template	[/_template, /_template/{name}]
GET	get_indices_action	indices.get	[/{index}]
GET	get_mapping_action	indices.get_mapping	[/_mapping, /{index}/{type}/_mapping, /{index}/_mapping, /{index}/_mappings, /{index}/_mappings/{type}, /{index}/_mapping/{type}, /{index}/_mapping/{type}, /_mapping/{type}]
GET	get_repositories_action	snapshot.get_repository	[/_snapshot, /_snapshot/{repository}]
GET	get_settings_action	indices.get_settings	[/_settings, /_settings/{name}, /{index}/_settings, /{index}/_settings/{name}]
GET	get_snapshots_action	snapshot.get	[/_snapshot/{repository}/{snapshot}]
GET	get_stored_scripts_action	get_script	[/_scripts/{id}]
GET	get_task_action	tasks.get	[/_tasks/{task_id}]
GET	indices_segments_action	indices.segments	[/_segments, /{index}/_segments]
GET	indices_shard_stores_action	indices.shard_stores	[/_shard_stores, /{index}/_shard_stores]
GET	indices_stats_action	indices.stats	[/_stats, /_stats/{metric}, /{index}/_stats, /{index}/_stats/{metric}]
GET	ingest_get_pipeline_action	ingest.get_pipeline	[/_ingest/pipeline, /_ingest/pipeline/{id}]
GET	ingest_processor_grok_get	ingest.processor_grok	[/_ingest/processor/grok]
GET	ingest_simulate_pipeline_action	ingest.simulate	[/_ingest/pipeline/{id}/_simulate, /_ingest/pipeline/{id}/_simulate, /_ingest/pipeline/_simulate]
GET	list_tasks_action	tasks.list	[/_tasks]
GET	main_action	info	[/]
GET	msearch_action	msearch	[/_msearch, /{index}/_msearch, /{index}/{type}/_msearch]
GET	multi_search_template_action	msearch_template	[/_msearch/template, /{index}/_msearch/template, /{index}/{type}/_msearch/template]
GET	nodes_hot_threads_action	nodes.hot_threads	[/_nodes/hot_threads, /_nodes/{nodeId}/hot_threads]
GET	nodes_info_action	nodes.info	[/_nodes, /_nodes/{nodeId}, /_nodes/{nodeId}/{metrics}, /_nodes/{nodeId}/info/{metrics}]
GET	nodes_stats_action	nodes.stats	[/_nodes/stats, /_nodes/{nodeId}/stats, /_nodes/stats/{metric}, /_nodes/{nodeId}/stats/{metric}, /_nodes/stats/{metric}/{index_metric}, /_nodes/{nodeId}/stats/{metric}/{index_metric}]
GET	nodes_usage_action	nodes.usage	[/_nodes/usage, /_nodes/{nodeId}/usage, /_nodes/usage/{metric}, /_nodes/{nodeId}/usage/{metric}]
GET	pending_cluster_tasks_action	cluster.pending_tasks	[/_cluster/pending_tasks]
GET	rank_eval_action	rank_eval	[/_rank_eval, /{index}/_rank_eval]
GET	recovery_action	indices.recovery	[/_recovery, /{index}/_recovery]
GET	refresh_action	indices.refresh	[/_refresh, /{index}/_refresh]
GET	remote_cluster_info_action	cluster.remote_info	[/_remote/info]
GET	render_search_template_action	render_search_template	[/_render/template, /_render/template/{id}]
GET	script_context_action	get_script_context	[/_script_context]
GET	script_language_action	get_script_languages	[/_script_language]
GET	search_action	search	[/_search, /{index}/_search, /{index}/{type}/_search]
GET	search_scroll_action	scroll	[/_search/scroll, /_search/scroll/{scroll_id}]
GET	search_template_action	search_template	[/_search/template, /{index}/_search/template, /{index}/{type}/_search/template]
GET	snapshot_status_action	snapshot.status	[/_snapshot/{repository}/{snapshot}/_status, /_snapshot/{repository}/_status, /_snapshot/_status]
GET	synced_flush_action	indices.flush_synced	[/_flush/synced, /{index}/_flush/synced]
GET	upgrade_status_action	indices.get_upgrade	[/_upgrade, /{index}/_upgrade]
GET	validate_query_action	indices.validate_query	[/_validate/query, /{index}/_validate/query, /{index}/{type}/_validate/query]
HEAD	document_get_action	exists	[/{index}/_doc/{id}, /{index}/{type}/{id}]
HEAD	document_get_source_action	exists_source	[/{index}/_source/{id}, /{index}/{type}/{id}/_source]
HEAD	get_aliases_action	indices.exists_alias	[/_alias, /{index}/_alias]
HEAD	get_component_template_action	cluster.exists_component_template	[/_component_template, /_component_template/{name}]
HEAD	get_composable_index_template_action	indices.exists_index_template	[/_index_template, /_index_template/{name}]
HEAD	get_index_template_action	indices.exists_template	[/_template, /_template/{name}]
HEAD	get_indices_action	indices.exists	[/{index}]
HEAD	get_mapping_action	indices.exists_type	[/_mapping, /{index}/{type}/_mapping, /{index}/_mapping, /{index}/_mappings, /{index}/_mappings/{type}, /{index}/_mapping/{type}, /{index}/_mapping/{type}, /_mapping/{type}]
HEAD	main_action	ping	[/]
POST	_scripts_painless_execute	scripts_painless_execute	[/_scripts/painless/_execute]
POST	add_voting_config_exclusions_action	cluster.post_voting_config_exclusions	[/_cluster/voting_config_exclusions/{node_name}, /_cluster/voting_config_exclusions]
POST	analyze_action	indices.analyze	[/_analyze, /{index}/_analyze]
POST	bulk_action	bulk	[/_bulk, /{index}/_bulk, /{index}/{type}/_bulk]
POST	cancel_tasks_action	tasks.cancel	[/_tasks/_cancel, /_tasks/{task_id}/_cancel]
POST	cleanup_repository_action	snapshot.cleanup_repository	[/_snapshot/{repository}/_cleanup]
POST	clear_indices_cache_action	indices.clear_cache	[/_cache/clear, /{index}/_cache/clear]
POST	clone_index_action	indices.clone	[/{index}/_clone/{target}]
POST	close_index_action	indices.close	[/_close, /{index}/_close]
POST	cluster_allocation_explain_action	cluster.allocation_explain	[/_cluster/allocation/explain]
POST	cluster_reroute_action	cluster.reroute	[/_cluster/reroute]
POST	cluster_search_shards_action	search_shards	[/_search_shards, /{index}/_search_shards]
POST	count_action	count	[/_count, /{index}/_count, /{index}/{type}/_count]
POST	create_snapshot_action	snapshot.create	[/_snapshot/{repository}/{snapshot}]
POST	delete_by_query_action	delete_by_query	[/{index}/_delete_by_query, /{index}/{type}/_delete_by_query]
POST	document_create_action_auto_id	index	[/{index}/_doc, /{index}/{type}]
POST	document_create_action	create	[/{index}/_create/{id}, /{index}/{type}/{id}/_create]
POST	document_index_action	index	[/{index}/_doc/{id}, /{index}/{type}/{id}]
POST	document_mget_action	mget	[/_mget, /{index}/_mget, /{index}/{type}/_mget]
POST	document_multi_term_vectors_action	mtermvectors	[/_mtermvectors, /{index}/_mtermvectors, /{index}/{type}/_mtermvectors]
POST	document_term_vectors_action	termvectors	[/{index}/_termvectors, /{index}/_termvectors/{id}, /{index}/{type}/_termvectors, /{index}/{type}/{id}/_termvectors]
POST	document_update_action	update	[/{index}/_update/{id}, /{index}/{type}/{id}/_update]
POST	explain_action	explain	[/{index}/_explain/{id}, /{index}/{type}/{id}/_explain]
POST	field_capabilities_action	field_caps	[/_field_caps, /{index}/_field_caps]
POST	flush_action	indices.flush	[/_flush, /{index}/_flush]
POST	force_merge_action	indices.forcemerge	[/_forcemerge, /{index}/_forcemerge]
POST	index_put_alias_action	indices.put_alias	[/{index}/_alias/{name}, /_alias/{name}, /_aliases/{name}, /{index}/_alias, /{index}/_aliases, /_alias]
POST	indices_aliases_action	indices.update_aliases	[/_aliases]
POST	ingest_simulate_pipeline_action	ingest.simulate	[/_ingest/pipeline/{id}/_simulate, /_ingest/pipeline/{id}/_simulate, /_ingest/pipeline/_simulate]
POST	msearch_action	msearch	[/_msearch, /{index}/_msearch, /{index}/{type}/_msearch]
POST	multi_search_template_action	msearch_template	[/_msearch/template, /{index}/_msearch/template, /{index}/{type}/_msearch/template]
POST	nodes_reload_action	nodes.reload_secure_settings	[/_nodes/reload_secure_settings, /_nodes/{nodeId}/reload_secure_settings]
POST	open_index_action	indices.open	[/_open, /{index}/_open]
POST	put_component_template_action	cluster.put_component_template	[/_component_template/{name}]
POST	put_composable_index_template_action	indices.put_index_template	[/_index_template/{name}]
POST	put_index_template_action	indices.put_template	[/_template/{name}]
POST	put_mapping_action	indices.put_mapping	[/{index}/_mapping/, /{index}/{type}/_mapping, /{index}/_mapping/{type}, /_mapping/{type}, /{index}/_mappings/, /{index}/{type}/_mappings, /{index}/_mappings/{type}, /_mappings/{type}]
POST	put_repository_action	snapshot.create_repository	[/_snapshot/{repository}]
POST	put_stored_script_action	put_script	[/_scripts/{id}, /_scripts/{id}/{context}]
POST	rank_eval_action	rank_eval	[/_rank_eval, /{index}/_rank_eval]
POST	refresh_action	indices.refresh	[/_refresh, /{index}/_refresh]
POST	reindex_action	reindex	[/_reindex]
POST	render_search_template_action	render_search_template	[/_render/template, /_render/template/{id}]
POST	rest_handler_security_user_add	security.users.create	[/_security/users]
POST	restore_snapshot_action	snapshot.restore	[/_snapshot/{repository}/{snapshot}/_restore]
POST	rethrottle_action	delete_by_query_rethrottle	[/_update_by_query/{taskId}/_rethrottle, /_delete_by_query/{taskId}/_rethrottle, /_reindex/{taskId}/_rethrottle]
POST	rethrottle_action	reindex_rethrottle	[/_update_by_query/{taskId}/_rethrottle, /_delete_by_query/{taskId}/_rethrottle, /_reindex/{taskId}/_rethrottle]
POST	rethrottle_action	update_by_query_rethrottle	[/_update_by_query/{taskId}/_rethrottle, /_delete_by_query/{taskId}/_rethrottle, /_reindex/{taskId}/_rethrottle]
POST	rollover_index_action	indices.rollover	[/{index}/_rollover, /{index}/_rollover/{new_index}]
POST	search_action	search	[/_search, /_search, /{index}/_search, /{index}/{type}/_search]
POST	search_scroll_action	scroll	[/_search/scroll, /_search/scroll/{scroll_id}]
POST	search_template_action	search_template	[/_search/template, /{index}/_search/template, /{index}/{type}/_search/template]
POST	shrink_index_action	indices.shrink	[/{index}/_shrink/{target}]
POST	simulate_index_template_action	indices.simulate_index_template	[/_index_template/_simulate_index/{name}]
POST	split_index_action	indices.split	[/{index}/_split/{target}]
POST	synced_flush_action	indices.flush_synced	[/_flush/synced, /{index}/_flush/synced]
POST	update_by_query_action	update_by_query	[/{index}/_update_by_query, /{index}/{type}/_update_by_query]
POST	upgrade_action	indices.upgrade	[/_upgrade, /{index}/_upgrade]
POST	validate_query_action	indices.validate_query	[/_validate/query, /{index}/_validate/query, /{index}/{type}/_validate/query]
POST	verify_repository_action	snapshot.verify_repository	[/_snapshot/{repository}/_verify]
PUT	bulk_action	bulk	[/_bulk, /{index}/_bulk, /{index}/{type}/_bulk]
PUT	clone_index_action	indices.clone	[/{index}/_clone/{target}, /{index}/_clone/{target}]
PUT	cluster_update_settings_action	cluster.put_settings	[/_cluster/settings]
PUT	create_index_action	indices.create	[/{index}]
PUT	create_snapshot_action	snapshot.create	[/_snapshot/{repository}/{snapshot}]
PUT	document_create_action	create	[/{index}/_create/{id}, /{index}/{type}/{id}/_create]
PUT	document_index_action	index	[/{index}/_doc/{id}, /{index}/{type}/{id}]
PUT	index_put_alias_action	indices.put_alias	[/{index}/_alias/{name}, /_alias/{name}, /_aliases/{name}, /{index}/_alias, /{index}/_aliases, /_alias]
PUT	ingest_put_pipeline_action	ingest.put_pipeline	[/_ingest/pipeline/{id}]
PUT	put_component_template_action	cluster.put_component_template	[/_component_template/{name}]
PUT	put_composable_index_template_action	indices.put_index_template	[/_index_template/{name}]
PUT	put_index_template_action	indices.put_template	[/_template/{name}]
PUT	put_mapping_action	indices.put_mapping	[/{index}/_mapping/, /{index}/{type}/_mapping, /{index}/_mapping/{type}, /_mapping/{type}, /{index}/_mappings/, /{index}/{type}/_mappings, /{index}/_mappings/{type}, /_mappings/{type}]
PUT	put_repository_action	snapshot.create_repository	[/_snapshot/{repository}]
PUT	put_stored_script_action	put_script	[/_scripts/{id}, /_scripts/{id}/{context}]
PUT	rest_handler_security_user_update	security.users.modify	[/_security/users/{username}]
PUT	shrink_index_action	indices.shrink	[/{index}/_shrink/{target}]
PUT	split_index_action	indices.split	[/{index}/_split/{target}]
PUT	update_settings_action	indices.put_settings	[/{index}/_settings, /_settings]

Backup and restore

Take a snapshot of a running Elasticsearch cluster to back it up.

Snapshot storage requires a ReadWriteMany (RWX) access mode that specifies that the volume can be mounted as read/write by many nodes.

To use the Elasticsearch cluster snapshot storage, declare it in the elasticsearch section of the AutomationBase custom resource by defining a snapshotStores element, as shown in the following example. This example creates a 10 GiB volume that is mounted at /usr/share/elasticsearch/snapshots/main by using the csi-cephfs StorageClass:

spec:
  elasticsearch:
    snapshotStores:
      - name: main
        storage:
          class: "csi-cephfs"
          size: "10Gi"

The following example defines spec.elasticsearch.snapshotStores.storage by using a volumeClaimTemplate:

spec:
  elasticsearch:
    license:
      accept: true
    version: "v1.0"
    snapshotStores:
      - name: main
        storage:
          volumeClaimTemplate:
            spec:
              accessModes:
                - ReadWriteMany
              volumeMode: Filesystem
              resources:
                requests:
                  storage: 50Gi
              storageClassName: "myStorageClass"
              selector:
                matchLabels:
                  es-storage-type: data

Note: Only the spec section of the PersistentVolumeClaim that is provided in the volumeClaimTemplate field is used when the PersistentVolumeClaim resource is created. Any metadata entries that are provided under volumeClaimTemplate are not used.

The definition of a snapshot store allocates storage for a snapshot repository that is mounted at a path by using the provided name in the following form: /usr/share/elasticsearch/snapshots/<name>.
Define one or more snapshot stores within the snapshotStores array. This definition does not create the Elasticsearch snapshot repositories; it creates only the volumes for them.

When you use a shared file system to store snapshots, add the file system path or its parent directory to the path.repo setting in the elasticsearch.yml file for each master and data node. Set these values in the Elasticsearch custom resource by defining a config element, as shown in the following example:

spec:
  elasticsearch:
    snapshotStores:
      - name: main
        storage:
          class: "csi-cephfs"
          size: "10Gi"
    nodegroupspecs:
      - name: data
        replicas: 3
        config:
        - key: path.repo
          value: "/usr/share/elasticsearch/snapshots/main"

If you configure more than one snapshot store, enter the file system paths with a single path.repo config element as shown in the following example:

spec:
  elasticsearch:
    nodegroupspecs:
      - name: data
        replicas: 3
        config:
        - key: path.repo
          value: '["/usr/share/elasticsearch/snapshots/main", "/usr/share/elasticsearch/snapshots/temporary"]'

Then, create the snapshot repositories with the Elasticsearch REST API. The REST API references the mount path of the storage.

For more information, see Snapshot and restore.
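
As a rough sketch of that step, the following shell functions wrap the snapshot endpoints that appear in the REST table earlier on this page (PUT /_snapshot/{repository}, PUT /_snapshot/{repository}/{snapshot}, and POST /_snapshot/{repository}/{snapshot}/_restore). The function names and the ES_URL, ES_USER, and ES_PASS variables are assumptions made for illustration and are not part of the product. The location that you pass must match a path that is listed in path.repo.

```shell
# Hypothetical connection details; replace them with your own values.
ES_URL="${ES_URL:-http://<your_elasticsearch_host>}"
ES_USER="${ES_USER:-<USERNAME>}"
ES_PASS="${ES_PASS:-<PASSWORD>}"

# Build the request body for a shared file system ("fs") repository.
# $1 is the mount path of the snapshot store, which must be listed in path.repo.
snapshot_repo_body() {
  printf '{"type": "fs", "settings": {"location": "%s"}}' "$1"
}

# Register a repository: $1 = repository name, $2 = mount path.
register_snapshot_repo() {
  curl -X PUT -u "${ES_USER}:${ES_PASS}" "${ES_URL}/_snapshot/$1" \
    -H 'Content-Type: application/json' \
    -d "$(snapshot_repo_body "$2")"
}

# Take a snapshot and wait for it to finish: $1 = repository, $2 = snapshot name.
create_snapshot() {
  curl -X PUT -u "${ES_USER}:${ES_PASS}" \
    "${ES_URL}/_snapshot/$1/$2?wait_for_completion=true"
}

# Restore a snapshot: $1 = repository, $2 = snapshot name.
restore_snapshot() {
  curl -X POST -u "${ES_USER}:${ES_PASS}" "${ES_URL}/_snapshot/$1/$2/_restore"
}
```

For example, running register_snapshot_repo main /usr/share/elasticsearch/snapshots/main and then create_snapshot main snapshot-1 would register a repository on the main snapshot store and take a first snapshot. Elasticsearch restores an index that already exists only when that index is closed, so a restore into a running cluster usually needs the affected indices to be closed or deleted first.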

Specify the following child elements as part of the snapshotStores array:

Element	Description
name	Name of the snapshot store, which is also used in the mount path
storage	A storage object

The child elements that you can specify as part of the storage object are the same elements that you can define in the Storage section.

Monitoring

By default, monitoring capabilities for Elasticsearch are disabled. To enable them, deploy your Elasticsearch custom resource as shown in the following example:

spec:
  elasticsearch:
    monitoring: {}

This feature deploys resources necessary to export the Elasticsearch metrics from the running instance to a Prometheus-friendly format and exposes an endpoint with these metrics. It also deploys a resource to gather these metrics into OpenShift Monitoring.

To enable the necessary monitoring in OpenShift as of version 4.6, follow the instructions in Enabling monitoring for user-defined projects to set up the correct ConfigMap configurations.

After monitoring is enabled, go to Monitoring > Metrics to view the extracted metrics.

The following metrics for Elasticsearch can be queried.

Name	Type	Cardinality	Help
elasticsearch_breakers_estimated_size_bytes	gauge	4	Estimated size in bytes of breaker
elasticsearch_breakers_limit_size_bytes	gauge	4	Limit size in bytes for breaker
elasticsearch_breakers_tripped	counter	4	tripped for breaker
elasticsearch_cluster_health_active_primary_shards	gauge	1	The number of primary shards in your cluster. This is an aggregate total across all indices.
elasticsearch_cluster_health_active_shards	gauge	1	Aggregate total of all shards across all indices, which includes replica shards.
elasticsearch_cluster_health_delayed_unassigned_shards	gauge	1	Shards delayed to reduce reallocation overhead
elasticsearch_cluster_health_initializing_shards	gauge	1	Count of shards that are being freshly created.
elasticsearch_cluster_health_number_of_data_nodes	gauge	1	Number of data nodes in the cluster.
elasticsearch_cluster_health_number_of_in_flight_fetch	gauge	1	The number of ongoing shard info requests.
elasticsearch_cluster_health_number_of_nodes	gauge	1	Number of nodes in the cluster.
elasticsearch_cluster_health_number_of_pending_tasks	gauge	1	Cluster level changes, which have not yet been run
elasticsearch_cluster_health_task_max_waiting_in_queue_millis	gauge	1	Max time in millis that a task is waiting in queue.
elasticsearch_cluster_health_relocating_shards	gauge	1	The number of shards that are currently moving from one node to another node.
elasticsearch_cluster_health_status	gauge	3	Whether all primary and replica shards are allocated.
elasticsearch_cluster_health_timed_out	gauge	1	Number of cluster health checks timed out
elasticsearch_cluster_health_unassigned_shards	gauge	1	The number of shards that exist in the cluster state, but cannot be found in the cluster itself.
elasticsearch_filesystem_data_available_bytes	gauge	1	Available space on block device in bytes
elasticsearch_filesystem_data_free_bytes	gauge	1	Free space on block device in bytes
elasticsearch_filesystem_data_size_bytes	gauge	1	Size of block device in bytes
elasticsearch_filesystem_io_stats_device_operations_count	gauge	1	Count of disk operations
elasticsearch_filesystem_io_stats_device_read_operations_count	gauge	1	Count of disk read operations
elasticsearch_filesystem_io_stats_device_write_operations_count	gauge	1	Count of disk write operations
elasticsearch_filesystem_io_stats_device_read_size_kilobytes_sum	gauge	1	Total kilobytes read from disk
elasticsearch_filesystem_io_stats_device_write_size_kilobytes_sum	gauge	1	Total kilobytes written to disk
elasticsearch_indices_docs	gauge	1	Count of documents on this node
elasticsearch_indices_docs_deleted	gauge	1	Count of deleted documents on this node
elasticsearch_indices_docs_primary	gauge		Count of documents with only primary shards on all nodes
elasticsearch_indices_fielddata_evictions	counter	1	Evictions from field data
elasticsearch_indices_fielddata_memory_size_bytes	gauge	1	Field data cache memory usage in bytes
elasticsearch_indices_filter_cache_evictions	counter	1	Evictions from filter cache
elasticsearch_indices_filter_cache_memory_size_bytes	gauge	1	Filter cache memory usage in bytes
elasticsearch_indices_flush_time_seconds	counter	1	Cumulative flush time in seconds
elasticsearch_indices_flush_total	counter	1	Total flushes
elasticsearch_indices_get_exists_time_seconds	counter	1	Total time get exists in seconds
elasticsearch_indices_get_exists_total	counter	1	Total get exists operations
elasticsearch_indices_get_missing_time_seconds	counter	1	Total time of get missing in seconds
elasticsearch_indices_get_missing_total	counter	1	Total get missing
elasticsearch_indices_get_time_seconds	counter	1	Total get time in seconds
elasticsearch_indices_get_total	counter	1	Total get
elasticsearch_indices_indexing_delete_time_seconds_total	counter	1	Total time indexing delete in seconds
elasticsearch_indices_indexing_delete_total	counter	1	Total indexing deletes
elasticsearch_indices_indexing_index_time_seconds_total	counter	1	Cumulative index time in seconds
elasticsearch_indices_indexing_index_total	counter	1	Total index calls
elasticsearch_indices_merges_docs_total	counter	1	Cumulative docs merged
elasticsearch_indices_merges_total	counter	1	Total merges
elasticsearch_indices_merges_total_size_bytes_total	counter	1	Total merge size in bytes
elasticsearch_indices_merges_total_time_seconds_total	counter	1	Total time spent merging in seconds
elasticsearch_indices_query_cache_cache_total	counter	1	Count of query cache
elasticsearch_indices_query_cache_cache_size	gauge	1	Size of query cache
elasticsearch_indices_query_cache_count	counter	2	Count of query cache hit/miss
elasticsearch_indices_query_cache_evictions	counter	1	Evictions from query cache
elasticsearch_indices_query_cache_memory_size_bytes	gauge	1	Query cache memory usage in bytes
elasticsearch_indices_query_cache_total	counter	1	Size of query cache total
elasticsearch_indices_refresh_time_seconds_total	counter	1	Total time spent refreshing in seconds
elasticsearch_indices_refresh_total	counter	1	Total refreshes
elasticsearch_indices_request_cache_count	counter	2	Count of request cache hit/miss
elasticsearch_indices_request_cache_evictions	counter	1	Evictions from request cache
elasticsearch_indices_request_cache_memory_size_bytes	gauge	1	Request cache memory usage in bytes
elasticsearch_indices_search_fetch_time_seconds	counter	1	Total search fetch time in seconds
elasticsearch_indices_search_fetch_total	counter	1	Total number of fetches
elasticsearch_indices_search_query_time_seconds	counter	1	Total search query time in seconds
elasticsearch_indices_search_query_total	counter	1	Total number of queries
elasticsearch_indices_segments_count	gauge	1	Count of index segments on this node
elasticsearch_indices_segments_memory_bytes	gauge	1	Current memory size of segments in bytes
elasticsearch_indices_settings_stats_read_only_indices	gauge	1	Count of indices that have read_only_allow_delete=true
elasticsearch_indices_shards_docs	gauge	3	Count of documents on this shard
elasticsearch_indices_shards_docs_deleted	gauge	3	Count of deleted documents on each shard
elasticsearch_indices_store_size_bytes	gauge	1	Current size of stored index data in bytes
elasticsearch_indices_store_size_bytes_primary	gauge		Current size of stored index data in bytes with only primary shards on all nodes
elasticsearch_indices_store_size_bytes_total	gauge		Current size of stored index data in bytes with all shards on all nodes
elasticsearch_indices_store_throttle_time_seconds_total	counter	1	Throttle time for index store in seconds
elasticsearch_indices_translog_operations	counter	1	Total translog operations
elasticsearch_indices_translog_size_in_bytes	counter	1	Total translog size in bytes
elasticsearch_indices_warmer_time_seconds_total	counter	1	Total warmer time in seconds
elasticsearch_indices_warmer_total	counter	1	Total warmer count
elasticsearch_jvm_gc_collection_seconds_count	counter	2	Count of JVM GC runs
elasticsearch_jvm_gc_collection_seconds_sum	counter	2	GC run time in seconds
elasticsearch_jvm_memory_committed_bytes	gauge	2	JVM memory currently committed by area
elasticsearch_jvm_memory_max_bytes	gauge	1	JVM memory max
elasticsearch_jvm_memory_used_bytes	gauge	2	JVM memory currently used by area
elasticsearch_jvm_memory_pool_used_bytes	gauge	3	JVM memory currently used by pool
elasticsearch_jvm_memory_pool_max_bytes	counter	3	JVM memory max by pool
elasticsearch_jvm_memory_pool_peak_used_bytes	counter	3	JVM memory peak used by pool
elasticsearch_jvm_memory_pool_peak_max_bytes	counter	3	JVM memory peak max by pool
elasticsearch_os_cpu_percent	gauge	1	Percent CPU used by the OS
elasticsearch_os_load1	gauge	1	Short term load average
elasticsearch_os_load5	gauge	1	Midterm load average
elasticsearch_os_load15	gauge	1	Long term load average
elasticsearch_process_cpu_percent	gauge	1	Percent CPU used by process
elasticsearch_process_cpu_time_seconds_sum	counter	3	Process CPU time in seconds
elasticsearch_process_mem_resident_size_bytes	gauge	1	Resident memory in use by process in bytes
elasticsearch_process_mem_share_size_bytes	gauge	1	Shared memory in use by process in bytes
elasticsearch_process_mem_virtual_size_bytes	gauge	1	Total virtual memory used in bytes
elasticsearch_process_open_files_count	gauge	1	Open file descriptors
elasticsearch_snapshot_stats_number_of_snapshots	gauge	1	Total number of snapshots
elasticsearch_snapshot_stats_oldest_snapshot_timestamp	gauge	1	Oldest snapshot timestamp
elasticsearch_snapshot_stats_snapshot_start_time_timestamp	gauge	1	Last snapshot start timestamp
elasticsearch_snapshot_stats_snapshot_end_time_timestamp	gauge	1	Last snapshot end timestamp
elasticsearch_snapshot_stats_snapshot_number_of_failures	gauge	1	Last snapshot number of failures
elasticsearch_snapshot_stats_snapshot_number_of_indices	gauge	1	Last snapshot number of indices
elasticsearch_snapshot_stats_snapshot_failed_shards	gauge	1	Last snapshot failed shards
elasticsearch_snapshot_stats_snapshot_successful_shards	gauge	1	Last snapshot successful shards
elasticsearch_snapshot_stats_snapshot_total_shards	gauge	1	Last snapshot total shard
elasticsearch_thread_pool_active_count	gauge	14	Thread Pool threads active
elasticsearch_thread_pool_completed_count	counter	14	Thread Pool operations completed
elasticsearch_thread_pool_largest_count	gauge	14	Thread Pool largest threads count
elasticsearch_thread_pool_queue_count	gauge	14	Thread Pool operations queued
elasticsearch_thread_pool_rejected_count	counter	14	Thread Pool operations rejected
elasticsearch_thread_pool_threads_count	gauge	14	Thread Pool current threads count
elasticsearch_transport_rx_packets_total	counter	1	Count of packets received
elasticsearch_transport_rx_size_bytes_total	counter	1	Total number of bytes received
elasticsearch_transport_tx_packets_total	counter	1	Count of packets sent
elasticsearch_transport_tx_size_bytes_total	counter	1	Total number of bytes sent
elasticsearch_clusterinfo_last_retrieval_success_ts	gauge	1	Timestamp of the last successful cluster info retrieval
elasticsearch_clusterinfo_up	gauge	1	Up metric for the cluster info collector
elasticsearch_clusterinfo_version_info	gauge	6	Constant metric with ES version information as labels

Audit logging

Enable the custom security plug-in that is provided with the Elasticsearch cluster in IBM Automation foundation to log authorization audit records in Cloud Auditing Data Format (CADF).

The authorization audit logging is disabled by default. To enable it, send a REST request to the Elasticsearch system.

curl -X PUT -u "<USERNAME>:<PASSWORD>" "http://<your_elasticsearch_host>/_cluster/settings" \
-H 'Content-Type: application/json' \
-d'{
  "transient": {
    "logger.com.ibm.elasticsearch.audit": "AUDIT"
  }
}'

To turn off the logging for authorization, send a REST request to the Elasticsearch system.

curl -X PUT -u "<USERNAME>:<PASSWORD>" "http://<your_elasticsearch_host>/_cluster/settings" \
-H 'Content-Type: application/json' \
-d'{
  "transient": {
    "logger.com.ibm.elasticsearch.audit": "OFF"
  }
}'
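
The two requests differ only in the logger level, so they can be wrapped in one helper. The sketch below also reads the cluster settings back with GET /_cluster/settings to confirm the change. In Elasticsearch, transient settings do not survive a full cluster restart, whereas persistent settings do. It is an assumption here that the audit logger can be set under the persistent scope in the same way as under transient. The function names and the ES_URL, ES_USER, and ES_PASS variables are hypothetical.

```shell
# Hypothetical connection details; replace them with your own values.
ES_URL="${ES_URL:-http://<your_elasticsearch_host>}"
ES_USER="${ES_USER:-<USERNAME>}"
ES_PASS="${ES_PASS:-<PASSWORD>}"

# Build the settings body.
# $1 = scope (transient or persistent), $2 = level (AUDIT or OFF).
audit_settings_body() {
  printf '{"%s": {"logger.com.ibm.elasticsearch.audit": "%s"}}' "$1" "$2"
}

# Apply the audit logger level: same arguments as audit_settings_body.
set_audit_logging() {
  curl -X PUT -u "${ES_USER}:${ES_PASS}" "${ES_URL}/_cluster/settings" \
    -H 'Content-Type: application/json' \
    -d "$(audit_settings_body "$1" "$2")"
}

# Read back the current cluster settings to confirm the logger level.
show_cluster_settings() {
  curl -X GET -u "${ES_USER}:${ES_PASS}" "${ES_URL}/_cluster/settings?pretty"
}
```

For example, set_audit_logging transient AUDIT followed by show_cluster_settings would enable audit logging and display the resulting settings.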

The audit logs for each container are written to the base logs directory in the <cluster_name>_audit.json file. For example, if the Elasticsearch instance is called elasticsearch-sample, the log file is /usr/share/elasticsearch/storage/logs/elasticsearch-sample-elasticsearch-cluster_audit.json.

See the following example of a CADF authorization audit log message:

{
    "outcome": "success",
    "typeURI": "http://schemas.dmtf.org/cloud/audit/1.0/event",
    "eventType": "activity",
    "eventTime": "2021-01-12T12:33:21.409062Z",
    "action": "authenticate",
    "requestPath": "/customer/_doc/1",
    "id": "elasticsearch:ccb7c058-0c32-443c-a6ee-acab813ad978",
    "severity": "normal",
    "initiator": {
        "id": "elasticsearch:d367b08e-af48-456f-9f3b-4e325c85a098",
        "name": "elasticsearch-admin",
        "typeURI": "service/security/account/user",
        "host": {
            "agent": "PostmanRuntime/7.26.8",
            "address": "/127.0.0.1"
        },
        "credential": {
            "type": "user"
        }
    },
    "target": {
        "id": "elasticsearch:elasticsearch-sample-elasticsearch-es-master-data-0",
        "name": "elasticsearch-sample-elasticsearch-es-master-data-0",
        "typeURI": "service/security/account/user"
    },
    "observer": {
        "name": "ElasticSearchSecurityPlugin",
        "id": "userActivity",
        "typeURI": "service/security/elasticsearch"
    },
    "reason": {
        "reasonCode": 200,
        "reasonType": "OK"
    }
}

JVM options

The JVM in the Elasticsearch container uses the default Elasticsearch JVM settings. By default, Elasticsearch tells the JVM to use a heap with a minimum and maximum size of 1 GB. When you move to production, configure the heap size so that Elasticsearch has enough heap available. You can specify the JVM options by providing an ES_JAVA_OPTS environment variable for each of the containers.

The following example uses the AutomationBase custom resource to set the ES_JAVA_OPTS environment variable, which specifies the Java minimum and maximum heap size for the master-data containers.

spec:
  elasticsearch:
    nodegroupspecs:
      - name: master-data
        replicas: 3
        storage: {}
        template:
          pod:
            spec:
              containers:
                - env:
                    - name: ES_JAVA_OPTS
                      value: '-Xms2g -Xmx2g'
                  name: elasticsearch
                  resources:
                    limits:
                      cpu: 1100m
                      memory: 5Gi
                    requests:
                      cpu: 900m
                      memory: 3Gi

Note: When you set the Xms (minimum heap size) and Xmx (maximum heap size) settings, you must set these to be equal to each other. Also, you must set Xmx and Xms to no more than 50% of your physical RAM.

For more information, see the Setting the heap size documentation.
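
As a small worked example of the 50% rule, the following sketch derives an upper bound for ES_JAVA_OPTS from a memory figure in MiB. It assumes that the container memory limit stands in for physical RAM, which is an interpretation for containerized deployments and not something this page states. The function name is hypothetical.

```shell
# Print ES_JAVA_OPTS with the heap set to half of the given memory.
# $1 is the memory in MiB (for example, 5120 for a 5Gi container limit).
# Xms and Xmx are set to the same value, as the note above requires.
heap_opts_for_limit() {
  heap_mib=$(( $1 / 2 ))
  printf '%s\n' "-Xms${heap_mib}m -Xmx${heap_mib}m"
}

heap_opts_for_limit 5120   # prints: -Xms2560m -Xmx2560m
```

The result is a ceiling and not a target: the example custom resource above uses -Xms2g -Xmx2g with a 5Gi limit, which stays below the bound.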

Elasticsearch on multi-zone clusters

The Elasticsearch that is included in IBM Automation foundation runs on multi-zone clusters and provides resilience against whole-zone failures, but it might not sustain optimal performance. Performance depends on a low-latency, high-bandwidth connection between the data centers that host the nodes.

For more details, refer to the link here.

The cross-cluster replication feature, which is required to implement a performance-optimized multi-zone deployment, is not available in the current version of Elasticsearch.

Support for the latest Elasticsearch version

IBM Automation foundation v1.3.0, with Elasticsearch Operator version 1.3.0, includes a new operand version, 2.0.0, for Elasticsearch CRs.

Using the 2.0.0 operand version (and the corresponding v2 operand channel) deploys the latest Elasticsearch (ELv2 v7.15.1). Set the v2 operand on the elasticsearch element in the AutomationBase CR. For example:

apiVersion: base.automation.ibm.com/v1beta1
kind: AutomationBase
metadata:
  name: iaf-automationbase-instance
  namespace: acme-iaf
spec:
  elasticsearch:
    license:
      accept: true
    monitoring: {}
    nodegroupspecs:
      - name: master-data
        replicas: 3
    tls: {}
    version: v2
  kafka: {}
  license:
    accept: true
  tls: {}
  version: v1

If the v1 channel is used, Elasticsearch v7.8.0 will continue to be used.

Note: After the v2 operand is in use, it is not easy to move back to v1, because the resources that Elasticsearch 7.15.1 creates are not valid on 7.8.0. Before you modify the operand to use v2, it is highly recommended that you back up the Elasticsearch database.