Creating Elasticsearch snapshots to back up and restore data

Elasticsearch provides a snapshot and restore mechanism to back up indices and clusters. Snapshots can be stored either in a cloud-based object storage service, such as AWS S3, Azure Blob Storage, IBM Cloud Object Storage, or Google Cloud Storage, or on a local or shared file system. Choose the storage option that best suits your infrastructure and operational requirements.

Before you begin

  1. Set up a backup in the primary cluster. The backup can be stored in a different cloud storage service or file system.
  2. Register the snapshot repository that is created for the backup in both the primary and secondary clusters.
  3. Ensure that the storage (cloud or file system) that the repository uses is accessible from both clusters.
  4. The versions of Elasticsearch must be the same in both clusters.
  5. The secondary cluster must have enough storage available to accommodate the restored data.
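
As a minimal sketch of how the version prerequisite might be checked, the following Python compares the version.number field that each cluster returns from its root endpoint. The helper names and the example host names are hypothetical, and authentication and TLS settings are omitted; most real clusters need both.

```python
import json
import urllib.request


def get_cluster_info(base_url: str) -> dict:
    """GET the root endpoint of a cluster and decode the JSON response."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/") as resp:
        return json.load(resp)


def versions_match(primary_info: dict, secondary_info: dict) -> bool:
    """Return True when both root responses report the same version.number."""
    return (
        primary_info["version"]["number"]
        == secondary_info["version"]["number"]
    )


# Example use (hypothetical hosts, not contacted until you call this):
#   primary = get_cluster_info("http://primary.example:9200")
#   secondary = get_cluster_info("http://secondary.example:9200")
#   print(versions_match(primary, secondary))
```

The comparison is deliberately strict (exact string equality), because the prerequisite above asks for the same version in both clusters.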

About this task

The following procedure demonstrates how to configure snapshot repositories and perform backup and restore operations.

Procedure

  1. Create a snapshot repository.
    A snapshot repository is an off-cluster storage location for your snapshots. You must register a repository before you can take or restore snapshots. For more information, see Register a snapshot repository.
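
For illustration only, registering a repository amounts to a PUT request to /_snapshot/<repository> with a repository type and its settings. The sketch below builds such a request for a shared file system repository. The repository name product_backups, the mount path, and the host name are assumptions, not values from the product, and authentication is left out.

```python
import json
import urllib.request


def build_register_request(base_url, repo_name, repo_type, settings):
    """Build (but do not send) a PUT /_snapshot/<repo_name> request."""
    body = json.dumps({"type": repo_type, "settings": settings})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_snapshot/{repo_name}",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical shared file system repository. For the "fs" type, the
# location must also be listed under path.repo on every node.
register_req = build_register_request(
    "http://primary.example:9200",
    "product_backups",
    "fs",
    {"location": "/mnt/es_backups"},
)

# To send it: urllib.request.urlopen(register_req)
```

Because the repository must be registered in both clusters, the same request would be built a second time with the base URL of the secondary cluster.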
  2. Create a snapshot backup in the primary cluster.
    A snapshot backup is a point-in-time copy of indices and cluster state that is stored in a remote repository for backup and restore purposes. For more information, see Create a snapshot.
    Important: When you create a snapshot in the primary Elasticsearch cluster, ensure that all product data and configurations are included so that the backup is complete and reliable. Include the following data:
    • All application-generated indices, such as iv-*, siprules_*, cat*, demands-*, reservations-*, supplies-*, item-node-records, otmz-*, and any relevant .ds-* or hidden indices that the product requires.
    • Data streams associated with your product.
    • Index settings and mappings.
    • Index Lifecycle Management (ILM) policies.
    • Index aliases.
    • Index templates and component templates.
    • Ingest pipelines, if used.
    • Snapshot Lifecycle Management (SLM) policies, if applicable.
    • Cluster settings, which are included when you set include_global_state: true.
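
The snapshot request for this step could look roughly like the following sketch, which builds a PUT /_snapshot/<repository>/<snapshot> request that names the index patterns listed above and sets include_global_state to true. The repository name, snapshot name, and host name are placeholders, and the pattern list is copied from the list above; add any data stream names that your deployment uses to the same list.

```python
import json
import urllib.request

# Index patterns taken from the list above; extend as needed.
PRODUCT_INDEX_PATTERNS = [
    "iv-*", "siprules_*", "cat*", "demands-*", "reservations-*",
    "supplies-*", "item-node-records", "otmz-*",
]


def build_snapshot_request(base_url, repo_name, snapshot_name, patterns):
    """Build (but do not send) a create-snapshot request."""
    body = {
        # Comma-separated indices and data streams to include.
        "indices": ",".join(patterns),
        # Also capture cluster-wide state such as templates and policies.
        "include_global_state": True,
    }
    url = (
        f"{base_url.rstrip('/')}/_snapshot/{repo_name}/{snapshot_name}"
        "?wait_for_completion=true"
    )
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


snapshot_req = build_snapshot_request(
    "http://primary.example:9200",  # hypothetical primary cluster
    "product_backups",              # hypothetical repository name
    "snapshot_1",                   # hypothetical snapshot name
    PRODUCT_INDEX_PATTERNS,
)
```

With wait_for_completion=true the call blocks until the snapshot finishes, which is convenient for a script but may be too slow for large clusters; omitting the parameter returns immediately.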
  3. Restore the snapshot in the secondary cluster.
    Restoring a snapshot to a secondary cluster involves retrieving data from a snapshot that was created in the primary cluster and stored in a shared or transferred repository, and then loading that data into the secondary cluster to recover or migrate indices and cluster state.
    For more information, see Restore a snapshot.
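
As a rough illustration of the restore step, the sketch below builds a POST /_snapshot/<repository>/<snapshot>/_restore request for the secondary cluster. The repository name, snapshot name, host name, and index patterns are placeholders; authentication is omitted. Note that a restore typically fails for any index that already exists and is open in the target cluster, so such indices would need to be closed or deleted first.

```python
import json
import urllib.request


def build_restore_request(base_url, repo_name, snapshot_name, patterns):
    """Build (but do not send) a restore-snapshot request."""
    body = {
        # Comma-separated indices and data streams to restore.
        "indices": ",".join(patterns),
        # Restore cluster-wide state captured in the snapshot.
        "include_global_state": True,
        # Restore index aliases along with the indices.
        "include_aliases": True,
    }
    url = (
        f"{base_url.rstrip('/')}/_snapshot/{repo_name}/{snapshot_name}"
        "/_restore?wait_for_completion=true"
    )
    return urllib.request.Request(
        url=url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


restore_req = build_restore_request(
    "http://secondary.example:9200",  # hypothetical secondary cluster
    "product_backups",                # must match the registered repository
    "snapshot_1",                     # hypothetical snapshot name
    ["iv-*", "demands-*", "supplies-*"],
)

# To send it: urllib.request.urlopen(restore_req)
```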