Manually removing log indices

Log data is stored on disk. Over time, unmanaged log data growth fills up your disk space. In addition to the automatic removal of log data, you can manually remove log indices to reclaim space, either through the Elasticsearch API or directly from disk.

Important: Log data is difficult to recover after it is deleted. Before you proceed, back up the data directory, as needed.
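
For example, a backup of a host-mounted data directory might resemble the following sketch. The archive path is a placeholder, and the data directory matches the example that is used later in this topic; adjust both for your environment.

  # Archive the Elasticsearch data directory before you delete anything.
  ssh root@<your_data_node>
  tar -czf /tmp/elk-data-backup.tar.gz /var/lib/icp/logging/elk-data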

Removing log indices by using the Elasticsearch API

The following steps require a functional Elasticsearch cluster and use the Kibana console. A curl-based alternative that calls the Elasticsearch API directly is sketched after these steps.

  1. List all indices.

    • Log in to the Kibana console and click Dev Tools.
    • From the navigation window, prepare the request to retrieve your indices:

      GET /_cat/indices?v
      
    • Click the submit request button to make the API call. You see a list of indices and the disk allocation, similar to the following example:

      health status index                             uuid                   pri rep docs.count docs.deleted store.size pri.store.size
      yellow open   logstash-2019.02.05               nbkLRGXqQ6enWMbLeYIO1w   5   1     932127            0    571.8mb        571.8mb
      
  2. Delete the indices.

    Note: Do not remove the searchguard and .kibana indices as they are essential to the logging function.

    • Identify the indices that you want to delete based on your list from Step 1.
    • From the navigation window, prepare the request to delete your index.

      DELETE /{your index name}
      
    • Click the submit request button to make the API call. Repeat the DELETE request to delete more indices.

  3. Repeat Step 1 to verify that you reclaimed disk space. After deletion is complete, an index is re-created when new data is sent in for that index.
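
If you prefer to call the Elasticsearch API directly rather than through the Kibana console, the same workflow might resemble the following sketch. The host name and port are placeholders, and depending on how your cluster secures Elasticsearch, you might need to supply TLS certificates or run the commands from a pod that can reach the Elasticsearch service.

  # List all indices and their disk usage.
  curl -s "https://<elasticsearch_host>:9200/_cat/indices?v"

  # Delete one index. Do not delete the searchguard and .kibana indices.
  curl -s -X DELETE "https://<elasticsearch_host>:9200/logstash-2019.02.05"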

Removing log indices from disk

Important: All log data that you remove by using this procedure is deleted from disk.

  1. Scale down your product logging service.

    • Extract the Elasticsearch data pod replica count. Run the following command (for managed logging instances, the release name is logging):

      helm status <your_release_name> --tls
      
    • Locate the following information in the command output. Note the DESIRED replica count. In the following example, the DESIRED replica count is 1.

      ==> v1beta2/StatefulSet
      NAME              DESIRED  CURRENT  AGE
      logging-elk-data  1        1        24d
      
    • Scale down the Elasticsearch data pod and Logstash to 0 replicas.

      • For managed mode, run the following commands:

        kubectl scale statefulset logging-elk-data --replicas=0 -n kube-system
        kubectl scale deployment logging-elk-logstash --replicas=0 -n kube-system
        
      • For standard mode, run the following commands after you replace the release name and the namespace:

        kubectl scale statefulset <release>-icplogging-data --replicas=0 -n kube-system
        kubectl scale deployment <release>-icplogging-logstash --replicas=0 -n kube-system
        
  2. Identify hosts and directories where you want to clear existing data.

    • List logging Persistent Volumes (PV) by using the following command:

      kubectl get pv | grep logging-elk-data
      

      Logging PVs can be identified by their names and by the Persistent Volume Claims (PVCs) to which they bind. If multiple logging service instances are installed, make sure that you select the PVs that belong to the instance whose data you want to clear.

    • Extract details for each logging PV by running the following command. Replace the example PV name with your own PV name:

      kubectl describe pv logging-datanode-9.42.80.204
      

      Your output resembles the following example. Note your host and data directory because you need this information in a later step. In this example, the host is 9.42.80.204 and the data directory is /var/lib/icp/logging/elk-data:

      Name:              logging-datanode-9.42.80.204
      Labels:            <none>
      Annotations:       kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"name":"logging-datanode-9.42.80.204"},"spec":{"accessModes":["ReadWriteOnce"...
                         pv.kubernetes.io/bound-by-controller=yes
      Finalizers:        [kubernetes.io/pv-protection]
      StorageClass:      logging-storage-datanode
      Status:            Bound
      Claim:             kube-system/data-logging-elk-data-0
      Reclaim Policy:    Retain
      Access Modes:      RWO
      Capacity:          20Gi
      Node Affinity:
        Required Terms:
          Term 0:        kubernetes.io/hostname in [9.42.80.204]
      Message:
      Source:
          Type:  LocalVolume (a persistent volume backed by local storage on a node)
          Path:  /var/lib/icp/logging/elk-data
      Events:    <none>
      
  3. Clear the existing data.

    Note: Use the following instructions to remove logging data that is stored on the host machine. If your logging instance uses other storage options, consult the management tools for the specific storage type that you use.

    Complete the following steps for each host that contains logging data:

    • SSH to the host that you noted in Step 2.
    • Remove data from the data directory that you noted in Step 2. For example, if the uuid of the index that you want to remove is uLNzMKeiR66oayCV7MpVmw, the directory that holds its data is similar to the following example:
      /var/lib/icp/logging/elk-data/nodes/0/indices/uLNzMKeiR66oayCV7MpVmw
      

      To remove all index data, delete the contents of the data directory, for example /var/lib/icp/logging/elk-data/*. Do not delete the data directory itself.

  4. Scale back your product logging service. Scale up the Elasticsearch data pod and Logstash to the replica count that you noted in Step 1. A combined sketch of the scale-down and scale-up commands follows these steps.

    • For managed mode, run the following commands:

      kubectl scale statefulset logging-elk-data --replicas=<replica> -n kube-system
      kubectl scale deployment logging-elk-logstash --replicas=<replica> -n kube-system
      
    • For standard mode, run the following commands after you replace the release name and the namespace:

      kubectl scale statefulset <release>-icplogging-data --replicas=<replica> -n kube-system
      kubectl scale deployment <release>-icplogging-logstash --replicas=<replica> -n kube-system
      
  5. Wait for the logging service to become available. This process takes approximately 5 - 10 minutes.
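
The following sketch summarizes the scale-down and scale-up flow for managed mode. It is only a summary under the assumptions in this topic: the resource names and namespace mirror the earlier commands, and the replica count is the DESIRED value that you recorded from the helm status output.

  # Replica count that you recorded from the DESIRED column of `helm status`.
  REPLICAS=1
  NS=kube-system

  # Take the Elasticsearch data pods and Logstash offline before you clear data.
  kubectl scale statefulset logging-elk-data --replicas=0 -n "$NS"
  kubectl scale deployment logging-elk-logstash --replicas=0 -n "$NS"

  # ... clear the index data on each host, as described in Step 3 ...

  # Restore the original replica counts.
  kubectl scale statefulset logging-elk-data --replicas="$REPLICAS" -n "$NS"
  kubectl scale deployment logging-elk-logstash --replicas="$REPLICAS" -n "$NS"

  # Watch the logging pods come back; this typically takes 5 - 10 minutes.
  kubectl get pods -n "$NS" | grep logging-elk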