Hardware requirements and recommendations

Note: The default memory allocation for the managed ELK stack is not intended for production use. Production workloads typically require significantly more memory. The default values provide a starting point for prototyping and other demonstration efforts.

Storage

The minimum required disk size generally corresponds to the amount of raw log data that is generated over a full log retention period. It is also a good practice to account for unexpected bursts of log traffic, so consider allocating an extra 25-50% of storage. If you do not know how much log data is generated, a good starting point is to allocate 100Gi of storage for each management node.
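
For example, assuming a hypothetical workload that generates 10 GB of raw log data per day with a 7-day retention period, the base requirement is about 70 GB; adding a 25-50% buffer raises the allocation to roughly 90-105 GB per management node.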

Avoid NAS storage because it can introduce stability and latency issues; in some cases, the logging system does not start at all. NAS storage can also introduce a single point of failure. For more information, see Disks.

You can modify the default storage size by adding the following block to the config.yaml file:

elasticsearch_storage_size: <new_size>
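
For example, the following setting allocates 200Gi of storage. The value is a hypothetical illustration; size it according to your own retention needs:

elasticsearch_storage_size: 200Gi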

Memory

The number of pods and the amount of memory that each pod requires differ depending on the volume of logs to be retained. Proper capacity planning is an iterative process that involves estimation and testing. Keep the following consideration in mind as you plan:

Insufficient memory can lead to excessive garbage collection, which can significantly increase CPU consumption by the Elasticsearch process.

You can modify the default memory allocation settings for the managed ELK stack by adding and customizing the following lines in the config.yaml file. In general, set the heapSize value to approximately half of the pod's memoryLimit value.

logging:
  logstash:
    heapSize: "512m"
    memoryLimit: "1024Mi"
  elasticsearch:
    client:
      heapSize: "1024m"
      memoryLimit: "1536Mi"
    data:
      heapSize: "1536m"
      memoryLimit: "3072Mi"
    master:
      heapSize: "1024m"
      memoryLimit: "1536Mi"
  kibana:
    maxOldSpaceSize: "1536"
    memoryLimit: "2048Mi"

CPU

CPU usage can fluctuate depending on various factors; long or complex queries tend to require the most CPU. Plan ahead to ensure that you have enough capacity to handle the query load that your organization requires.