Performance considerations

This section gives a brief overview of best practices for tuning the performance of IBM Fusion Data Foundation. It assumes that your cluster meets the minimum requirements of IBM Fusion Data Foundation. The performance characteristics of IBM Fusion Data Foundation depend on the operating system, CPU, network, memory, and disk configuration.

Hardware layer

This section covers performance considerations and tuning tips for the hardware layer.

IFL Configuration: Clusters should dedicate at least 1 IFL to operating IBM Fusion Data Foundation. More IFLs increase the performance for CPU-intensive workloads, for example, workloads with small block sizes. Additional IFLs help less for workloads with large block sizes and read operations because these might be bottlenecked by network capacity.
Storage Guests: Provision at least the recommended CPU and memory resources (for example, frequency and size) for nodes that operate IBM Fusion Data Foundation. The performance of the cluster can be impacted if a disk is too slow or broken. It is recommended to use SSDs for journaling.
Network Configuration: If the network configuration allows it, it is beneficial to enable jumbo MTU sizes, for example, MTU 9200.

For details, refer to the Red Hat Ceph Storage Strategies Guide and the Red Hat Ceph Storage Hardware Guide (Red Hat documentation).
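
To see whether jumbo frames are actually in effect on a node, the MTU of each interface can be read from sysfs. The following is a minimal sketch, assuming a Linux node that exposes /sys/class/net; the function name report_mtu and its optional directory argument are illustrative and not part of IBM Fusion Data Foundation.

```shell
#!/bin/sh
# List each interface's MTU and flag values above the 1500-byte Ethernet default.
# The optional argument (a sysfs-style directory) exists only so the function
# can be tried against a test directory; it defaults to /sys/class/net.
report_mtu() {
  net_dir="${1:-/sys/class/net}"
  for dev in "$net_dir"/*; do
    # Skip entries without a readable mtu file (also covers an unmatched glob).
    [ -r "$dev/mtu" ] || continue
    mtu=$(cat "$dev/mtu")
    if [ "$mtu" -gt 1500 ]; then
      kind="jumbo"
    else
      kind="standard"
    fi
    printf '%s %s %s\n' "${dev##*/}" "$mtu" "$kind"
  done
}

report_mtu
```

Every device on the path between storage nodes must accept the larger frame size, so a jumbo value on the node alone does not confirm the end-to-end configuration.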

Operating system

This section introduces tuning tips for the operating system (OS) running IBM Fusion Data Foundation. Most changes to the OS can be made via the Machine Config Operator of RHOCP.

Receive Packet Steering: Receive Packet Steering (RPS) directs packets to specific CPUs for processing. It aims to increase the CPU cache hit rate and thereby reduce network latency. Enabling RPS in the operating system of the storage nodes improves their network throughput and latency. This affects the Ceph performance noticeably, especially for workloads with large block sizes, which tend to be network-bound. Consider that the increase in network performance comes at the expense of CPU demand, which might affect CPU-intensive workloads.
Transparent Huge Pages: Transparent Huge Pages (THP) allows the operating system to use large memory pages, typically 2 MiB. It aims to improve the performance by reducing the number of memory page table entries, memory fragmentation, and memory overhead. However, it can cause memory allocation delays during runtime and is not beneficial for all workloads, especially databases. Therefore, it can be beneficial to disable THP for OpenShift Container Platform nodes to improve the performance and reduce CPU cycles.

For details, see: Disable Transparent Huge Pages (Red Hat documentation)
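
The two settings above are exposed by the Linux kernel through sysfs: RPS takes a hexadecimal CPU bitmask in /sys/class/net/<device>/queues/rx-<n>/rps_cpus, and the active THP mode is shown in square brackets in /sys/kernel/mm/transparent_hugepage/enabled. The sketch below only computes a mask and parses a sample THP string; the helper names rps_mask and thp_active_mode are illustrative, and persistent changes would still go through the Machine Config Operator as described above.

```shell
#!/bin/sh
# Build the hexadecimal CPU bitmask that rps_cpus expects for CPUs 0..(n-1).
# Assumption: n is between 1 and 32; larger masks need comma-separated groups.
rps_mask() {
  printf '%x\n' $(( (1 << $1) - 1 ))
}

# Extract the active THP mode, which the kernel marks with square brackets,
# for example "always madvise [never]".
thp_active_mode() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

rps_mask 4                                  # mask covering CPUs 0-3
thp_active_mode "always madvise [never]"    # sample string, not read from a host
```

On a real node, the string passed to thp_active_mode would come from reading the sysfs file, and the mask would be written to the rps_cpus file of each receive queue.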

Red Hat OpenShift layer

This section introduces tuning tips for the nodes running IBM Fusion Data Foundation.

Node Labeling: RHOCP enables the labeling of nodes as infrastructure nodes to host infrastructure services on them. The infrastructure services include Red Hat OpenShift monitoring, Prometheus, and Grafana, which are all CPU and memory intensive. Isolating these services from compute and storage nodes can prevent performance drops. It is recommended that you create at least 3 infrastructure nodes, each with 4 vCores and 32 GB of memory.
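
As an illustration of the labeling step, the sketch below prints the oc commands that would apply the infrastructure role label to a set of nodes. It prints rather than runs them; the node names are hypothetical placeholders, and the helper name infra_label_cmds is not part of any product.

```shell
#!/bin/sh
# Print (do not run) the oc commands that would label nodes as infrastructure.
# The node names passed in are hypothetical placeholders.
infra_label_cmds() {
  for node in "$@"; do
    printf 'oc label node %s node-role.kubernetes.io/infra=""\n' "$node"
  done
}

infra_label_cmds infra-0 infra-1 infra-2
```

Labeling alone does not move any workload. Scheduling the monitoring services onto the labeled nodes is a separate configuration step that is not shown here.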

Configuration of IBM Fusion Data Foundation

This section lists general tuning tips for Ceph and IBM Fusion Data Foundation.

Ceph Recovery

In the case of a failing OSD, the Ceph cluster must rebalance the data after the OSD is online again. Ceph avoids cluster performance degradations by slowing down the rebalancing. You can speed up the process by changing the following parameters:

ceph tell 'osd.*' injectargs '--osd-max-backfills=12 --osd-recovery-max-active=4'
ceph tell 'osd.*' config set osd_recovery_sleep_hdd 0
ceph tell 'osd.*' config set osd_recovery_sleep_ssd 0

For details on how to speed up or slow down OSD recovery, refer to: Throttling backfill and recovery (Red Hat article) and Handling a node failure (Red Hat documentation).
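
Because these settings trade client performance for recovery speed, it can help to note the current values before changing them so that they can be restored afterward. The sketch below prints the commands that would read each option; it assumes that config get mirrors the config set form shown above, and the helper name recovery_get_cmds is illustrative.

```shell
#!/bin/sh
# Print (do not run) commands that read the current value of each recovery
# option named above, so the values can be noted before they are changed.
recovery_get_cmds() {
  for opt in osd_max_backfills osd_recovery_max_active \
             osd_recovery_sleep_hdd osd_recovery_sleep_ssd; do
    printf "ceph tell 'osd.*' config get %s\n" "$opt"
  done
}

recovery_get_cmds
```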

Ceph Replication Mechanism

Data availability in Ceph comes at a price: the cluster's available space, performance, and efficiency are greatly affected by object replication. Reducing the replication factor to 2 improves the cluster's performance and efficiency by over 30%. However, it greatly increases the risk of data loss on failures. A replication factor of 2 can be beneficial for many workloads and can be considered in environments in which disk or node failures are rare.

Ceph commands to configure the number of replicas for CephRBD:

POOL="ocs-storagecluster-cephblockpool"
NUM_REPLICAS=2
ceph osd pool set "$POOL" size "$NUM_REPLICAS" --yes-i-really-mean-it
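
The space side of this trade-off is simple arithmetic: with replicated pools, usable capacity is roughly the raw capacity divided by the number of replicas. The sketch below works this through for a hypothetical 12 TiB of raw capacity; it ignores Ceph overhead and reserved headroom, and the names usable_gib and RAW_GIB are illustrative.

```shell
#!/bin/sh
# Usable capacity = raw capacity / number of replicas (integer GiB, rounded down).
# This ignores Ceph metadata overhead and reserved headroom.
usable_gib() {
  raw_gib=$1
  replicas=$2
  echo $(( raw_gib / replicas ))
}

RAW_GIB=12288   # hypothetical: 12 TiB of raw OSD capacity
echo "size=3: $(usable_gib "$RAW_GIB" 3) GiB usable"
echo "size=2: $(usable_gib "$RAW_GIB" 2) GiB usable"
```

With these numbers, usable space grows from 4096 GiB to 6144 GiB, and each logical write lands on 2 OSDs instead of 3. The value currently in effect for a pool can be read with ceph osd pool get "$POOL" size.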

Configuration of Hypervisors

This section introduces KVM-specific tuning practices for IBM Fusion Data Foundation.

KVM Guest I/O Threads: Increasing the number of I/O threads for a KVM guest can improve the performance by enabling the guest operating system to handle more I/O operations in parallel. An IBM Fusion Data Foundation storage node guest can operate one or more OSDs, which increases the I/O contention. This contention can be reduced by increasing the number of I/O threads, which allows a more efficient use of the attached storage devices. However, it is important to consider that increasing the number of I/O threads might also lead to increased CPU and memory usage.

For details, refer to: Use I/O Threads for your virtual block devices (Red Hat documentation)
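
On a libvirt-managed KVM host, I/O threads can be added to a guest definition with virsh. The sketch below prints the commands rather than running them; the guest name odf-storage-0 is a hypothetical placeholder, the helper name iothread_cmds is illustrative, and the thread count is an example, not a recommendation.

```shell
#!/bin/sh
# Print (do not run) virsh commands that add I/O threads 1..n to a guest's
# persistent configuration, followed by a command that lists them.
# The guest name used below is a hypothetical placeholder.
iothread_cmds() {
  domain=$1
  count=$2
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'virsh iothreadadd %s %s --config\n' "$domain" "$i"
    i=$(( i + 1 ))
  done
  printf 'virsh iothreadinfo %s --config\n' "$domain"
}

iothread_cmds odf-storage-0 2
```

Adding threads does not by itself assign disks to them. That mapping is part of each disk's definition in the guest configuration and is not shown here.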

Performance benchmarking

The performance of IBM Fusion Data Foundation depends on disk, network, and compute resources. Therefore, it is important to gain insights into each resource. This section lists performance benchmark tools and general recommendations for IBM Fusion Data Foundation.

FIO (Flexible I/O Tester): FIO is one of the most well-known tools to benchmark storage backends. Ceph itself has diagnostic tools to categorize OSDs as SSD or HDD and to warn about slow performance, but it can also be helpful to obtain the performance characteristics of a storage disk via a benchmark tool. You can characterize the performance capability of a raw storage disk using sequential read and write and random read and write workloads. Because Ceph stripes data across many OSDs, the access pattern on the raw device resembles a random workload. FIO also contains a Ceph driver, which allows the benchmarking of CephRBD or CephFS. Also consider setups with multiple FIO servers, using the FIO client/server mechanism, because this pattern is typical for an RHOCP cluster where multiple pods consume the storage.
Yahoo! Cloud Serving Benchmark (YCSB): YCSB is one of the most well-known NoSQL benchmark suites. It provides a framework to create benchmark workloads and collect performance data, for example, transactions per second or latency. It supports many databases, for example, MongoDB or Redis. It can be used to measure the performance of databases that are backed by IBM Fusion Data Foundation under cloud workloads.
Uperf: Uperf is a network benchmark tool that can be used to estimate the network capacity. IBM Fusion Data Foundation is network-intensive because data is striped across multiple OSDs on different storage nodes. Therefore, it is important to consider the network capability and configuration. With Uperf, you can measure the performance in pod-to-pod scenarios on one compute node, between compute nodes, and across LPARs. Ensure that the network capacity between OSDs is sufficient.
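
For the raw-disk characterization described under FIO, a random read/write job with a small block size is a common starting point. The sketch below prints one such fio command line without running it; the block size, queue depth, and runtime are example values, the device path is a hypothetical placeholder, and the helper name fio_randrw_cmd is illustrative.

```shell
#!/bin/sh
# Print (do not run) a fio command line for a 4 KiB random read/write test.
# WARNING: running fio with writes against a raw device destroys its data.
fio_randrw_cmd() {
  target=$1          # device or file to test (hypothetical placeholder below)
  printf 'fio --name=randrw-4k --filename=%s --rw=randrw --bs=4k ' "$target"
  printf '%s\n' '--ioengine=libaio --direct=1 --iodepth=32 --runtime=60 --time_based --group_reporting'
}

fio_randrw_cmd /dev/example-disk
```

Changing --rw to read or write gives the sequential counterparts, and repeating the run with larger --bs values shows how the device behaves as the block size grows.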