Workload considerations

One of the key benefits of a Ceph storage cluster is the ability to support different types of workloads within the same storage cluster using performance domains.

Each performance domain in an IBM Storage Ceph cluster uses hardware optimized for its performance profile, and storage administrators can deploy pools on the domain that meets an application's requirements. Selecting correctly sized and optimized servers for these domains is an essential part of cluster design.

To the Ceph client interface that reads and writes data, a Ceph storage cluster appears as a simple pool where the client stores data. However, the storage cluster performs many complex operations in a manner that is completely transparent to the client interface. Ceph clients and Ceph object storage daemons, referred to as Ceph OSDs or simply OSDs, both use the Controlled Replication Under Scalable Hashing (CRUSH) algorithm to store and retrieve objects. Ceph OSDs can run in containers within the storage cluster.

A CRUSH map describes the topology of cluster resources, and the map exists both on client hosts and on Ceph Monitor hosts within the cluster. Ceph clients and Ceph OSDs both use the CRUSH map and the CRUSH algorithm. Ceph clients communicate directly with OSDs, which eliminates a centralized object lookup and a potential performance bottleneck. With awareness of the CRUSH map and communication with their peers, OSDs can handle replication, backfilling, and recovery, which allows for dynamic failure recovery.
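The idea that clients and OSDs can each compute an object's location from a shared map, without asking a central lookup service, can be illustrated with a small sketch. This is not the CRUSH algorithm: it uses rendezvous (highest-random-weight) hashing as a much simpler stand-in, and the function name `place`, the OSD names, and the object name are all hypothetical.

```python
import hashlib


def place(object_name: str, osds: list[str], replicas: int = 3) -> list[str]:
    """Pick `replicas` OSDs for an object with rendezvous hashing.

    A simplified stand-in for CRUSH: every party that holds the same list
    of OSDs derives the same answer locally, with no lookup service.
    """
    def score(osd: str) -> int:
        # Hash the (object, OSD) pair; the highest scores win.
        digest = hashlib.sha256(f"{object_name}:{osd}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(osds, key=score, reverse=True)[:replicas]


# Hypothetical cluster map shared by a client and an OSD.
cluster_map = ["osd.0", "osd.1", "osd.2", "osd.3", "osd.4"]

# The client and the OSD compute placement independently, even if they
# hold the map entries in a different order.
client_view = place("example-object", cluster_map)
osd_view = place("example-object", list(reversed(cluster_map)))

print(client_view == osd_view)
```

Because the result depends only on the object name and the shared map, any participant can find the responsible OSDs on its own. The real CRUSH algorithm additionally accounts for the hierarchy, weights, and placement rules described in the CRUSH map.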

Ceph uses the CRUSH map to implement failure domains. Ceph also uses the CRUSH map to implement performance domains, which take the performance profile of the underlying hardware into consideration. The CRUSH map describes how Ceph stores data, and it is implemented as a simple hierarchy, specifically an acyclic graph, and a ruleset. The CRUSH map can support multiple hierarchies to separate one type of hardware performance profile from another. Ceph implements performance domains with device "classes".
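As a rough sketch of how a pool might be tied to a device class, the following helper only builds the argument lists for two upstream Ceph CLI commands, `ceph osd crush rule create-replicated` and `ceph osd pool set ... crush_rule`; it does not run them. The helper name `class_rule_commands`, the rule name `fast-ssd`, and the pool name `mysql-volumes` are hypothetical, and the `default` root and `host` failure domain are assumptions about the cluster layout.

```python
def class_rule_commands(
    rule: str,
    device_class: str,
    pool: str,
    root: str = "default",          # assumed CRUSH root
    failure_domain: str = "host",   # assumed failure-domain bucket type
) -> list[list[str]]:
    """Build (but do not execute) the CLI calls that bind a pool to a device class."""
    return [
        # Create a replicated CRUSH rule restricted to one device class.
        ["ceph", "osd", "crush", "rule", "create-replicated",
         rule, root, failure_domain, device_class],
        # Point an existing pool at that rule.
        ["ceph", "osd", "pool", "set", pool, "crush_rule", rule],
    ]


# Hypothetical names, for illustration only.
commands = class_rule_commands("fast-ssd", "ssd", "mysql-volumes")
for argv in commands:
    print(" ".join(argv))
```

Building the commands as lists keeps the sketch side-effect free; on a real cluster, verify the available device classes (for example with `ceph osd crush class ls`) before creating rules.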

For example, you can have these performance domains coexisting in the same IBM Storage Ceph cluster:

  • Hard disk drives (HDDs) are typically appropriate for cost and capacity-focused workloads.
  • Throughput-sensitive workloads typically use HDDs with Ceph write journals on solid-state drives (SSDs).
  • IOPS-intensive workloads, such as MySQL and MariaDB, often use SSDs, including those with NVMe, SAS, or SATA interfaces.
Figure 1. Performance and Failure Domains

Workloads

Important: Carefully consider the workload that the IBM Storage Ceph cluster will run before deciding what hardware to purchase, because the workload can significantly affect the price and performance of the storage cluster. For example, if the workload is capacity-optimized and the hardware is better suited to a throughput-optimized workload, then the hardware will be more expensive than necessary. Conversely, if the workload is throughput-optimized and the hardware is better suited to a capacity-optimized workload, then the storage cluster can suffer from poor performance.
IBM Storage Ceph is optimized for three primary workloads:
IOPS optimized
Input/output operations per second (IOPS) optimized deployments are suitable for cloud computing operations, such as running MySQL or MariaDB instances as virtual machines on OpenStack. IOPS-optimized deployments require higher-performance storage, such as 15k RPM SAS drives and separate SSD journals, to handle frequent write operations. Some high-IOPS scenarios use all-flash storage to improve IOPS and total throughput.
An IOPS-optimized storage cluster has the following properties:
  • Lowest cost per IOPS.
  • Highest IOPS per GB.
  • 99th percentile latency consistency.
Uses for an IOPS-optimized storage cluster are:
  • Typically block storage.
  • 3x replication for hard disk drives (HDDs) or 2x replication for solid-state drives (SSDs).
  • MySQL on OpenStack clouds.
Throughput optimized
Throughput-optimized deployments are suitable for serving up significant amounts of data, such as graphic, audio, and video content. Throughput-optimized deployments require high-bandwidth networking hardware, controllers, and hard disk drives with fast sequential read and write characteristics. If fast data access is a requirement, then use a throughput-optimized storage strategy. Also, if fast write performance is a requirement, using SSDs for journals can substantially improve write performance.
A throughput-optimized storage cluster has the following properties:
  • Lowest cost per MBps (throughput).
  • Highest MBps per TB.
  • Highest MBps per BTU.
  • Highest MBps per Watt.
  • 97th percentile latency consistency.
Uses for a throughput-optimized storage cluster are:
  • Block or object storage.
  • 3x replication.
  • Active performance storage for video, audio, and images.
  • Streaming media, such as 4k video.
Capacity optimized
Capacity-optimized deployments are suitable for storing significant amounts of data as inexpensively as possible. Capacity-optimized deployments typically trade performance for a more attractive price point. For example, capacity-optimized deployments often use slower and less expensive SATA drives and colocate journals rather than using SSDs for journaling.
A cost and capacity-optimized storage cluster has the following properties:
  • Lowest cost per TB.
  • Lowest BTU per TB.
  • Lowest Watts required per TB.
Uses for a cost and capacity-optimized storage cluster are:
  • Typically object storage.
  • Erasure coding for maximizing usable capacity.
  • Object archive.
  • Video, audio, and image object repositories.
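The replication factors and erasure coding mentioned in the workload profiles trade usable capacity for redundancy in different ways. A minimal sketch of the arithmetic follows, assuming a hypothetical 1200 TB of raw capacity and a hypothetical erasure-code profile with k=4 data chunks and m=2 coding chunks. It ignores real-world overheads such as full-ratio headroom and metadata, so treat the results as upper bounds.

```python
def usable_replicated(raw_tb: float, replicas: int) -> float:
    """Usable capacity when every object is stored `replicas` times."""
    return raw_tb / replicas


def usable_erasure_coded(raw_tb: float, k: int, m: int) -> float:
    """Usable capacity when objects are split into k data and m coding chunks."""
    return raw_tb * k / (k + m)


raw = 1200.0  # hypothetical raw capacity in TB

print(usable_replicated(raw, 3))        # 3x replication  -> 400.0
print(usable_replicated(raw, 2))        # 2x replication  -> 600.0
print(usable_erasure_coded(raw, 4, 2))  # k=4, m=2 (assumed profile) -> 800.0
```

Under these assumptions, the erasure-coded layout yields twice the usable capacity of 3x replication from the same raw disks, which is why capacity-optimized clusters tend to favor it.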