Storage considerations

To run stateful applications, developers need to store the persistent data in managed storage that is backed by some physical storage. Persistent volumes allow a state to persist across pods.

When you investigate the storage options for a deployment, consider the deployment as a whole, including the network, licenses, replication, and cost.

Capability storage requirements

Business Automation Insights requires persistent volumes that are provided with either Read Write Once (RWO) or Read Write Many (RWX) access. The access mode depends on the type of workload that the volume supports.

Because the Business Automation Insights components that use block storage include their own built-in replication, they do not require the storage provider to perform cross-Availability Zone (AZ) replication when they run in multi-replica mode.

Block storage is suitable for databases and other data storage systems that require high-performance persistence like OpenSearch. RWO persistent volumes are often provisioned for databases and indexed workloads. Components that depend on RWX file storage rely on the storage provider to replicate across AZs. When possible, S3 storage is the preferred persistence for read-write-many scenarios as it is easily replicable by the storage provider.
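As a minimal sketch of how the two access modes are requested in Kubernetes, the following script writes one RWO claim and one RWX claim to a file. The claim names, storage class names, and sizes are hypothetical placeholders, and the deployment might create its own claims for you; the sketch only illustrates where the access mode is declared.

```shell
#!/bin/sh
# Sketch only: all names and storage classes below are hypothetical placeholders.
cat > bai-pvc-example.yaml <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-opensearch-data
spec:
  accessModes:
    - ReadWriteOnce                    # RWO: block storage, one node at a time
  storageClassName: example-block-sc   # hypothetical block storage class
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-flink-state
spec:
  accessModes:
    - ReadWriteMany                    # RWX: file storage shared by many pods
  storageClassName: example-file-sc    # hypothetical file storage class
  resources:
    requests:
      storage: 20Gi
EOF

# To check the manifests without creating anything on the cluster, you could run:
# kubectl apply --dry-run=client -f bai-pvc-example.yaml
```

The dry-run command is left commented out so that the script runs without cluster access.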

The following tables provide details on different aspects of the storage requirements: Table 1 - Storage type requirements, Table 2 - Storage space requirements, Table 3 - Storage example of AWS, and Table 4 - Storage example of Azure.

Table 1. Storage type requirements
| Component | RWO single-zone (for example, EBS) | RWX (replicated file) | S3 (replicated object storage) |
|---|---|---|---|
| Business Automation Insights * | Yes (Kafka ** and OpenSearch **) | Yes (Flink) | |
Note:

* Component requires both RWO and RWX for different purposes.

** Native HA is recommended over multi-instance due to storage simplicity.

The following table lists the storage type, disk space, and access mode requirements for production deployments. Ranges are for small to large environments. Kubernetes access modes include Read Write Once (RWO), Read Write Many (RWX), and Read Only Many (ROX).

Table 2. Storage space requirements
| Capability or runtime | Storage type | Disk space | Access mode | Number of persistent volumes for non-HA/HA | POSIX compliance |
|---|---|---|---|---|---|
| Business Automation Insights | File; Block (mandatory for OpenSearch) | Flink: 20 GB; Flink using OpenSearch snapshot storage: 30 GB; Flink using OpenSearch data storage: 10 GB | | | Not needed |

Sizing depends on the size of the projects:

  • Kafka: 2 GB x event rate x event size x retention duration x 1.10 x replica
  • Flink: 10 GB + 2 x event rate x event size x average duration of event x replica
  • OpenSearch: event rate x event size x retention duration x (replicas + 1)
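To make the sizing arithmetic concrete, the following sketch evaluates the Flink and OpenSearch formulas for a hypothetical workload. The units are assumptions that the table does not state: event rate in events per second, event size in bytes, durations in seconds, and 1 GB taken as 10^9 bytes. The Kafka formula is left out because the sketch would have to guess how its leading 2 GB term combines with the other factors.

```shell
#!/bin/sh
# Sketch only. Assumed units (not stated in the source): events per second,
# bytes per event, seconds for durations. Results are printed in GB (10^9 bytes).

# flink_gb RATE SIZE AVG_EVENT_DURATION REPLICAS
# 10 GB + 2 x event rate x event size x average duration of event x replica
flink_gb() {
  LC_ALL=C awk -v r="$1" -v s="$2" -v d="$3" -v n="$4" \
    'BEGIN { printf "%.2f\n", 10 + (2 * r * s * d * n) / 1e9 }'
}

# opensearch_gb RATE SIZE RETENTION REPLICAS
# event rate x event size x retention duration x (replicas + 1)
opensearch_gb() {
  LC_ALL=C awk -v r="$1" -v s="$2" -v d="$3" -v n="$4" \
    'BEGIN { printf "%.2f\n", (r * s * d * (n + 1)) / 1e9 }'
}

# Hypothetical workload: 100 events per second, 2000 bytes per event.
flink_gb 100 2000 3600 1           # one-hour average event duration, 1 replica
opensearch_gb 100 2000 2592000 1   # 30-day retention, 1 replica
```

With these hypothetical inputs the script prints 11.44 for Flink and 1036.80 for OpenSearch, which shows how strongly the retention duration dominates the OpenSearch estimate.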

Make sure that the databases that you create satisfy your intended workload. For deployments that need to operate continuously with no interruptions in service, set up a high availability (HA) database.

The following table shows an example of Business Automation Insights storage in the public cloud on AWS.

Table 3. Storage example of AWS
| Component | AWS EBS (single-zone block RWO) | AWS EFS (replicated file RWX) | AWS S3 (replicated object storage) | AWS FSx ONTAP (offers replicated RWO, RWX) | OpenShift® Data Foundation (offers replicated RWO, RWX) | Portworx (offers replicated RWO, RWX) | IBM Storage Fusion (offers replicated RWO, RWX) |
|---|---|---|---|---|---|---|---|
| Business Automation Insights | Yes (Kafka and OpenSearch) | Yes | | Yes | Yes | Yes | Yes |
Note:

AWS EFS requires the NFS subdir external provisioner.

For AWS FSx ONTAP, use the Trident ontap-nas driver.

ODF is not currently supported for use with the ROSA managed OpenShift platform; see ODF support on ROSA.

Some components require both RWO and RWX for different purposes.

The following table shows an example of Business Automation Insights storage in the public cloud on Azure.

Table 4. Storage example of Azure
| Component | Azure Disks (single-zone or replicated RWO) | Azure Files (replicated file RWX) | Azure Blob (replicated object storage) | OpenShift Data Foundation (offers replicated RWO, RWX) | Portworx (offers replicated RWO, RWX) | IBM Storage Fusion (offers replicated RWO, RWX) |
|---|---|---|---|---|---|---|
| Business Automation Insights | Yes (Kafka and OpenSearch) | Yes | | Yes | Yes | Yes |
Note:

Use the Azure Files NFS driver, and not the SMB driver.

Storage classes

A StorageClass object describes and classifies dynamically provisioned storage that can be requested on demand. The objects can also be used to manage and control access to the storage. Cluster administrators define and create the objects that users can request without needing to know all the details about the underlying storage sources.

For more information about storage class parameters, see Product Documentation for Red Hat® OpenShift Container Storage. For example, to allow a deployment to be deleted and redeployed without losing its data and files, use reclaimPolicy: Retain. For cloud platforms where a group owner of the file system is needed, use gidAllocate: "true" to request one.
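The two settings named above could appear in a storage class definition along the following lines. This is a sketch under stated assumptions, not a copy of the product's descriptors: the class name is hypothetical, the provisioner is assumed to be the IBM Cloud file storage provisioner, and a real class usually needs further provider-specific parameters.

```shell
#!/bin/sh
# Sketch only: the class name and provisioner are assumptions; replace them
# with the values that your storage provider documents.
cat > example-storageclass.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: example-file-retain-gid
provisioner: ibm.io/ibmc-file
parameters:
  gidAllocate: "true"
reclaimPolicy: Retain
volumeBindingMode: Immediate
EOF

# reclaimPolicy: Retain keeps the underlying volume and its data after the
# claim is deleted; gidAllocate: "true" requests a group owner for the file system.
# To check the manifest without creating anything on the cluster, you could run:
# kubectl apply --dry-run=client -f example-storageclass.yaml
```

Because a retained volume is not cleaned up automatically, an administrator has to reclaim or delete it manually when the data is no longer needed.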

Example YAML files to create storage classes on Red Hat OpenShift Kubernetes Service (ROKS) are provided in the cert-kubernetes/descriptors folder. For more information about downloading cert-kubernetes, see Preparing a client to connect to the cluster.

Note: You can get the existing storage classes in the environment by running the following command:
kubectl get storageclass

Take note of the storage classes that you want to use for your deployment.