Storage considerations

Instances in Cloud Pak for Integration use persistent storage to provide reliable, resilient storage of state data. Before installing, the OpenShift cluster administrator must provide and configure appropriate storage classes that meet the requirements of the instances that you plan to use. For more information, see Understanding persistent storage.

Requirements for RWO storage

For instances that require RWO volumes, Cloud Pak for Integration supports any storage provider that meets all of the following conditions:

  • Is a block storage provider. Providers that have traditionally offered RWX storage (such as NFS and GlusterFS) cannot be used to provide RWO volumes.

  • Provides storage that is formatted as ext4 or XFS, to ensure POSIX compliance.

  • Supports dynamic volume provisioning and a volumeBindingMode value of WaitForFirstConsumer.

  • Is not documented as explicitly prohibited for use as a block provider. NFS and GlusterFS are currently prohibited as block providers.
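As an illustration, a storage class that satisfies these conditions might look like the following sketch. The storage class name and provisioner are placeholders, not real identifiers; substitute the CSI driver for your block storage provider:

```yaml
# Hypothetical StorageClass for RWO block volumes. The provisioner and
# parameters depend on your CSI driver and are placeholders here.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cp4i-block-rwo               # hypothetical name
provisioner: block.csi.example.com   # replace with your block CSI driver
parameters:
  csi.storage.k8s.io/fstype: xfs     # ext4 is also acceptable
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
```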

Storage providers

Important:
  • The storage providers in this list provision persistent volumes for use by the product. Not all features of a storage provider (such as snapshot-based backup and restore of the volumes) are supported.
  • The Cloud Pak for Integration support team does not directly support storage components. Ensure that there is an appropriate support arrangement in place, so that if an issue is identified within the storage provider, you can engage with that provider directly.
Tip:
  • Using volume encryption for your chosen storage provider protects your data at rest.
  • For RWO persistent volumes used with Cloud Pak for Integration, a single-zone RWO storage provider may offer the best performance at the lowest cost. Because instances that use RWO volumes provide built-in replication, it is not necessary to use a storage provider with cross-availability-zone (AZ) replication (such as OpenShift Data Foundation, Portworx, or IBM Storage Fusion). Using these types of storage providers for RWO volumes can also increase costs for network data transfer between AZs (as in AWS) and reduce performance because of unnecessary data replication. Instances that use RWX volumes do require cross-AZ replication by the storage provider. For more information, see the section, "Using a software-defined storage (SDS) provider".

The following storage providers are validated across all instance types in Cloud Pak for Integration.

  • Cloud service storage providers. For more information, see the applicable section: IBM Cloud, Amazon Web Services (AWS), or Microsoft Azure.

  • IBM Storage Fusion

  • IBM Storage Scale (for File RWX volumes only)

  • IBM Storage Suite for IBM Cloud Paks. This suite of offerings includes: IBM Storage Scale; block storage from IBM Spectrum Virtualize, FlashSystem, or DS8K; object storage from IBM Cloud Object Storage or Red Hat Ceph; Red Hat® OpenShift® Data Foundation (ODF).

  • Portworx Storage

  • Red Hat OpenShift Data Foundation (ODF) as a stand-alone offering. Use the version selector on the linked page to get the supported version that you want to review.
    Tip: For OpenShift Data Foundation in a production environment, the optimal minimum deployment is 4 storage nodes, with 3 Object Storage Daemons (OSDs) on each node. This provides much greater data resiliency in the event of any OSD failures.
  • RWO volumes from any storage provider that meets the conditions in "Requirements for RWO storage".

Storage type and volume requirements

For each of the following instance types, the table lists the compatible storage types, access modes, and the number of required volumes for HA and non-HA deployments.

For instances that are deployed on cloud services, also see Storage providers on cloud services.

Important: Make sure that you also consider the requirements for Cloud Pak foundational services, which are in the last row of the table. Because the installation process for Cloud Pak foundational services automatically selects the default storage class for your cluster (which is specified in OpenShift by an administrator), confirm that this default storage class meets the requirements for Cloud Pak foundational services.
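As background, Kubernetes marks the cluster default storage class with a standard annotation, so you can confirm which class the foundational services installation will pick up. A sketch (the class name and provisioner are illustrative placeholders):

```yaml
# The cluster default storage class carries this standard Kubernetes
# annotation; the names below are illustrative placeholders.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: my-default-block            # hypothetical name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: block.csi.example.com  # placeholder CSI driver
volumeBindingMode: WaitForFirstConsumer
```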

Table 1: Compatible storage types and access modes, and required volumes

Instance type Storage type * and access mode ** Number of volumes for non-HA *** Number of volumes for HA *** Notes
Platform UI N/A N/A N/A
Automation assets File RWX or Block RWO for asset data, Block RWO for CouchDB 1 3
  • Block volume must support the characteristics that are required by CouchDB as described in Using Automation assets (in the "Advanced configuration" section). For information on creating the instance with a block storage class for both the asset data and CouchDB, see Automation assets deployment by using the CLI.
  • One RWO volume per replica
API Connect cluster Block RWO 12 40
  • For more information, see Deployment requirements
  • API Manager: 3 per node + 1 shared, API Portal: 5 per node, API Analytics: 2 per node, API Gateway: 1 per node for non-HA, 3 per node for HA
Event Manager Block RWO
Event Gateway N/A
Event Processing Block RWO
Flink File RWX
Integration dashboard File RWX or S3 object storage N/A N/A
Integration runtime N/A N/A N/A
  • No persistent storage volumes required
Integration design Block RWO, RWX, or S3 storage (see Notes column) 1 3
  • For required details, see App Connect Designer storage. The new incremental learning feature, which uses an AI model, requires a persistent volume with ReadWriteMany (RWX) access mode or S3 storage.
  • One RWO volume per CouchDB replica
High-speed transfer server File RWX N/A N/A
  • RWX usage only
Kafka cluster Block RWO 2 6
  • Requires block storage that is configured to use the XFS or ext4 file system, as described in Event Streams storage.
  • One volume per Kafka broker and one per ZooKeeper instance (for example, an HA deployment with 3 brokers and 3 ZooKeeper instances uses 6 volumes)
Messaging server Block RWO 1 3
  • Single instance and Native HA Messaging servers must use RWO access mode. For more information, see Planning storage for the IBM MQ Operator.
  • 1-3 volumes per underlying Queue manager, depending on the availability configuration
Queue manager Block or File, RWO (Native HA or single-instance) or RWX (Multi-instance) 1-3 3-9
  • MQ single-instance and native HA queue managers can use RWO access mode, while multi-instance queue managers require RWX as described in Planning storage for the IBM MQ Operator. MQ multi-instance queue managers require particular file system characteristics, which can be verified by using the instructions for Testing a shared file system for IBM MQ. A list of known compliant and noncompliant file systems and notes on other limits or restrictions can be found in the Testing statement for IBM MQ file systems.
  • 1-9 volumes per Queue manager, depending on the data separation that you want for space management and high availability configuration
Enterprise gateway N/A N/A N/A
  • No persistent storage volumes required
 
Cloud Pak foundational services Block RWO 1 2

* None of the Cloud Pak for Integration components require raw block device storage; all volumes use the Filesystem volume mode. For more information, see Volume mode in the Kubernetes documentation.

** Kubernetes access modes include ReadWriteOnce (RWO), ReadWriteMany (RWX), and ReadOnlyMany (ROX). For more information, see Access modes in the Kubernetes documentation.

*** The table includes the number of RWO volumes that are required by each instance in a typical configuration for a high availability (HA) or non-HA deployment. RWX volumes are excluded because they typically use network-attached storage patterns and therefore are not directly attached to the node. For more information, see the section, Limits on persistent volumes for public cloud providers.
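To illustrate how these volume counts interact with the per-node attach limits covered later in Limits on persistent volumes for public cloud providers, the following sketch computes the minimum number of worker nodes needed purely for volume attachment. The limits used are illustrative, and real schedulers also account for CPU, memory, anti-affinity rules, and volumes used by other workloads:

```python
import math

def min_nodes_for_volumes(total_rwo_volumes: int, per_node_limit: int) -> int:
    """Minimum worker nodes needed just to attach the block (RWO) volumes.

    Illustrative only: attachment capacity is one constraint among many
    that determine actual worker node counts.
    """
    return math.ceil(total_rwo_volumes / per_node_limit)

# Example: an HA API Connect cluster needs 40 RWO volumes (Table 1).
# On nodes whose block driver allows 25 attachments each, at least
# 2 nodes are needed for attachment capacity alone.
print(min_nodes_for_volumes(40, 25))  # 2
```

With a lower per-node limit, such as the 12 volumes per node for IBM Cloud Block storage for VPC (Table 5), the same instance would need at least 4 nodes for attachment capacity alone.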

Storage providers on cloud services

Get the storage requirements for Cloud Pak for Integration on cloud services.

IBM Cloud

The following table describes the compatibility of each instance type with a subset of common storage providers that can be used with IBM Cloud.

When running IBM Cloud Pak® for Integration on IBM Cloud, you can use any of the validated storage providers. However, IBM Cloud Block storage and IBM Cloud File storage provide the best combination of performance and cost-effectiveness.

Important: Make sure that you also consider the requirements for Cloud Pak foundational services, which are in the last row of the table. Cloud Pak foundational services is automatically installed when instances are created. Because the installation process for Cloud Pak foundational services automatically selects the default storage class for your cluster (which is specified in OpenShift by an administrator), confirm that this default storage class meets the requirements for Cloud Pak foundational services.

Table 2: Supported storage providers for Cloud Pak for Integration on IBM Cloud, by instance type

Instance type IBM block storage IBM file storage
Platform UI N/A N/A
Automation assets Yes Yes
API Connect cluster Yes No
Event Manager Yes No
Event Gateway N/A N/A
Event Processing Yes No
Integration design Yes Yes
Integration runtime N/A N/A
Integration dashboard No Yes
High speed transfer server No Yes
Kafka cluster Yes No
Messaging server Yes No
Queue manager, single-instance Yes Yes
Queue manager, native HA* Yes Yes
Queue manager, multi-instance No Yes
Enterprise gateway N/A N/A
Cloud Pak foundational services Yes, Gold only No

Amazon Web Services (AWS)

The following table describes the compatibility of each instance type with a subset of common storage providers that can be used with AWS, including Amazon Elastic Block Store (AWS EBS) and Amazon Elastic File System (AWS EFS).

When running Cloud Pak for Integration on AWS, you can use any of the validated storage providers. However, AWS EBS (for RWO volumes) and AWS EFS (for RWX volumes) provide the best combination of performance and cost-effectiveness.

Important: Make sure that you also consider the requirements for Cloud Pak foundational services, which are in the last row of the table. Cloud Pak foundational services is automatically installed when instances are created. Because the installation process for Cloud Pak foundational services automatically selects the default storage class for your cluster (which is specified in OpenShift by an administrator), confirm that this default storage class meets the requirements for Cloud Pak foundational services.

Table 3: Supported storage providers for Cloud Pak for Integration on AWS, by instance type

Instance type AWS EBS (RWO) AWS S3 (object storage) AWS EFS (RWX)
Platform UI N/A N/A N/A
Automation assets Yes N/A Yes
API Connect cluster Yes N/A No
Event Manager Yes N/A No
Event Gateway N/A N/A N/A
Integration design Yes Yes (AI feature only) Yes
Integration runtime N/A N/A N/A
Integration dashboard No Yes Yes
High-speed transfer server No N/A Yes
Kafka cluster Yes N/A No
Messaging server Yes N/A No
Queue manager, single-instance Yes N/A Yes *
Queue manager, native HA Yes N/A Yes *
Queue manager, multi-instance No N/A Yes *
Enterprise gateway N/A N/A N/A
Cloud Pak foundational services Yes N/A No

* Subject to locking considerations.

Important: Red Hat OpenShift Data Foundation is not supported on ROSA (Red Hat OpenShift Service on AWS). For more information, see OCS/ODF support on OSD and ROSA.

Microsoft Azure

The following table describes the compatibility of each instance type with a subset of common storage providers that can be used with Microsoft Azure, including Azure managed disks (Azure Disk) and Azure file shares (Azure Files).

When running Cloud Pak for Integration on Microsoft Azure, you can use any of the validated storage providers. However, Azure Disk and Azure Files provide the best combination of performance and cost-effectiveness.

Important: Make sure that you also consider the requirements for Cloud Pak foundational services, which are in the last row of the table. Cloud Pak foundational services is automatically installed when instances are created. Because the installation process for Cloud Pak foundational services automatically selects the default storage class for your cluster (which is specified in OpenShift by an administrator), confirm that this default storage class meets the requirements for Cloud Pak foundational services.

Table 4: Supported storage options for Cloud Pak for Integration on Microsoft Azure, by instance type

Instance type Azure Disk Azure Files
Platform UI N/A N/A
Automation assets Yes Yes
API Connect cluster Yes No
Event Manager Yes No
Event Gateway N/A N/A
Event Processing Yes No
Integration design Yes Yes
Integration runtime N/A N/A
Integration dashboard No Yes
High speed transfer server No Yes
Kafka cluster Yes No
Messaging server Yes No
Queue manager, single-instance Yes No
Queue manager, native HA Yes No
Queue manager, multi-instance No No
Enterprise gateway N/A N/A
Cloud Pak foundational services Yes No

Limits on persistent volumes for public cloud providers

The number of block storage volumes permitted per public cloud region or data center is typically limited by default to prevent excessive usage. However, you can raise a support ticket with the cloud provider to request a higher limit.

There are also Kubernetes default limits on the number of IaaS-provided block storage volumes that can be attached per worker node in each of the public clouds, as illustrated in the following table. For more information, see Node-specific Volume Limits in the Kubernetes documentation. These per-node volume limits mean that in some environments you must deploy more worker nodes to host particular instances than the CPU or memory requirements alone would imply.

The volume limit applies only to block devices that are directly attached to nodes. The limit does not apply to Software Defined Storage providers such as Red Hat OpenShift Data Foundation and Portworx, or network file systems such as NFS and EFS, because they do not use direct block device attachment.

Table 5: Persistent volume limits for public cloud providers

Public cloud volume provider Volume limit per worker node Details
IBM Cloud Block storage for VPC 12 volumes VPC service limits
AWS Elastic Block Store (EBS) 11-39 volumes depending on instance type Instance volume limits
Azure Disk 4-64 as defined by the "max data disks" per type Azure VM sizes
Google Persistent Disk 16-128 "max number of persistent disks" per type Machine types
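With CSI-based drivers, the attach limit that applies to a given worker node is reported in that node's CSINode object under allocatable.count. An illustrative excerpt (the node name and count are examples, not prescribed values):

```yaml
# Excerpt of a CSINode object. The attach limit that a CSI driver
# reports for this node appears under allocatable.count.
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: worker-node-1        # hypothetical node name
spec:
  drivers:
    - name: ebs.csi.aws.com  # example: the AWS EBS CSI driver
      nodeID: i-0123456789abcdef0
      allocatable:
        count: 25            # maximum volumes this driver can attach here
```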

Using a software-defined storage (SDS) provider

Software-defined storage (SDS) typically involves virtualized data storage that is independent of any underlying hardware. If you currently use an SDS provider or are considering using one, you should consider the tradeoffs of using SDS that has cross-AZ replication (such as ODF, Portworx, or IBM Storage Fusion) with instances in Cloud Pak for Integration that use RWO persistent volume storage.

Instances configured for high-availability in Cloud Pak for Integration that use RWO storage do not require cross-AZ replicated storage as they already provide their own cross-AZ replication.

The implications of using a storage provider with cross-AZ replication for RWO volumes include reduced performance (caused by unnecessary data replication) and increased costs, both for network data transfer between AZs (as in AWS) and for operations personnel. This is why RWO storage without cross-AZ replication may provide the best performance at the lowest cost.

Tip: Instances that use RWX volumes are a separate case because they do require that cross-AZ replication be carried out by the storage provider.

The following diagram shows the cost and performance implications of using an SDS provider that has cross-AZ replicated storage, even though the instances already provide cross-AZ replication:

Figure 1. Cost and performance implications of Software Defined Storage (SDS) providers
Instances with built-in replication that use block RWO volumes, which have extra redundancy (3x2) of replication across AZs by an SDS provider that is not needed
Performance

Instances that use RWO (such as Queue managers with Native HA, or API Connect clusters) already provide their own replication, which results in two copies of the data being sent to other replicas to ensure resilience. When that data is written to the local block storage volume under each instance, a storage provider that uses cross-AZ replication also copies that data to two other locations per volume, which results in a total of eight copies. Typically, the system must wait for each of these write operations to synchronously complete before returning control to the original caller, thus reducing runtime performance.

Network costs

Providers such as AWS charge for data transfer between AZs in a region. If the storage provider is running in your customer account, as is the case with SDS providers, there is a direct financial benefit to right-sizing (in other words, reducing, where possible) the amount of replication that is taking place.

Operational costs

SDS providers require additional hardware capacity (for example, Red Hat OpenShift worker nodes) inside the customer’s account to operate the storage layer. They are also self-managed by the customer, so they have an ongoing cost for the operations personnel that are required to manage and maintain the storage layer.

SDS and replicated storage providers do offer additional flexibility for failover scenarios: a replica of an instance can restart in another AZ if an AZ fails. However, it is important to make an informed choice about the benefits and drawbacks.

The following diagram shows an alternative approach (without cross-AZ replication), with a storage provider (such as IBM Cloud Block, EBS, or Azure disk) that relies on the built-in replication of instances in Cloud Pak for Integration to achieve cross-AZ replication:

Figure 2. Storage provider without cross-AZ replication
Instances with built-in replication that use block RWO volumes, that have a storage layer with a copy on each AZ
