Planning your analytics deployment

Plan your API Connect analytics deployment by reviewing options for deployment profile and estimating required persistent storage space.

Deployment profile

Decide which deployment profile you want to use. The available profiles are one replica and three replica. For a scalable analytics deployment, choose the three replica profile. If you do not expect a high API transaction rate and plan to keep a limited amount of analytics data, then the one replica profile is sufficient.
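
For example, on Kubernetes the deployment profile is set in the analytics subsystem CR. The following is a minimal sketch only; the apiVersion and the profile names n1xc2.m16 and n3xc4.m16 are assumptions to confirm against the documentation for your release:
  apiVersion: analytics.apiconnect.ibm.com/v1beta1
  kind: AnalyticsCluster
  metadata:
    name: analytics
  spec:
    # example three replica profile name; a one replica profile such as
    # n1xc2.m16 would be used for smaller deployments
    profile: n3xc4.m16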

Storage class

Before you install analytics, you must choose a storage class. Ceph, Block, and Local Volume are all supported; however, Local Volume is the most suitable for analytics. The analytics subsystem provides internal data replication and HA support, so the additional HA capabilities that Ceph or Block storage might give you are not needed. In addition, analytics requires a high throughput of disk I/O operations to successfully handle the analytics data load from the gateway. Local Volume is the best-performing storage class for disk I/O.

Note: GlusterFS and NFS storage are not supported for analytics. If you use either of these storage classes, you might encounter severe performance degradation, and possibly loss of data.
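
As an illustration, a Kubernetes StorageClass for statically provisioned Local Volumes might look like the following sketch; the class name analytics-local is an example, and the local PVs themselves must be created separately:
  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: analytics-local               # example name; reference it from the analytics CR
  provisioner: kubernetes.io/no-provisioner   # local volumes have no dynamic provisioner
  volumeBindingMode: WaitForFirstConsumer     # delay binding until a pod is scheduled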

Storage type

Note: Storage type selection is available on three replica deployments only.
To make an informed choice on which storage type is best for you, first read and understand the OpenSearch concepts that are described in OpenSearch nodes, indices, shards, and replicas.
API Connect provides two OpenSearch storage types:
  • Shared storage

    A single pod provides data storage and OpenSearch cluster management. Shared storage is the default option, and the only option on a one replica deployment.

  • Dedicated storage
    OpenSearch data storage and cluster management functions are split, and run in separate pods on each worker node:
    • The storage-os-master pod runs an OpenSearch cluster manager-eligible node that manages the OpenSearch cluster, but does not store any API event data.
    • The storage pod runs an OpenSearch data node that stores API event data.
    Dedicated storage facilitates horizontal scaling to support greater analytics data storage and throughput. Add worker nodes and increase the number of storage pod replicas to scale horizontally (you do not need to increase the number of storage-os-master replicas). Dedicated storage also provides greater stability and allows some OpenSearch configuration changes to be made without downtime.
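
A minimal sketch of selecting the storage type in the analytics CR, assuming a spec.storage.type field (confirm the field name and allowed values for your release):
  spec:
    storage:
      # shared is the default, and the only option on a one replica deployment
      type: dedicated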

You can change the storage type after installation; see Dedicated storage and scaling up.

Disabling internal storage

The analytics subsystem provides a complete solution for routing, storing, and viewing the analytics data. However, you might not need the internal storage and viewing solution if you are using a third-party system for data offloading. In this scenario, you can greatly reduce your CPU and memory costs by disabling the internal storage and viewing components.

When you disable internal storage for analytics data, the following microservices are disabled:

  • osinit
  • storage
Important:

If you disable local storage, the analytics views in your API Connect UIs remain accessible, but are empty.

If you want to disable internal storage, you must configure this before installation. It is not possible to disable or re-enable internal storage after installation.

You can enable offload of analytics data to third-party systems after installation.

For the steps to disable internal storage, see Disable local storage.
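
As a sketch, on Kubernetes this is a pre-install setting in the analytics CR; the storage.enabled field shown here is an assumption, so confirm the exact setting in Disable local storage:
  spec:
    storage:
      enabled: false   # assumed field: omits the osinit and storage microservices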

Persistent storage space

Based on your expected API transaction rate, you can estimate how much storage space your analytics subsystem requires. See Estimating storage requirements.

Inter-subsystem communication security

By default, network communication from the management and gateway subsystems to the analytics subsystem uses Kubernetes ingresses or OpenShift routes, and is secured with mTLS. You can configure alternative network communication options if your environment requires them. For more information, see Network requirements for inter-subsystem communication.
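
For example, after installation you can list the endpoints that carry this traffic (substitute your analytics namespace):
  kubectl get ingress -n <namespace>   # Kubernetes
  oc get routes -n <namespace>         # OpenShift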

Kubernetes operating system map counts

When the analytics service is configured to store data, it uses OpenSearch, which requires map counts higher than the operating system defaults. The minimum recommended value is 262144. Unless you plan to disable local storage, increase the default map count on every Kubernetes worker node:
  1. To change the map counts on the live system, run the following command on every Kubernetes node:
    sudo sysctl -w vm.max_map_count=262144
  2. To persist this change when node restarts occur, add the following setting to the /etc/sysctl.conf file:
    vm.max_map_count = 262144
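
To confirm the change, you can read the value back and reload /etc/sysctl.conf without a node restart:
  sysctl vm.max_map_count   # should report 262144
  sudo sysctl -p            # apply /etc/sysctl.conf on the live system
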
For more information, see Important settings in the OpenSearch documentation.

PV requirements

The number of PVs that are needed depends on whether you disable local storage.
Table 1. Analytics PV requirements
  Deployment profile    Local storage enabled    Local storage disabled (ingestion only)
  Single node (n1)      2                        1
  Three node (n3)       6                        3
To understand how much storage capacity your analytics subsystem needs, see Estimating storage requirements.
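
After installation, you can check that the expected number of claims were created and bound (substitute your analytics namespace):
  kubectl get pvc -n <namespace>   # each analytics PVC should report STATUS Bound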

Firewall requirements

For the analytics firewall requirements, see Firewall requirements.