Tuning FileNet Content Manager components using automatic horizontal scaling

You can scale up or scale down the number of pods in a Content Platform Engine deployment by using the auto-scaling feature.

The operator supports a feature called automatic scaling, or auto-scaling. With this feature, the Content Platform Engine deployment can increase or decrease the number of deployed pods based on the resource requirements of the workload. When the Content Platform Engine deployment has a resource-intensive workload and the deployed pods already have a high CPU utilization, more pods are deployed, up to the configured maximum. If the deployment has a lighter workload and the deployed pods have a low CPU utilization, the number of deployed pods is reduced, down to the configured minimum.

Note that the workloads that are handled by a Content Platform Engine deployment generally work best with vertical scaling, where more resources are added to pods, rather than horizontal scaling, with more pods added.

To allow the Content Platform Engine to handle varying workloads, the Horizontal Pod Autoscaler (HPA) created by the operator increases and decreases the number of pods automatically.

The number of pods must be decreased slowly to ensure that the work being performed by a specific pod completes before the pod is shut down. During the shutdown period, the front-end Kubernetes load balancer automatically routes incoming client requests to a pod that is still active. Enqueued background and asynchronous work is assigned to Content Platform Engine servers or pods by the system.

If a pod stops and the workload that is handled by the pod is unable to complete before the pod terminates, the Content Platform Engine reassigns the workload to another pod. This reassignment might delay completion of the workload. Additionally, loading the required data for the assigned work into the cache of the new pod might take some time. As a result, scaling down pods too quickly can degrade the performance of the Content Platform Engine deployment.

If auto_scaling is enabled in the CR for a FileNet Content Manager component such as CPE, ICN, or GraphQL, the operator creates an HPA for that component. Parameters specify the minimum and maximum number of replicas, and the target CPU utilization that the HPA uses to evaluate the need for scaling up or scaling down. When auto_scaling is enabled, the replicas parameter is ignored and the HPA controls the number of replicas.
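The following minimal Python sketch illustrates how the minimum, maximum, and target CPU utilization settings interact. It applies the replica formula that is documented for the Kubernetes HPA, desired = ceil(current replicas x current utilization / target utilization), and clamps the result to the configured limits. The function name, the parameter names, and the sample values are hypothetical and are not operator settings. The sketch also omits details of the real controller, such as how pods that are not yet ready are treated.

```python
import math


def desired_replicas(current_replicas, current_cpu_pct, target_cpu_pct,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation for a CPU utilization target.

    All names here are illustrative. The tolerance of 0.1 mirrors the
    Kubernetes default; a cluster administrator can change it.
    """
    ratio = current_cpu_pct / target_cpu_pct
    if abs(ratio - 1.0) <= tolerance:
        # Utilization is close enough to the target: keep the current count.
        desired = current_replicas
    else:
        # Multiply before dividing to limit floating-point rounding effects.
        desired = math.ceil(current_replicas * current_cpu_pct / target_cpu_pct)
    # The HPA never goes below the minimum or above the maximum.
    return max(min_replicas, min(max_replicas, desired))
```

For example, with 2 replicas at 90% average CPU utilization, a 60% target, and limits of 1 and 4, the function returns 3. With the same limits and 300% utilization, the computed value of 10 is capped at the maximum of 4.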

The HPA that the operator creates for the Content Platform Engine deployment includes scaling policies that are designed to work well for typical workloads. The scaling policies control the rate at which the number of replicas changes while scaling up or down. These policies create a stabilization window that restricts rapid changes in the number of replicas when the average CPU utilization for the Content Platform Engine deployment keeps fluctuating. The Kubernetes auto-scaling algorithm uses the stabilization window to take previously computed desired states into account, which prevents the number of pods from rapidly scaling up and down.
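The effect of a stabilization window on scale-down can be sketched in Python as follows. In Kubernetes, when metrics suggest scaling down, the algorithm looks at the desired states computed within the window and uses the highest one, so a brief dip in CPU utilization does not remove pods immediately. The class name, the method name, and the 300-second window used in the example are assumptions made for illustration; they do not describe the operator's actual configuration.

```python
from collections import deque


class StabilizedScaler:
    """Hypothetical helper that damps scale-down recommendations."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.history = deque()  # entries are (timestamp, recommended_replicas)

    def stabilize_scale_down(self, now, recommendation):
        """Record a recommendation and return the stabilized replica count."""
        self.history.append((now, recommendation))
        # Discard recommendations that are older than the window.
        while self.history and self.history[0][0] < now - self.window_seconds:
            self.history.popleft()
        # Scale-down follows the highest recommendation still in the window.
        return max(replicas for _, replicas in self.history)
```

With a 300-second window, a recommendation of 4 replicas at time 0 followed by recommendations of 2 replicas at 100 and 200 seconds still yields 4. Only after the older recommendation falls out of the window, for example at 301 seconds, does the stabilized value drop to 2.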

Autoscaling settings

In the HPA, periodSeconds indicates the length of time in the past for which a scaling policy must hold true. The operator configures the HPA with a periodSeconds value of 300 seconds (5 minutes) for scale-down and a periodSeconds value of 180 seconds (3 minutes) for scale-up.

Note: Content Search Services (CSS) does not support the auto-scaling feature. To add or remove CSS instances, modify the ecm_configuration.css.replicas parameter in the CR to direct the operator to alter the deployment.

Unlike Content Platform Engine, FileNet Content Manager client applications, such as IBM Business Automation Navigator and Content Services GraphQL, do not need to scale down slowly. They therefore use the default HPA scaling policies of the Kubernetes cluster.

Refer to the IBM Software Product Compatibility Reports for the specific Kubernetes versions that support this feature.

For more information about the Kubernetes HPA, see the Horizontal Pod Autoscaler topic on the Kubernetes community website.