Self-hosted PoP capacity planning and scaling

Capacity planning

This topic describes how to plan the capacity of a self-hosted Synthetic PoP deployment and how to scale up the PoP to support an increasing workload. The PoP size and resource requirements depend on the workload, such as the number of Synthetic tests, the test execution frequency, and the test script complexity. A good capacity plan gives you better performance and stability at the lowest cost. Instana provides reference PoP sizes and resource allocations that are based on measurements and projections from standard IBM benchmarks in a controlled environment. Actual throughput or performance varies due to many factors. Follow these steps to improve PoP performance:

  • Plan enough resource allocation based on the workload and the PoP size estimation.
  • Monitor the PoP health status with the Instana Agent in a production environment.
  • Scale up specific playback engines on demand to support an increasing workload.

If you use the default CPU and memory resource allocation for a Synthetic PoP, ensure that your Kubernetes cluster has at least 6 CPU cores and 4.8 GB of memory available in its physical resources. With these resources, you can run 2000 API Simple tests, 100 API Script tests, 15 Browser Script tests, and 2000 ISM tests, based on performance benchmark testing with the following test configuration:

Test type            Frequency    Duration      Complexity             Number of tests
API Simple test      1 minute     ~200 ms       -                      2000
API Script test      5 minutes    ~800 ms       Issuing 5 HTTP calls   100
Browser Script test  15 minutes   ~20 seconds   Opening 2 web pages    15
ISM test             1 minute     ~240 ms       -                      2000
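
To compare your own workload with this benchmark, it can help to convert the table into test runs per minute. The following Python sketch does that arithmetic for the benchmark rows. It also estimates how many runs are in flight on average by multiplying the run rate by the duration. The function names are illustrative and not part of any Instana tool, and the durations are the approximate values from the table.

```python
# Rows from the benchmark table: (test type, number of tests,
# frequency in minutes, approximate duration in milliseconds).
BENCHMARK = [
    ("API Simple", 2000, 1, 200),
    ("API Script", 100, 5, 800),
    ("Browser Script", 15, 15, 20_000),
    ("ISM", 2000, 1, 240),
]


def executions_per_minute(count, frequency_minutes):
    """Average number of test runs started per minute."""
    return count / frequency_minutes


def average_concurrency(count, frequency_minutes, duration_ms):
    """Average number of runs in flight: run rate multiplied by duration."""
    runs_per_minute = executions_per_minute(count, frequency_minutes)
    return runs_per_minute * duration_ms / 60_000


for name, count, freq, duration_ms in BENCHMARK:
    rate = executions_per_minute(count, freq)
    busy = average_concurrency(count, freq, duration_ms)
    print(f"{name}: {rate:.0f} runs/min, about {busy:.2f} runs in flight")

total = sum(executions_per_minute(c, f) for _, c, f, _ in BENCHMARK)
print(f"Total: {total:.0f} runs/min")
```

The estimate is only an average. Real workloads are bursty, so treat the result as a rough comparison point rather than a sizing guarantee.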

If you install the Instana Agent to monitor your PoP, or scale up your PoP to improve performance, more physical resources must be available in the Kubernetes cluster.

The PoP size and resource allocation estimates come from testing with a constant set of Synthetic tests at the frequencies shown in the table, using the resource requests and limits from each component's default configuration. For information on the default configuration of Synthetic PoP components, see PoP helm charts. Similarly, for information on the default configuration of the Instana Agent and Kubernetes (K8s) sensor, see Agent helm charts.

The number of playback engine replicas is calculated based on the number of tests and their frequency of execution. The number of Instana agents is calculated based on the number of Kubernetes worker nodes. In the default configuration, the number of Kubernetes sensors is set to 3. With 3 replicas of Kubernetes sensors, you can monitor up to 400 nodes.

XSmall installation - to test PoP capabilities

XSmall installation is supported for test or demo purposes. With the default configuration, if your Kubernetes cluster has 6 CPU cores and 4.8 GB of memory available in its physical resources, you can run 10 API Simple tests, 5 API Script tests, 1 Browser Script test, and 10 ISM tests, based on performance benchmark testing with the constant test configuration.

To use the Instana Agent to monitor Synthetic PoP, ensure that a Kubernetes cluster with 1 worker node has 8 CPU cores and 7.1 GB of memory available in its physical resources.

Small installation - production

Small installation is for production. With the default configuration, if your Kubernetes cluster has 6 CPU cores and 4.8 GB of memory available in its physical resources, you can run 2000 API Simple tests, 20 API Script tests, 5 Browser Script tests, and 2000 ISM tests, based on performance benchmark testing with the constant test configuration.

To use the Instana Agent to monitor Synthetic PoP, ensure that a Kubernetes cluster with 3 worker nodes has 12 CPU cores and 11.7 GB of memory available in its physical resources.

Medium installation - production

Medium installation is for production. To support this workload, you need to scale up playback engines horizontally, tune PoP controller parameters, and increase the CPU and memory limits. If your Kubernetes cluster has at least 28.4 CPU cores and 21.6 GB of memory available in its physical resources, you can run 3500 API Simple tests, 250 API Script tests, 80 Browser Script tests, and 3500 ISM tests, based on performance benchmark testing with the constant test configuration.

To use the Instana Agent to monitor Synthetic PoP, ensure that a Kubernetes cluster with 3 worker nodes has 34.4 CPU cores and 28.5 GB of memory available in its physical resources.

Large installation - production

Large installation is for production. To support this workload, you need to scale up playback engines horizontally, tune PoP controller parameters, and increase the CPU and memory limits. If your Kubernetes cluster has at least 64 CPU cores and 48.5 GB of memory available in its physical resources, you can run 7000 API Simple tests, 600 API Script tests, 200 Browser Script tests, and 7000 ISM tests, based on performance benchmark testing with the constant test configuration.

To use the Instana Agent to monitor Synthetic PoP, ensure that a Kubernetes cluster with 6 worker nodes has 74.5 CPU cores and 57.7 GB of memory available in its physical resources.
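
The four reference sizes can be read as a lookup table. The following Python sketch, with illustrative class and function names that are not part of any Instana tool, returns the smallest reference size whose benchmarked test counts cover a given workload. It assumes that your tests run at the benchmark frequencies and complexity, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PopSize:
    name: str
    api_simple: int        # benchmarked number of API Simple tests
    api_script: int        # benchmarked number of API Script tests
    browser_script: int    # benchmarked number of Browser Script tests
    ism: int               # benchmarked number of ISM tests
    cpu_cores: float       # CPU cores needed without the Instana Agent
    memory_gb: float       # memory needed without the Instana Agent
    agent_cpu_cores: float # CPU cores needed with the Instana Agent
    agent_memory_gb: float # memory needed with the Instana Agent


# Figures copied from the size descriptions above, smallest first.
SIZES = [
    PopSize("XSmall", 10, 5, 1, 10, 6, 4.8, 8, 7.1),
    PopSize("Small", 2000, 20, 5, 2000, 6, 4.8, 12, 11.7),
    PopSize("Medium", 3500, 250, 80, 3500, 28.4, 21.6, 34.4, 28.5),
    PopSize("Large", 7000, 600, 200, 7000, 64, 48.5, 74.5, 57.7),
]


def pick_size(api_simple=0, api_script=0, browser_script=0, ism=0):
    """Return the smallest reference size that covers the workload, or None."""
    for size in SIZES:
        if (api_simple <= size.api_simple
                and api_script <= size.api_script
                and browser_script <= size.browser_script
                and ism <= size.ism):
            return size
    return None  # workload exceeds the largest benchmarked size


size = pick_size(api_simple=500, api_script=50, browser_script=10)
if size is not None:
    print(f"{size.name}: {size.cpu_cores} cores, {size.memory_gb} GB "
          f"({size.agent_cpu_cores} cores, {size.agent_memory_gb} GB with agent)")
```

A result of None means that the workload is larger than any benchmarked size, so the reference figures do not apply directly.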

Estimate PoP size with automation tool

Instana provides an automation tool, synctl, to help you estimate PoP size and plan capacity. You can use the tool as shown in the following examples:

synctl get pop-size

synctl get size

Scaling up

To support increasing workloads, scale up PoP components as follows:

  • Horizontal scaling: increase the number of replicas of the different playback engines. This is the simpler approach.
  • Vertical scaling: increase the CPU and memory requests and limits.

Horizontal scaling

Currently, only playback engines support horizontal scaling. The PoP controller and Redis do not support horizontal scaling yet. The PoP workload runs mainly on the playback engine pods. To support a larger workload for a certain type of test, increase the number of replicas of the corresponding playback engine in the values.yaml file.

The following example uses Helm --set options to increase the JavaScript playback engine to 2 replicas and the BrowserScript playback engine to 3 replicas:

--set javascript.replicas=2
--set browserscript.replicas=3
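
If you manage replica counts for several engines, you can generate these options from one place. The following Python sketch builds the --set arguments from a dictionary. The replica_flags helper is hypothetical, and the only parameter names it uses are the javascript.replicas and browserscript.replicas names from the example above.

```python
def replica_flags(replicas):
    """Build Helm-style --set arguments from {engine name: replica count}."""
    flags = []
    for engine, count in replicas.items():
        if count < 1:
            raise ValueError(f"{engine}: replica count must be at least 1")
        flags += ["--set", f"{engine}.replicas={count}"]
    return flags


# Engine names as used in the example above.
flags = replica_flags({"javascript": 2, "browserscript": 3})
print(" ".join(flags))
```

The resulting list can be appended to the argument list of the Helm command that you already use to install or upgrade the PoP.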

Vertical scaling

Because the JavaScript and BrowserScript playback engines are resource-intensive, you can increase their CPU or memory requests and limits to support more tests.

For example, if the CPU request and limit of the JavaScript playback engine are set to 800m and 1000m (0.8 and 1.0 core), the engine can support 40 API Script tests with the previously described test configuration.
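
Kubernetes expresses CPU quantities either in whole or fractional cores or in millicores with an m suffix, where 1000m equals 1 core. The following Python sketch converts such values to cores so that requests and limits can be added up against the cluster capacity. The cpu_cores function name is illustrative, and the sketch handles only plain numbers and the m suffix.

```python
def cpu_cores(quantity):
    """Convert a Kubernetes CPU quantity such as "800m" or "1" to cores."""
    text = quantity.strip()
    if text.endswith("m"):
        # Millicores: 1000m is one core.
        return int(text[:-1]) / 1000
    return float(text)


request = cpu_cores("800m")
limit = cpu_cores("1000m")
print(f"request: {request} core, limit: {limit} core")
```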

Performance tuning

To support workloads above 2000 Synthetic tests per minute, tune the PoP controller as follows:

  • To support more tests with user credentials, increase the value of the controller.scheduleTestMaxPoolSize parameter. This parameter requires Synthetic PoP 1.1.2 or later.
  • To support more results that are sent to the Instana backend, increase the value of the controller.publishAARThreadPoolSize parameter.

If you set controller.resources.limits.cpu to 500m, controller.resources.limits.memory to 500Mi, controller.publishAARThreadPoolSize to 40, and controller.scheduleTestMaxPoolSize to 15, the PoP controller can support 9000 API Simple tests with user credentials per minute.
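
The tuning guidance can be summarized as a simple decision. The following Python sketch returns no overrides for workloads of up to 2000 tests per minute and the benchmarked parameter set for larger workloads. The constant and function names are illustrative. Applying the same values to every workload between 2000 and 9000 tests per minute is an assumption, because the 9000 figure was measured for API Simple tests with user credentials only.

```python
# Figures quoted in the text above.
DEFAULT_LIMIT_PER_MINUTE = 2000  # above this, tune the PoP controller
TUNED_LIMIT_PER_MINUTE = 9000    # API Simple tests with user credentials

TUNED_CONTROLLER_VALUES = {
    "controller.resources.limits.cpu": "500m",
    "controller.resources.limits.memory": "500Mi",
    "controller.publishAARThreadPoolSize": 40,
    "controller.scheduleTestMaxPoolSize": 15,
}


def controller_overrides(tests_per_minute):
    """Return the controller overrides suggested by the benchmark figures."""
    if tests_per_minute <= DEFAULT_LIMIT_PER_MINUTE:
        return {}  # default controller configuration
    if tests_per_minute > TUNED_LIMIT_PER_MINUTE:
        raise ValueError("workload exceeds the benchmarked figure")
    return dict(TUNED_CONTROLLER_VALUES)


for key, value in controller_overrides(5000).items():
    print(f"--set {key}={value}")
```

For workloads above the benchmarked figure, the sketch raises an error instead of guessing, because the source gives no measured values for that range.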