Self-hosted PoP capacity planning and scaling

Plan the capacity of your self-hosted Synthetic PoP deployment and scale up the PoP to support an increasing workload. The PoP size and resources depend on the workload, such as the number of Synthetic tests, the test execution frequency, and the test script complexity.

Capacity planning

A good capacity plan provides better performance and stability at the lowest cost. Instana provides a reference for different PoP sizes and resource allocations based on measurements and projections from standard IBM benchmarks in a controlled environment. Actual throughput or performance varies due to many factors. Follow these steps to improve PoP performance:

  • Plan enough resource allocation based on workload and PoP size estimation.
  • Monitor PoP health status with the Instana Agent in a production environment.
  • Scale up specific playback engines on demand to support an increasing workload.

If you use the default CPU and memory resource allocation for a Synthetic PoP, ensure that your Kubernetes cluster has at least 6 CPU cores and 4.8 GB of memory available in its physical resources. With these resources, you can run 2000 API Simple tests, 100 API Script tests, 15 Browser Script tests, and 2000 ISM tests (SSL/DNS), based on performance benchmark testing with the following test configuration:

| Test type | Frequency | Duration | Complexity | Number of tests |
|---|---|---|---|---|
| API Simple test | 1 minute | ~200 ms | - | 2000 |
| API Script test | 5 minutes | ~800 ms | Issuing 5 HTTP calls | 100 |
| Browser Script test | 15 minutes | ~20 seconds | Opening 2 web pages | 15 |
| SSL test | 1 minute | ~240 ms | - | 2000 |
| DNS test | 1 minute | ~240 ms | - | 2000 |
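
As a rough cross-check of the table, the load that each row places on the playback engines can be estimated with Little's law (average executions in flight = start rate × duration). The following sketch uses only the test counts, frequencies, and durations from the table. The function and variable names are illustrative, and the calculation is not a description of how Instana sizes a PoP.

```python
# Benchmark rows from the table:
# (number of tests, frequency in seconds, typical duration in seconds).
BENCHMARK = {
    "API Simple": (2000, 60, 0.2),
    "API Script": (100, 300, 0.8),
    "Browser Script": (15, 900, 20.0),
    "SSL": (2000, 60, 0.24),
    "DNS": (2000, 60, 0.24),
}


def executions_per_minute(tests, frequency_s):
    """Average number of test executions started per minute."""
    return tests * 60 / frequency_s


def average_concurrency(tests, frequency_s, duration_s):
    """Little's law: average executions in flight = start rate * duration."""
    return tests / frequency_s * duration_s


for name, (tests, frequency_s, duration_s) in BENCHMARK.items():
    rate = executions_per_minute(tests, frequency_s)
    busy = average_concurrency(tests, frequency_s, duration_s)
    print(f"{name}: {rate:.0f} executions/min, ~{busy:.2f} in flight on average")
```

Averages hide bursts, so treat these figures as a lower bound on the parallelism that the playback engines need, not as a sizing result.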

Installing the Instana Agent to monitor your PoP, or scaling up your PoP to improve performance, requires more physical resources in the Kubernetes cluster.

The PoP size and resource allocation estimates come from tests that run a constant set of Synthetic tests at the frequencies shown in the table, using the resource requests and limits from each component's default configuration. For information about the default configuration of Synthetic PoP components, see PoP Helm charts. For information about the default configuration of the Instana Agent and the Kubernetes (K8s) sensor, see Agent Helm charts.

The number of playback engine replicas is calculated from the number of tests and their execution frequency. The number of Instana agents is calculated from the number of Kubernetes worker nodes. In the default configuration, the number of Kubernetes sensors is set to 3. With 3 Kubernetes sensor replicas, you can monitor up to 400 nodes.

Disk space requirements

The following table lists the minimum disk space requirements per PoP size for initial deployment. These requirements are specified per worker node and account for K3s installation, Helm installation, container image storage, and core Synthetic PoP component data.

For a PoP with the default configuration (CPU: 300m, memory: 300Mi), the measured disk usage is as follows:

  • K3s installation: ~1 GB
  • Helm installation: ~1 GB
  • Synthetic PoP pods (all containers): ~7 GB
  • Total: ~9 GB

| PoP size | Minimum disk space per node | Worker nodes | Total minimum disk space | Usage details |
|---|---|---|---|---|
| XSmall | 50 GB | 1 | 50 GB | K3s (~1 GB), Helm (~1 GB), PoP containers (~7 GB), buffer (~41 GB) |
| Small | 100 GB | 3 | 300 GB | K3s (~1 GB), Helm (~1 GB), PoP containers (~20 GB), buffer (~78 GB) per node |
| Medium | 200 GB | 3 | 600 GB | K3s (~1 GB), Helm (~1 GB), PoP containers (~40 GB), buffer (~158 GB) per node |
| Large | 400 GB | 6 | 2.4 TB | K3s (~1 GB), Helm (~1 GB), PoP containers (~80 GB), buffer (~318 GB) per node |

Note: These values represent baselines for infrastructure procurement planning. The actual disk usage for a default configuration PoP is approximately 9 GB, which includes K3s, Helm, and all Synthetic PoP containers. Actual disk usage increases with test execution frequency, log volume, number of monitored targets, and retention policies. The buffer space accounts for log files, temporary data, and system overhead. Monitor disk utilization regularly and expand capacity when usage exceeds 70% to maintain optimal performance.
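
The per-node buffer and the cluster totals in the table follow from simple arithmetic, and the 70% guideline can be checked with the Python standard library. The following is a minimal sketch under stated assumptions: the constants are copied from the table, the helper names are hypothetical, and the path `/` stands in for whichever filesystem holds the K3s and container data on your nodes.

```python
import shutil

# Per-node figures from the table, in GB:
# (minimum disk per node, worker nodes, PoP containers).
POP_SIZES = {
    "XSmall": (50, 1, 7),
    "Small": (100, 3, 20),
    "Medium": (200, 3, 40),
    "Large": (400, 6, 80),
}
K3S_GB = 1
HELM_GB = 1
EXPAND_THRESHOLD = 0.70  # expand capacity when usage exceeds 70%


def buffer_gb(size):
    """Buffer left on each node after K3s, Helm, and the PoP containers."""
    minimum, _nodes, containers = POP_SIZES[size]
    return minimum - K3S_GB - HELM_GB - containers


def total_gb(size):
    """Minimum disk space across all worker nodes."""
    minimum, nodes, _containers = POP_SIZES[size]
    return minimum * nodes


def needs_expansion(used_bytes, total_bytes):
    """True when disk utilization is above the 70% guideline."""
    return used_bytes / total_bytes > EXPAND_THRESHOLD


if __name__ == "__main__":
    for size in POP_SIZES:
        print(f"{size}: buffer {buffer_gb(size)} GB per node, total {total_gb(size)} GB")
    usage = shutil.disk_usage("/")  # assumed mount point; adjust for your nodes
    print("expand capacity:", needs_expansion(usage.used, usage.total))
```

In production, the same threshold is better enforced through your monitoring and alerting than through an ad hoc script. The sketch only makes the arithmetic explicit.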

XSmall installation - Test PoP capabilities

XSmall installation is supported for test or demo purposes. With the default configuration, if your Kubernetes cluster has 6 CPU cores and 4.8 GB of memory available in its physical resources, you can run 10 API Simple tests, 5 API Script tests, 1 Browser Script test, and 10 ISM tests (SSL/DNS), based on performance benchmark testing with the constant test configuration.

To monitor Synthetic PoP with the Instana Agent, ensure that the Kubernetes cluster meets the following minimum requirements:

| Resource | Requirement |
|---|---|
| Worker nodes | 1 |
| CPU | 8 cores |
| Memory | 7.1 GB |
| Disk space | 50 GB per node |

Diagram showing XSmall PoP deployment architecture with 1 worker node, including PoP Controller, Redis, and playback engines

Small installation - Production

Small installation is for production. With the default configuration, if your Kubernetes cluster has 6 CPU cores and 4.8 GB of memory available in its physical resources, you can run 2000 API Simple tests, 20 API Script tests, 5 Browser Script tests, and 2000 ISM tests (SSL/DNS), based on performance benchmark testing with the constant test configuration.

To monitor Synthetic PoP with the Instana Agent, ensure that the Kubernetes cluster meets the following minimum requirements:

| Resource | Requirement |
|---|---|
| Worker nodes | 3 |
| CPU | 12 cores |
| Memory | 11.7 GB |
| Disk space | 100 GB per node |

Diagram showing Small PoP deployment architecture with 3 worker nodes, including PoP Controller, Redis, and playback engines distributed across nodes

Medium installation - Production

Medium installation is for production. To support this workload, you need to scale the playback engines horizontally, tune the PoP controller parameters, and increase the CPU and memory limits. If your Kubernetes cluster has at least 28.4 CPU cores and 21.6 GB of memory available in its physical resources, you can run 3500 API Simple tests, 250 API Script tests, 80 Browser Script tests, and 3500 ISM tests (SSL/DNS), based on performance benchmark testing with the constant test configuration.

To monitor Synthetic PoP with the Instana Agent, ensure that the Kubernetes cluster meets the following minimum requirements:

| Resource | Requirement |
|---|---|
| Worker nodes | 3 |
| CPU | 34.4 cores |
| Memory | 28.5 GB |
| Disk space | 200 GB per node |

Diagram showing Medium PoP deployment architecture with 3 worker nodes, including scaled PoP Controller, Redis, and multiple playback engine replicas

Large installation - Production

Large installation is for production. To support this workload, you need to scale the playback engines horizontally, tune the PoP controller parameters, and increase the CPU and memory limits. If your Kubernetes cluster has at least 64 CPU cores and 48.5 GB of memory available in its physical resources, you can run 7000 API Simple tests, 600 API Script tests, 200 Browser Script tests, and 7000 ISM tests (SSL/DNS), based on performance benchmark testing with the constant test configuration.

To monitor Synthetic PoP with the Instana Agent, ensure that the Kubernetes cluster meets the following minimum requirements:

| Resource | Requirement |
|---|---|
| Worker nodes | 6 |
| CPU | 74.5 cores |
| Memory | 57.7 GB |
| Disk space | 400 GB per node |

Diagram showing Large PoP deployment architecture with 6 worker nodes, including highly scaled PoP Controller, Redis, and numerous playback engine replicas distributed across nodes

Estimate PoP size with automation tool

Instana provides an automation tool, synctl, to help you estimate PoP size and plan capacity. You can use the tool as shown in the following examples:

synctl get pop-size

synctl get size
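
For a quick manual cross-check against the four sizes described earlier, the benchmarked test counts can be compared with a planned workload. The following sketch is not a description of how synctl estimates size, which is not covered here. It assumes that your tests resemble the benchmark configuration in frequency and complexity, and the function name is hypothetical.

```python
# Benchmarked test counts per PoP size, from the installation sections:
# (API Simple, API Script, Browser Script, ISM).
CAPACITY = [
    ("XSmall", (10, 5, 1, 10)),
    ("Small", (2000, 20, 5, 2000)),
    ("Medium", (3500, 250, 80, 3500)),
    ("Large", (7000, 600, 200, 7000)),
]


def smallest_fitting_size(api_simple, api_script, browser_script, ism):
    """Return the first size whose benchmarked counts cover the workload, or None."""
    workload = (api_simple, api_script, browser_script, ism)
    for name, limits in CAPACITY:
        if all(need <= limit for need, limit in zip(workload, limits)):
            return name
    return None  # workload exceeds the Large benchmark


print(smallest_fitting_size(1500, 15, 3, 500))   # Small
print(smallest_fitting_size(1500, 120, 3, 500))  # Medium
print(smallest_fitting_size(8000, 10, 1, 10))    # None
```

A single test type can push the result up a size, as the second call shows: 120 API Script tests exceed the Small benchmark even though every other count fits.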
 

Scaling up

To support increasing workloads, scale up PoP components as follows:

  • Horizontal scaling: increase the number of replicas of the playback engines. This is the simpler approach.
  • Vertical scaling: increase the CPU and memory requests and limits.

Horizontal scaling

Currently, only the playback engines support horizontal scaling. The PoP Controller and Redis do not support it yet. Most of the PoP workload runs on the playback engine pods. To support more workload for a certain type of test, increase the replicas value for the corresponding playback engine, either in the values.yaml file or with --set flags.

The following example increases the JavaScript playback engine to 2 replicas and the BrowserScript playback engine to 3 replicas:

--set javascript.replicas=2
--set browserscript.replicas=3
 

Vertical scaling

Because the JavaScript and BrowserScript playback engines are resource-intensive, you can increase their CPU or memory requests and limits to support more tests.

For example, if the CPU request and limit of the JavaScript playback engine are set to 800m and 1000m (0.8 and 1.0 core), the engine can support 40 API Script tests with the previously described test configuration.
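
Horizontal and vertical scaling both increase the CPU that the cluster must provide, because each replica carries its own requests and limits. The following sketch converts Kubernetes CPU quantities to cores and multiplies them by the replica count. It combines the 800m and 1000m values from the vertical scaling example with the 2 JavaScript replicas from the horizontal scaling example purely for illustration, and the function names are hypothetical.

```python
def cpu_cores(quantity):
    """Convert a Kubernetes CPU quantity such as "800m" or "1.5" to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        # The "m" suffix means millicores: 1000m equals 1 core.
        return int(quantity[:-1]) / 1000
    return float(quantity)


def engine_cpu_total(replicas, request, limit):
    """Total CPU (requests, limits) in cores across all replicas of one engine."""
    return replicas * cpu_cores(request), replicas * cpu_cores(limit)


requested, limited = engine_cpu_total(replicas=2, request="800m", limit="1000m")
print(f"requests: {requested} cores, limits: {limited} cores")
```

Summing these totals over every engine that you scale gives a first estimate of the extra CPU to add to the cluster minimums listed for each PoP size.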

Performance tuning

To support workloads above 2000 Synthetic tests per minute, tune the PoP controller as follows:

  • To support more tests with user credentials, increase the value of the controller.scheduleTestMaxPoolSize parameter. This parameter requires Synthetic PoP 1.1.2 or later.
  • To support more results that are sent to the Instana backend, increase the value of the controller.publishAARThreadPoolSize parameter.

If you set controller.resources.limits.cpu to 500m, controller.resources.limits.memory to 500Mi, controller.publishAARThreadPoolSize to 40, and controller.scheduleTestMaxPoolSize to 15, the PoP controller can support 9000 API Simple tests with user credentials per minute.
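
The tuning values above can be passed to Helm in the same --set form that is used for the replica counts. The following sketch renders them as a flat argument list that you can append to your own Helm install or upgrade command. The release and chart names are deliberately left out because they are not given here, and the helper name is hypothetical.

```python
# Controller settings from the tuning example above.
CONTROLLER_OVERRIDES = {
    "controller.resources.limits.cpu": "500m",
    "controller.resources.limits.memory": "500Mi",
    "controller.publishAARThreadPoolSize": 40,
    "controller.scheduleTestMaxPoolSize": 15,
}


def to_set_flags(overrides):
    """Render a mapping as a flat list of Helm `--set key=value` arguments."""
    flags = []
    for key, value in overrides.items():
        flags.extend(["--set", f"{key}={value}"])
    return flags


print(" ".join(to_set_flags(CONTROLLER_OVERRIDES)))
# 9000 tests per minute expressed as a per-second rate.
print(f"{9000 / 60:.0f} tests per second")
```

Keeping the overrides in one mapping makes it easier to review a tuning change before you apply it, and the same values can be written into the values.yaml file instead.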