Using BeeInstana on Kubernetes
BeeInstana is a metric database that supports more complex metric queries. Some functionality within Instana depends on BeeInstana to store metric data.
Self-hosted Instana environments on Kubernetes can be configured to use BeeInstana on Kubernetes as described in the Run and configure BeeInstana via Operator topic.
- Prerequisites
- Architecture
- Sizing BeeInstana on Kubernetes
- Deploying the BeeInstana Kubernetes Operator
- Deploying BeeInstana on Kubernetes
- Configuration options
- Determining the file system group ID on Red Hat OpenShift
Prerequisites
Before you deploy BeeInstana, you need to set up the data stores as a high-performing, distributed data store cluster. For more information, see Using third-party Kubernetes Operators.
Architecture
BeeInstana has four components:
- Kubernetes Operator: The Kubernetes Operator handles deploying, updating, and resizing BeeInstana as configured.
- Configuration service: The configuration service provides the cluster information to the aggregators and the ingestors.
- Ingestor: The ingestor consumes metric data from a Kafka topic, processes it, and sends it to the appropriate aggregator.
- Aggregator: The aggregator writes the metric data to the disk and processes the queries that are made for the stored data.
Sizing BeeInstana on Kubernetes
The amount of resources that BeeInstana needs is based on the number of metrics to be stored. If this number is not known in advance, you can use the following formula to get a rough estimate as a starting point:
number_of_metrics = number_of_hosts * 100 * 50
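As a worked example of this formula, the estimate can be checked with a quick shell calculation. The 400-host figure here is hypothetical and is chosen because it yields the 2 million metrics used in the sizing table below:

```shell
# Rough estimate: hosts * 100 * 50, per the formula above.
estimate_metrics() {
  echo $(( $1 * 100 * 50 ))
}

estimate_metrics 400   # prints 2000000, matching the 2 million metric sizing example
```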
To store 2 million metrics with one aggregator and one ingestor, start with the following resources:
Component | CPU | RAM | Volume Size |
---|---|---|---|
Ingestor | 16 cores | 12 GiB | None |
Aggregator | 16 cores | 128 GiB | 5 TiB |
You can modify these resources to suit the environment while BeeInstana is running.
For the ingestor, it is best to allocate double the average CPU usage to account for spikes in the incoming metrics. Memory usage is normally consistent, but extra memory is needed to keep data in memory while an aggregator is being updated.
Ingestors can be scaled horizontally. You can add or remove the replicas, as needed.
For the aggregator, it is best to allocate double the average CPU usage to account for user-driven requests. The aggregator makes extensive use of the file system cache and eventually uses all the memory that it is given. Check the total resident set size to determine how much memory the aggregator requires, and allocate double that amount; otherwise, query performance and stability suffer.
Aggregator pods are organized into shards. Each aggregator pod within a shard is referred to as a mirror, and stores its own copy of the data in the shard.
Aggregators can be scaled horizontally by increasing the number of shards. The number of shards can be increased, but never decreased; the aggregator does not support it.
The number of mirrors in an aggregator shard affects the availability of BeeInstana. Set the number of mirrors to 2 so that the cluster loses no data and remains available when one aggregator mirror is down at a time.
Note: If the number of metrics remains constant, then the aggregator volume size will be steady after 13 months.
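These sizing choices map directly onto fields of the BeeInstana resource that is shown in full in the deployment section. A minimal sketch of the aggregator section for one shard with two mirrors:

```yaml
spec:
  aggregator:
    shards: 1    # can be increased later, but never decreased
    mirrors: 2   # one mirror can be down without data loss
```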
Deploying the BeeInstana Kubernetes Operator
The BeeInstana Kubernetes Operator is provided as a Helm chart.
To acquire the Helm chart and deploy the BeeInstana Kubernetes Operator, complete the following steps:
1. On the host machine that is used to access the Kubernetes cluster, add the Instana Helm repository by using the following command. Replace `<agent_key>` with the Instana agent key that was provided to you:

   helm repo add instana https://helm.instana.io/artifactory/rel-helm-customer-virtual --username _ --password <agent_key>

2. On the Kubernetes cluster, create a namespace to deploy BeeInstana by using the following command:

   kubectl create namespace beeinstana

3. Create an image pull secret in the created namespace by using the following command. Replace `<agent_key>` with the Instana agent key that was provided to you:

   kubectl create secret docker-registry instana-registry --namespace=beeinstana --docker-server=artifact-public.instana.io --docker-username _ --docker-password=<agent_key>

4. Inspect the Kubernetes resources before they are deployed by using the following command:

   helm template instana/beeinstana-operator --name-template=beeinstana --namespace=beeinstana

5. Perform one of the following steps:

   - For a standard Kubernetes cluster or a cluster on Red Hat OpenShift 4.10, deploy the Operator into the created namespace by using the following command:

     helm install beeinstana instana/beeinstana-operator --namespace=beeinstana

   - For a cluster on Red Hat OpenShift 4.11 and later versions, deploy the Operator into the created namespace by using the following command:

     helm install beeinstana instana/beeinstana-operator --namespace=beeinstana --set operator.securityContext.seccompProfile.type=RuntimeDefault

6. Retrieve the pods in the created namespace by using the following command:

   kubectl get pod --namespace=beeinstana

7. Validate that one BeeInstana Operator pod is present and running. See the following example:

   beeinstana-beeinstana-operator-569999cbfc-5ftf8   1/1   Running   0   1m
Deploying BeeInstana on Kubernetes
After you deploy the BeeInstana Kubernetes Operator, you can deploy BeeInstana itself. Create a file to store the BeeInstana configuration. Save this configuration file and use it to update BeeInstana in the future. You can start with the following configuration:
apiVersion: beeinstana.instana.com/v1beta1
kind: BeeInstana
metadata:
  name: instance
  namespace: beeinstana
spec:
  version: 1.1.2
  config:
    cpu: 200m
    memory: 200Mi
    replicas: 1
  ingestor:
    brokerList: cluster.kafka.svc:9092
    cpu: 8
    memory: 4Gi
    limitMemory: true
    env: on-prem
    metricsTopic: raw_metrics
    replicas: 1
  aggregator:
    cpu: 4
    memory: 16Gi
    limitMemory: true
    mirrors: 2
    shards: 1
    volumes:
      live:
        size: 2000Gi
        storageClass: standard
Change the configuration as necessary. Ensure that the spec.ingestor.brokerList option is set to the network addresses of the Kafka brokers. For more information, see the configuration options.
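Because spec.ingestor.brokerList is a comma-separated list, an environment with several Kafka brokers might reference them as follows (the broker hostnames here are hypothetical):

```yaml
spec:
  ingestor:
    brokerList: kafka-0.kafka.svc:9092,kafka-1.kafka.svc:9092,kafka-2.kafka.svc:9092
```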
To deploy BeeInstana with the configuration file that was created, complete the following steps:
1. Apply the created configuration file into the same namespace where the BeeInstana Kubernetes Operator is deployed, as shown in the following command:

   kubectl apply -f /path/to/configuration/file.yaml --namespace beeinstana

2. Wait for the BeeInstana Operator to deploy BeeInstana. To validate the installation, get the pod status in the same namespace by running the following command:

   kubectl get pods --namespace beeinstana

   Validate that pods for all of the components are running. See the following example:

   NAME                                              READY   STATUS    RESTARTS   AGE
   aggregator-0-0                                    1/1     Running   0          2m
   aggregator-0-1                                    1/1     Running   0          75s
   beeinstana-beeinstana-operator-79c8fd74c8-vchh6   1/1     Running   0          169m
   config-546999994-rfvnc                            1/1     Running   0          31m
   ingestor-9c569cbf9-ncrzs                          1/1     Running   0          39s

3. Check the deployment by running the following command:

   kubectl get beeinstana instance --namespace beeinstana -o yaml

   If the deployment is successful, the status.reconciledAt property is a recent timestamp within the past minute, the status.version property is the configured version, and the status.canBeReconciled property is true. See the following example:

   ...
   status:
     ...
     canBeReconciled: true
     reconciledAt: "2023-03-30T22:47:08Z"
     version: 1.1.2
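This status check can also be scripted with basic shell tools. The following sketch extracts the canBeReconciled field from the YAML output; the kubectl output is stubbed as a variable here so that the parsing logic is self-contained:

```shell
# Stub of the status section from `kubectl get beeinstana instance -o yaml`
status_yaml='
status:
  canBeReconciled: true
  reconciledAt: "2023-03-30T22:47:08Z"
  version: 1.1.2
'

# Extract the canBeReconciled value
can_reconcile=$(printf '%s' "$status_yaml" | sed -n 's/^ *canBeReconciled: *//p')

if [ "$can_reconcile" = "true" ]; then
  echo "BeeInstana reconciled successfully"
fi
```

In practice, the stub variable would be replaced by the real kubectl command output.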
Then, you need to configure Instana to use the deployed BeeInstana. For more information, see Run and configure BeeInstana via Operator.
Configuration options
You can customize the values of the following keys in the configuration file that is used for deploying BeeInstana on Kubernetes:
Note: Many of the following keys are related to sizing. For how to determine an appropriate value for a key, see the sizing section.
Key | Value Type | Description |
---|---|---|
spec.version | string | Determines the version of BeeInstana to be deployed. |
spec.fsGroup | integer | Determines the file system group set for persistent volumes. On Red Hat OpenShift, the value must be set as described in the Determining the file system group ID on Red Hat OpenShift topic. Otherwise, leave the value unset. |
spec.seccompProfile.type | string | Determines the seccomp profile type that is used. On Red Hat OpenShift 4.11 and later versions, the value must be set to RuntimeDefault. On Red Hat OpenShift 4.10 and earlier versions, leave the value unset. |
spec.aggregator.cpu | string | Determines the CPU that is requested by the aggregator pods. |
spec.aggregator.memory | string | Determines the memory that is requested by the aggregator pods. |
spec.aggregator.limitMemory | boolean | Limits the memory that an aggregator pod is allowed to use. Set this value to true if the aggregator pod is running on a shared node. |
spec.aggregator.mirrors | integer | Determines the amount of data replication and availability. When this value is set to 2, BeeInstana can tolerate one aggregator pod going down without impacting availability. |
spec.aggregator.shards | integer | Determines the number of shards in the cluster and is used to horizontally scale the aggregator. The value of this key cannot be decreased. |
spec.aggregator.volumes.live.size | string | Determines the initial size of the persistent volume claim when the aggregator pods are created. To increase the size of the persistent volume claim after creation, modify the claim directly. |
spec.aggregator.volumes.live.storageClass | string | Determines the storage class of the persistent volume claim. |
spec.ingestor.brokerList | string | A comma-separated list of the Kafka broker network addresses. |
spec.ingestor.cpu | string | Determines the CPU that is requested by the ingestor pods. |
spec.ingestor.memory | string | Determines the memory that is requested by the ingestor pods. |
spec.ingestor.limitMemory | boolean | Limits the memory that an ingestor pod is allowed to use. Set this option to true if the ingestor pod is running on a shared node. |
spec.ingestor.replicas | integer | Used to horizontally scale the ingestor. The value of this key can be increased or decreased as you need. |
spec.ingestor.env | string | Used as a part of the Kafka consumer name to ensure uniqueness. |
spec.ingestor.metricsTopic | string | Determines the Kafka topic name that the ingestor pods consume from. |
spec.config.cpu | string | Determines the CPU that is requested by the configuration service pods. |
spec.config.memory | string | Determines the memory that is requested by the configuration service pods. |
spec.config.replicas | integer | Controls the availability of the configuration service pods. |
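For example, scaling the ingestor out is done by editing the saved configuration file and re-applying it with kubectl apply; the replica count of 3 here is illustrative:

```yaml
spec:
  ingestor:
    replicas: 3   # safe to increase or decrease, unlike aggregator shards
```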
Determining the file system group ID on Red Hat OpenShift
Red Hat OpenShift requires that file system groups are within a range of values specific to the namespace. On the cluster where the BeeInstana Operator was deployed, run the following command:
kubectl get namespace beeinstana -o yaml
An output similar to the following example is shown for the command:
apiVersion: v1
kind: Namespace
metadata:
annotations:
...
openshift.io/sa.scc.supplemental-groups: 1000000000/10000
name: beeinstana
The openshift.io/sa.scc.supplemental-groups annotation contains the range of allowed IDs. The range 1000000000/10000 indicates 10,000 values starting with ID 1000000000, so it specifies the range of IDs from 1000000000 to 1000009999.
In this example, the value 1000000000 might be used as a file system group ID.
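The START/COUNT arithmetic can be sketched in shell; the annotation value is copied from the example output above:

```shell
# Expand an openshift.io/sa.scc.supplemental-groups value of the form START/COUNT
range="1000000000/10000"

start=${range%/*}                 # 1000000000
count=${range#*/}                 # 10000
end=$(( start + count - 1 ))      # 1000009999

echo "allowed fsGroup IDs: ${start}-${end}"
```

The first ID in the range (here 1000000000) is then a valid value for spec.fsGroup.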