Using BeeInstana on Kubernetes

BeeInstana is a metric database that supports more complex metric queries. Some Instana functionality depends on BeeInstana to store metric data.

Self-hosted Instana environments on Kubernetes can be configured to use BeeInstana on Kubernetes as described in the Run and configure BeeInstana via Operator topic.

Prerequisites

Before you deploy BeeInstana, you need to set up the data stores as a high-performing, distributed data store cluster. For more information, see Using third-party Kubernetes Operators.

Architecture

BeeInstana has four components:

  • Kubernetes Operator: The Kubernetes Operator handles deploying, updating, and resizing BeeInstana as configured.
  • Configuration service: The configuration service provides the cluster information to the aggregators and the ingestors.
  • Ingestor: The ingestor consumes metric data from a Kafka topic, processes it, and sends it to the appropriate aggregator.
  • Aggregator: The aggregator writes the metric data to disk and processes the queries that are made against the stored data.

Sizing BeeInstana on Kubernetes

The resources that BeeInstana needs depend on the number of metrics to be stored. If this number is not known in advance, you can use the following calculation as a rough starting estimate:

number_of_metrics = number_of_hosts * 100 * 50
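
As a worked example of this estimate, the following shell sketch evaluates the formula for a given host count. The function name estimate_metrics is hypothetical and used only for illustration; the multipliers 100 and 50 are taken unchanged from the formula.

```shell
# Rough starting estimate from the formula:
#   number_of_metrics = number_of_hosts * 100 * 50
# estimate_metrics is a hypothetical helper name, not part of BeeInstana.
estimate_metrics() {
  number_of_hosts=$1
  echo $((number_of_hosts * 100 * 50))
}

# Example: 400 monitored hosts
estimate_metrics 400
```

For 400 hosts, the estimate is 2000000 metrics.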

To store 2 million metrics with one aggregator and one ingestor, start with the following resources:

Component    CPU        RAM       Volume size
Ingestor     16 cores   12 GiB    None
Aggregator   16 cores   128 GiB   5 TiB

You can adjust these resources to suit your environment while BeeInstana is running.

For the ingestor, allocate double the average CPU usage to account for spikes in incoming metrics. Memory usage is normally consistent, but extra memory is needed to hold data in memory while an aggregator is being updated.

Ingestors can be scaled horizontally. You can add or remove replicas as needed.

For the aggregator, allocate double the average CPU usage to account for user-driven requests. The aggregator makes extensive use of the file system cache and eventually uses all the memory that is provided. To determine how much memory the aggregator requires, check its total resident set size. Allocate double the resident set size; less memory degrades query performance and stability.
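
The doubling rule for CPU and aggregator memory can be sketched as simple arithmetic. This is only an illustration: the helper name double_for_headroom and the observed values are hypothetical, and you need to measure the real averages in your own environment.

```shell
# Doubling rule for CPU (ingestor and aggregator) and aggregator memory.
# Inputs are observed values that you measure yourself; the helper name
# and the sample numbers below are hypothetical.
double_for_headroom() {
  observed=$1
  echo $((observed * 2))
}

avg_cpu_millicores=3500      # hypothetical observed average CPU
aggregator_rss_mib=30000     # hypothetical observed resident set size

echo "cpu request:    $(double_for_headroom "$avg_cpu_millicores")m"
echo "memory request: $(double_for_headroom "$aggregator_rss_mib")Mi"
```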

Aggregator pods are organized into shards. Each aggregator pod within a shard is referred to as a mirror, and stores its own copy of the data in the shard.

Aggregators can be scaled horizontally by increasing the number of shards. The number of shards can be increased but never decreased.

The number of mirrors in an aggregator shard affects the availability of BeeInstana. Set the number of mirrors to 2. With two mirrors, the cluster does not lose data or availability as long as only one aggregator mirror is down at a time.
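
To make the relationship between shards and mirrors concrete, the following sketch lists the aggregator pods that a given combination produces: one pod per mirror per shard. The helper name list_aggregator_pods is hypothetical, and the pod naming pattern aggregator-<shard>-<mirror> is an assumption inferred from the example pod listing later in this topic, not a documented guarantee.

```shell
# Number of aggregator pods = shards * mirrors, because every mirror
# in a shard is its own pod with its own copy of the shard data.
# The aggregator-<shard>-<mirror> naming is an assumption (see lead-in).
list_aggregator_pods() {
  shards=$1
  mirrors=$2
  s=0
  while [ "$s" -lt "$shards" ]; do
    m=0
    while [ "$m" -lt "$mirrors" ]; do
      echo "aggregator-$s-$m"
      m=$((m + 1))
    done
    s=$((s + 1))
  done
}

# Example: 2 shards with 2 mirrors each gives 4 pods
list_aggregator_pods 2 2
```

Because each mirror stores its own copy of the shard data, the total volume capacity that is needed grows with the product of shards and mirrors.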

Note: If the number of metrics remains constant, the aggregator volume size stabilizes after 13 months.

Deploying the BeeInstana Kubernetes Operator

The BeeInstana Kubernetes Operator is provided as a Helm chart.

To acquire the Helm chart and deploy the BeeInstana Kubernetes Operator, complete the following steps:

  1. On the host machine that is used to access the Kubernetes cluster, add the Instana Helm repository by using the following command. Replace <agent_key> with the Instana agent key that was provided to you:

    helm repo add instana https://helm.instana.io/artifactory/rel-helm-customer-virtual --username _ --password <agent_key>
    
  2. On the Kubernetes cluster, create a namespace to deploy the BeeInstana by using the following command:

    kubectl create namespace beeinstana
    
  3. Create an image pull secret in the created namespace by using the following command. Replace <agent_key> with the Instana agent key that was provided to you:

    kubectl create secret docker-registry instana-registry --namespace=beeinstana --docker-server=artifact-public.instana.io --docker-username _ --docker-password=<agent_key>
    
  4. If you are using Red Hat OpenShift 4.11 or later, create a SecurityContextConstraints YAML file as follows:

    apiVersion: security.openshift.io/v1
    kind: SecurityContextConstraints
    metadata:
      name: beeinstana-scc
    runAsUser:
      type: MustRunAs
      uid: 1000
    seLinuxContext:
      type: RunAsAny
    fsGroup:
      type: RunAsAny
    allowHostDirVolumePlugin: false
    allowHostNetwork: true
    allowHostPorts: true
    allowPrivilegedContainer: false
    allowHostIPC: true
    allowHostPID: true
    readOnlyRootFilesystem: false
    users:
      - system:serviceaccount:beeinstana:beeinstana-aggregator
      - system:serviceaccount:beeinstana:beeinstana-beeinstana-operator
      - system:serviceaccount:beeinstana:beeinstana-config
    
  5. On a Red Hat OpenShift cluster, apply the SecurityContextConstraints (SCC) before you deploy the BeeInstana Operator. Run the following command:

    kubectl apply -f <yaml_file>
    

    Note: Replace <yaml_file> with the name of the SecurityContextConstraints YAML file that you created in step 4.

  6. Inspect the Kubernetes resources before they are deployed by using the following command:

    helm template instana/beeinstana-operator --name-template=beeinstana --namespace=beeinstana
    
  7. Perform one of the following steps:

    • For a standard Kubernetes cluster or a cluster on Red Hat OpenShift 4.10 or earlier, deploy the BeeInstana Kubernetes Operator into the created namespace by using the following command:

      helm install beeinstana instana/beeinstana-operator --namespace=beeinstana
      
    • For a cluster on Red Hat OpenShift 4.11 or later, deploy the BeeInstana Kubernetes Operator into the created namespace by using the following command:

      helm install beeinstana instana/beeinstana-operator --namespace=beeinstana --set operator.securityContext.seccompProfile.type=RuntimeDefault
      
  8. Retrieve the pods in the created namespace by using the following command:

    kubectl get pod --namespace=beeinstana
    
  9. Validate that one BeeInstana Kubernetes Operator pod is present and running. See the following example:

    beeinstana-beeinstana-operator-569999cbfc-5ftf8   1/1     Running   0          1m
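
The validation in the last two steps can also be scripted. The following is a minimal sketch that reads the output of kubectl get pod and reports success only if every listed pod is Running with all of its containers ready. The helper name all_pods_running is hypothetical, and the sketch assumes the default column layout (NAME, READY, STATUS) of kubectl get pod.

```shell
# Reads "kubectl get pod" output on stdin and succeeds only if every
# listed pod is Running with all containers ready (READY shows n/n).
# all_pods_running is a hypothetical helper name.
all_pods_running() {
  awk '
    $1 == "NAME" { next }                 # skip the header row, if any
    NF >= 3 {
      seen = 1
      split($2, ready, "/")
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit ((seen && !bad) ? 0 : 1) }
  '
}

# Usage against a live cluster (not run here):
#   kubectl get pod --namespace=beeinstana | all_pods_running && echo ok
```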
    

Updating the BeeInstana Kubernetes Operator

To update the BeeInstana Kubernetes Operator after it is installed as described in the Deploying the BeeInstana Kubernetes Operator section, complete the following steps:

  1. Retrieve updates from the Instana Helm repository that was added in step 1 of the Deploying the BeeInstana Kubernetes Operator section by using the following command:

    helm repo update instana
    
  2. Perform one of the following steps:

    • For a standard Kubernetes cluster or a cluster on Red Hat OpenShift 4.10 or earlier, update the BeeInstana Kubernetes Operator by using the following command:

      helm upgrade beeinstana instana/beeinstana-operator --namespace=beeinstana
      
    • For a cluster on Red Hat OpenShift 4.11 or later, update the BeeInstana Kubernetes Operator by using the following command:

      helm upgrade beeinstana instana/beeinstana-operator --namespace=beeinstana --set operator.securityContext.seccompProfile.type=RuntimeDefault
      
  3. Retrieve the pods in the namespace where BeeInstana is deployed by using the following command:

    kubectl get pod --namespace=beeinstana
    
  4. Validate that one BeeInstana Kubernetes Operator pod is present and running. See the following example:

    beeinstana-beeinstana-operator-569999cbfc-5ftf8   1/1     Running   0          1m
    

Note: Updating the BeeInstana Kubernetes Operator does not change the version of BeeInstana that is deployed. The version of BeeInstana that is deployed is configured by setting spec.version as described in the BeeInstana configuration options section.

Deploying BeeInstana on Kubernetes

After you deploy the BeeInstana Kubernetes Operator, you can deploy BeeInstana. Create a file to store the BeeInstana configuration, and keep this file so that you can use it to update BeeInstana in the future. You can start with the following configuration:

apiVersion: beeinstana.instana.com/v1beta1
kind: BeeInstana
metadata:
  name: instance
  namespace: beeinstana
spec:
  version: 1.1.3
  kafkaSettings:
    brokers:
      - cluster.kafka.svc:9092
  config:
    cpu: 200m
    memory: 200Mi
    replicas: 1
  ingestor:
    cpu: 8
    memory: 4Gi
    limitMemory: true
    env: on-prem
    metricsTopic: raw_metrics
    replicas: 1
  aggregator:
    cpu: 4
    memory: 16Gi
    limitMemory: true
    mirrors: 2
    shards: 1
    volumes:
      live:
        size: 2000Gi
        storageClass: standard  # Optional field. You can assign a non-default StorageClass available in the cluster as needed. If you don't add this field, the default StorageClass is used.

Change the configuration as necessary. For more information, see the BeeInstana configuration options section.

To deploy BeeInstana with the configuration file that was created, complete the following steps:

  1. Apply the created configuration file into the same namespace where the BeeInstana Kubernetes Operator is deployed, as shown in the following command:

    kubectl apply -f /path/to/configuration/file.yaml --namespace beeinstana
    
  2. Wait for the BeeInstana Kubernetes Operator to deploy BeeInstana. To validate the installation, get the pod status in the same namespace by running the following command:

    kubectl get pods --namespace beeinstana
    

    Validate that pods for all of the components are running. See the following example:

    NAME                                              READY   STATUS    RESTARTS   AGE
    aggregator-0-0                                    1/1     Running   0          2m
    aggregator-0-1                                    1/1     Running   0          75s
    beeinstana-beeinstana-operator-79c8fd74c8-vchh6   1/1     Running   0          169m
    config-546999994-rfvnc                            1/1     Running   0          31m
    ingestor-9c569cbf9-ncrzs                          1/1     Running   0          39s
    
  3. Check the deployment by running the following command:

    kubectl get beeinstana instance --namespace beeinstana -o yaml
    

    If the deployment is successful, the status.reconciledAt property is a timestamp within the past minute, the status.version property matches the configured version, and the status.canBeReconciled property is true. See the following example:

    ...
    status:
      ...
      canBeReconciled: true
      reconciledAt: "2023-03-30T22:47:08Z"
      version: 1.1.3
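
The "within the past minute" check on status.reconciledAt can be sketched in shell as follows. This is an illustration under stated assumptions: the helper name reconciled_recently is hypothetical, and the timestamp parsing relies on the -d option of GNU date, which other date implementations do not provide in the same form.

```shell
# Checks whether a reconciledAt timestamp lies within the past minute.
# Assumes GNU date (the -d option); other date implementations differ.
# reconciled_recently is a hypothetical helper name.
reconciled_recently() {
  reconciled_at=$1                      # e.g. "2023-03-30T22:47:08Z"
  now_epoch=${2:-$(date -u +%s)}        # optional override, for testing
  then_epoch=$(date -u -d "$reconciled_at" +%s) || return 2
  age=$((now_epoch - then_epoch))
  [ "$age" -ge 0 ] && [ "$age" -le 60 ]
}

# Usage against a live cluster (not run here):
#   ts=$(kubectl get beeinstana instance --namespace beeinstana \
#          -o jsonpath='{.status.reconciledAt}')
#   reconciled_recently "$ts" && echo "recently reconciled"
```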
    

Next, configure Instana to use the deployed BeeInstana. For more information, see the Run and configure BeeInstana via Operator topic.

Note: These same steps are used to modify an existing deployment of BeeInstana.

BeeInstana configuration options

You can customize the values of the following keys in the configuration file that is used to deploy BeeInstana on Kubernetes.

Note: Many of the following keys are related to sizing. To determine an appropriate value for a key, see the Sizing BeeInstana on Kubernetes section.

Key                                                   Type             Description
spec.version                                          string           Determines the version of BeeInstana to be deployed.
spec.fsGroup                                          integer          Determines the file system group set for persistent volumes. On Red Hat OpenShift, the value must be set as described in the Determining the file system group ID on Red Hat OpenShift section. Otherwise, leave the value unset.
spec.seccompProfile.type                              string           Determines the seccomp profile type that is used. On Red Hat OpenShift 4.11 and later versions, the value must be set to RuntimeDefault. On Red Hat OpenShift 4.10 and earlier versions, leave the value unset.
spec.kafkaSettings.brokers                            list of strings  A list of the Kafka broker network addresses.
spec.kafkaSettings.securityProtocol                   string           Determines the security protocol that is used to connect to Kafka. If this value is not provided, then no security protocol is used.
spec.kafkaSettings.saslMechanism                      string           Determines the SASL mechanism that is used when a value for spec.kafkaSettings.securityProtocol is provided.
spec.kafkaSettings.saslUsername                       string           Determines the username that is used to connect to Kafka when a value for spec.kafkaSettings.securityProtocol is provided.
spec.kafkaSettings.saslPasswordCredential.secretName  string           The name of a Kubernetes secret that contains a password data field that is used to connect to Kafka when a value for spec.kafkaSettings.securityProtocol is provided.
spec.aggregator.cpu                                   string           Determines the CPU that is requested by the aggregator pods.
spec.aggregator.memory                                string           Determines the memory that is requested by the aggregator pods.
spec.aggregator.limitMemory                           boolean          Limits the memory that an aggregator pod is allowed to use. Set this value to true if the aggregator pod is running on a shared node.
spec.aggregator.mirrors                               integer          Determines the amount of data replication and availability. When this value is set to 2 and one aggregator pod goes down, BeeInstana can tolerate it without impacting availability.
spec.aggregator.shards                                integer          Determines the number of shards in the cluster and is used to horizontally scale the aggregator. The value of this key cannot be decreased.
spec.aggregator.volumes.live.size                     string           Determines the initial size of the persistent volume claim when the aggregator pods are created. To increase the size of the persistent volume claim after creation, modify the claim directly.
spec.aggregator.volumes.live.storageClass             string           Determines the storage class of the persistent volume claim.
spec.ingestor.brokerList                              string           A comma-separated list of the Kafka broker network addresses. This key is deprecated and is replaced with spec.kafkaSettings.brokers.
spec.ingestor.cpu                                     string           Determines the CPU that is requested by the ingestor pods.
spec.ingestor.memory                                  string           Determines the memory that is requested by the ingestor pods.
spec.ingestor.limitMemory                             boolean          Limits the memory that an ingestor pod is allowed to use. Set this value to true if the ingestor pod is running on a shared node.
spec.ingestor.replicas                                integer          Used to horizontally scale the ingestor. The value of this key can be increased or decreased as needed.
spec.ingestor.env                                     string           Used as part of the Kafka consumer name to ensure uniqueness.
spec.ingestor.metricsTopic                            string           Determines the Kafka topic name that the ingestor pods consume from.
spec.config.cpu                                       string           Determines the CPU that is requested by the configuration service pods.
spec.config.memory                                    string           Determines the memory that is requested by the configuration service pods.
spec.config.replicas                                  integer          Controls the availability of the configuration service pods.
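
The description of spec.aggregator.volumes.live.size says to modify the persistent volume claim directly when it needs to grow. One possible way to do that is with kubectl patch, sketched below. The helper name pvc_resize_patch and the size 3000Gi are hypothetical, <pvc_name> is a placeholder for a claim name that you look up yourself, and the resize succeeds only if the storage class of the claim allows volume expansion.

```shell
# Builds a patch that raises the requested storage of a persistent
# volume claim. pvc_resize_patch is a hypothetical helper name.
pvc_resize_patch() {
  new_size=$1
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$new_size"
}

# Usage against a live cluster (not run here); <pvc_name> is a placeholder
# for one of the aggregator claims listed by "kubectl get pvc":
#   kubectl get pvc --namespace beeinstana
#   kubectl patch pvc <pvc_name> --namespace beeinstana \
#     --patch "$(pvc_resize_patch 3000Gi)"
pvc_resize_patch 3000Gi
```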

Determining the file system group ID on Red Hat OpenShift

Red Hat OpenShift requires that file system groups are within a range of values specific to the namespace. To find this range, run the following command on the cluster where the BeeInstana Kubernetes Operator is deployed:

kubectl get namespace beeinstana -o yaml

An output similar to the following example is shown for the command:

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    ...
    openshift.io/sa.scc.supplemental-groups: 1000000000/10000
  name: beeinstana

The openshift.io/sa.scc.supplemental-groups annotation contains the range of allowed IDs. The range 1000000000/10000 indicates 10,000 values starting with ID 1000000000, that is, the IDs from 1000000000 to 1000009999. In this example, the value 1000000000 can be used as the file system group ID in spec.fsGroup.
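
The range arithmetic can be sketched in shell as follows. The helper name fsgroup_range is hypothetical; it only splits the annotation value at the slash and computes the last allowed ID.

```shell
# Expands a "<start>/<size>" range annotation into its first and last
# allowed IDs. fsgroup_range is a hypothetical helper name.
fsgroup_range() {
  range=$1
  start=${range%%/*}    # text before the slash
  size=${range##*/}     # text after the slash
  echo "$start $((start + size - 1))"
}

# Usage against a live cluster (not run here):
#   fsgroup_range "$(kubectl get namespace beeinstana \
#     -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}')"
fsgroup_range "1000000000/10000"
```

For the example annotation, the sketch prints 1000000000 1000009999, the first and last allowed IDs.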