Populating deployment files

To deploy the Z APM Connect DG, populate the deployment files. On Kubernetes, you can complete this task either by using Helm to configure a values.yaml file (if Helm is installed) or by manually editing the yaml manifest files and applying them to the cluster. On OpenShift, you must use Helm to configure the values.yaml file. Helm is an industry-standard tool that simplifies the configuration, deployment, and maintenance of the Distributed Gateway. This page also includes recommendations for pod limits and replicas.

Using Helm to configure a values.yaml file

Before you begin

Make sure that Helm V3.1 or later is installed. For more information about installing Helm, see Installing Helm.

About this task

Helm uses a single values file with a Helm chart to define the wanted configuration. The values.yaml file must include the image repository to use, APM details, the names of the Secrets created in Deploying Z APM Connect DG in a cluster, and other settings. A sample values.yaml file is included in the installation package.

Procedure

For production, the sample values.yaml file is located at ./production/helm-deployment/values.yaml. Create a backup copy before making edits so that you can refer back to the default values.

To deploy the Distributed Gateway, configure the parameters according to your needs. The following sections show the parameters that are typically configured.
Tip: General tips for editing values.yaml
  • Spacing is significant in yaml files. Each indentation level must be exactly two spaces, as shown in the short example after this tip.
  • Use an editor that lets you easily see whitespace and search. A code editor with a yaml extension is recommended.
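
For example, the following fragment (the values are placeholders) shows the two-space indentation that the file expects at each nesting level:

network:
  kafka:
    # Each nesting level is indented by exactly two spaces;
    # tabs are not valid YAML indentation.
    nodePort: 30090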

– Images

Navigate to the following Values Section:
## clusterType is for enabling extra APIs. Valid values = {kubernetes, openshift}
clusterType: <kubernetes or openshift>

## namespace is the namespace where all components will be deployed, it should be a fresh namespace for each Distributed Gateway deployment and must be created first
namespace: <ibm-zapm>

## logLevel is the log level that all Distributed Gateway components will run at.
## NOTE: This must be all uppercase
logLevel: "INFO"

# images Section contains all information regarding the images needed for Distributed Gateway deployment
images: 
Refer to the following table of parameter descriptions to provide the configuration details relevant to your deployment.
Note: Any variable in the values.yaml file that is not present in this table is optional for the OpenTelemetry (OTel) deployment.

Table 1. Configuration parameters for the images section of the values.yaml file

| Parameter | Description | Options |
| --- | --- | --- |
| clusterType | Tells Helm whether to enable cluster-specific APIs. | openshift (routes must be created for each endpoint); kubernetes (routes do not need to be created) |
| namespace | The namespace where all components will be deployed. Use a fresh namespace for each Distributed Gateway deployment; the namespace must be created first. | <user input> |
| logLevel | The log level that all Distributed Gateway components will run at. Must be all uppercase. | OFF, FATAL, ERROR, WARN, INFO, DEBUG, TRACE, ALL |
| imagePullSecrets | Name of the OpenShift/Kubernetes secret used for authenticating with the external private registry that images will be pulled from. | regcred |
| zapmVersion | The version of ZAPM to be deployed; also the default tag for all images. | 6.2.0-0 |
| pullPolicy | Tells OpenShift/Kubernetes when to pull an image and when to use a cached image. | Always (always pull the image); IfNotPresent (use cached images to save time); Never (never pull the image) |
| transactionProcessorImageName | The name of the Transaction Processor image stored in the image repository. | <image registry>/zapm-transaction-processor |
| transactionProcessorImageTag | The tag assigned to the Transaction Processor image. Defaults to zapmVersion; change only if you use a tag other than zapmVersion. | null (preferred); <user input> |
| otelExporterImageName | The name of the ZAPM Exporter image stored in the image repository. | <image registry>/zapm-exporter |
| otelExporterImageTag | The tag assigned to the ZAPM Exporter image. Defaults to zapmVersion; change only if you use a tag other than zapmVersion. | null (preferred); <user input> |
| testImageName | The name of the ZAPM Test image stored in the image repository. | <image registry>/zapm-test |
| testImageTag | The tag assigned to the ZAPM Test image. Defaults to zapmVersion; change only if you use a tag other than zapmVersion. | null (preferred); <user input> |
| kafkaImageName | The name of the Kafka image stored in the image repository. | <image registry>/kafka |
| kafkaImageTag | The tag assigned to the Kafka image. Defaults to KAFKA_VERSION; do not change unless directed by IBM support. | KAFKA_VERSION |
| redisImageName | The name of the Redis image stored in the image repository. | <image registry>/redis |
| redisImageTag | The tag assigned to the Redis image. Defaults to REDIS_VERSION; do not change unless directed by IBM support. | REDIS_VERSION |
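
For illustration only, a populated section might look like the following sketch. The registry host (registry.example.com) is a hypothetical placeholder, and the nesting shown here is an assumption; keep the structure exactly as it appears in the sample values.yaml shipped in the installation package.

clusterType: openshift
namespace: ibm-zapm
logLevel: "INFO"

images:
  # The registry host and nesting below are illustrative assumptions.
  imagePullSecrets: regcred
  zapmVersion: "6.2.0-0"
  pullPolicy: IfNotPresent
  transactionProcessorImageName: registry.example.com/zapm-transaction-processor
  transactionProcessorImageTag: null    # null keeps the zapmVersion default
  otelExporterImageName: registry.example.com/zapm-exporter
  otelExporterImageTag: null
  kafkaImageName: registry.example.com/kafka
  kafkaImageTag: KAFKA_VERSION          # do not change unless directed by IBM support
  redisImageName: registry.example.com/redis
  redisImageTag: REDIS_VERSION          # do not change unless directed by IBM support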

– Network

Navigate to the following Values Section:

network:
Table 2. Configuration parameters for the network section of the values.yaml file

| Parameter | Description | Options |
| --- | --- | --- |
| zapmConnectBase.nodePort | Defines the port for ZAPM base container to ZAPM Distributed Gateway communication. Defaults to 30455. | 30455 |
| zapmConnectBase.advertisedHostname | (OpenShift only) Defines the advertised hostname used to create the OpenShift Ingress route (ingress.apps.<advertisedHostname>). | <user input> |
| kafka.nodePort | Defines the external port used for Kafka communication between ZCEE, CTG, or CDP and the Distributed Gateway. Defaults to 30090. | 30090 |
| kafka.advertisedHostName | Defines the fully qualified domain name that Kafka will listen on. | For Kubernetes, use the fully qualified domain name of the node (usually the master node) that the Kafka pod is deployed on. For OpenShift, set the value to the desired route, such as kafka.apps.<Ingress_Domain>. |
Note: If you use a proxy such as HAProxy or Nginx to manage connections, the advertisedHostname fields must match the relevant hostnames of the proxy routes, and the proxy must listen externally on the corresponding ports. Configure the proxy to contact Kafka on port 30090.
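
For illustration, a sketch of a populated network section. Both hostnames are hypothetical placeholders, and the nesting is an assumption to verify against the sample values.yaml:

network:
  zapmConnectBase:
    nodePort: 30455
    advertisedHostname: zapm.apps.example.com    # OpenShift only
  kafka:
    nodePort: 30090
    # On OpenShift, a route such as kafka.apps.<Ingress_Domain>;
    # on Kubernetes, the FQDN of the node where the Kafka pod runs.
    advertisedHostName: kafka.apps.example.com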

– Security

Navigate to the following Values Section:

security:
Table 3. Optional configuration parameters for the security section of the values.yaml file

| Parameter | Description | Options |
| --- | --- | --- |
| kafka.externalTlsEnabled | Enables TLS two-way authentication for any Kafka clients that connect by using the advertised hostname. | true or false |
| kafka.internalTlsEnabled | Enables TLS two-way authentication for the DG pods running inside the cluster. | true or false |
| kafka.secretName | The name of the secret for Kafka internal or external TLS. | <user input>, for example, kafka-auth |
| zapmConnectBase.secretName | The name of the secret containing the truststore and keystore for authentication between the host and the Distributed Gateway. | <user input>, for example, ingress-auth |
| otel.secretName | The name of the secret that contains the public certificate for the OTel Collector, if TLS is required between the Distributed Gateway and the OTel Collector. | <user input>, for example, otel-auth |
| otel.certName | The name of the public certificate in the OTel secret. | <user input>, for example, server.crt |
| redisSecret | Name of the secret containing the Redis database password. | <user input> |
Important: The redisSecret field must match the Redis database secret name created in the previous step, Pre-deployment configuration. For example, if you followed the example in that step, the field should be redis-auth.
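
For illustration, a sketch of a populated security section that reuses the example secret names from Table 3; the nesting is an assumption to verify against the sample values.yaml:

security:
  kafka:
    externalTlsEnabled: true
    internalTlsEnabled: false
    secretName: kafka-auth       # example name from Table 3
  zapmConnectBase:
    secretName: ingress-auth     # example name from Table 3
  otel:
    secretName: otel-auth        # example name from Table 3
    certName: server.crt
  redisSecret: redis-auth        # must match the secret created in Pre-deployment configuration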

– Transaction Processor Values

Navigate to the following Values Section:

transactionProcessor:
Table 4. Configuration parameters for the transactionProcessor section of the values.yaml file

| Parameter | Description | Options |
| --- | --- | --- |
| replicationScaling.connection-manager | Sets the number of Connection Manager Pods. | <user input> (int) |
| replicationScaling.event-partitioner | Sets the number of Event Partitioner Pods. | <user input> (int) |
| replicationScaling.span-factory | Sets the number of Span Factory Pods. | <user input> (int) |
| replicationScaling.transaction-factory | Sets the number of Transaction Factory Pods. | <user input> (int) |
| replicationScaling.span-collector | Sets the number of Span Collector Pods. | <user input> (int) |
| replicationScaling.cdp-factory | Sets the number of CDP Factory Pods. | 0 |
* Refer to Deploying Z APM Connect DG in a cluster for information about where to find this value.
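
For illustration, a sketch that applies the minimum-replica recommendations for workloads of 15k transactions per second or less (see Table 6); the nesting is an assumption to verify against the sample values.yaml:

transactionProcessor:
  replicationScaling:
    connection-manager: 3
    event-partitioner: 3
    span-factory: 5
    transaction-factory: 5
    span-collector: 5
    cdp-factory: 0    # see Deploying Z APM Connect DG in a cluster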

– OTel Exporter Values

Navigate to the following Values Section:
otelExporter:
Table 5. Configuration parameters for the otelExporter section of the values.yaml file

| Parameter | Description | Options |
| --- | --- | --- |
| replicas | Sets the number of OTel Exporter Pods. | <user input> (int) |
| grpcEndpoint | The gRPC acceptor endpoint set in your OTel Collector configuration file. Must be in the format http(s)://<hostname>:<port>, for example, http://localhost:4317. | <user input> |
Note: If you are deploying OTel, set the instanaExporter.replicas and ttg.controllers.replicas fields to 0.
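
Putting that note into practice, a sketch of an OTel-only configuration; the endpoint is an example value, and the field layout is an assumption to verify against the sample values.yaml:

otelExporter:
  replicas: 3
  grpcEndpoint: http://localhost:4317   # the gRPC acceptor of your OTel Collector
instanaExporter:
  replicas: 0      # disabled for an OTel deployment
ttg:
  controllers:
    replicas: 0    # disabled for an OTel deployment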

Recommendations for Pod Limits and Replicas

It is recommended that you set pod limits and replicas for the deployed Z APM Connect components. The recommended values for replicas and resource limits are based on performance test results. Refer to the table that matches your anticipated workload.

  • If your workload is no more than 15k transactions per second, follow the recommended values in Table 6.
  • If your workload is between 15k and 30k transactions per second, follow the recommended values in Table 7.
  • Workloads that exceed 30k transactions per second might require an increase in replicas and resource limits to accommodate the higher transaction rate.
Note: Performance tests for Z APM Connect Distributed Gateway were conducted on an OpenShift cluster with 8 worker nodes. Each worker node was an x86 virtual machine with 16 cores and 25 GB of RAM. The tests used simple traces that include a parent transaction and 2 to 3 child transactions.

The recommended values may require adjustments depending on the computing environment, including factors such as hardware specifications, operating system, and application workloads. For example, more complex transaction data will result in larger event sizes and may require adjustments.

Table 6. Suggested pod limits and replicas for each deployed component with ≤ 15k transactions/second

| Deployment | Request CPU (Min) | Request RAM (Min) | Limit CPU (Max) | Limit RAM (Max) | Minimum Replicas |
| --- | --- | --- | --- | --- | --- |
| connection-manager | 200m | 300Mi | 200m | 300Mi | 3 |
| event-partitioner | 700m | 700Mi | 700m | 700Mi | 3 |
| span-factory | 1000m | 2Gi | 1000m | 2Gi | 5 |
| span-collector | 500m | 300Mi | 500m | 300Mi | 5 |
| transaction-factory | 500m | 800Mi | 500m | 800Mi | 5 |
| kafka | 1500m | 8Gi | 1500m | 8Gi | 1 |
| redis | 4000m | 7Gi | 4000m | 7Gi | 1 |
| otel-exporter | 600m | 800Mi | 600m | 1300Mi | 3 |
Table 7. Suggested pod limits and replicas for each deployed component with 15k-30k transactions/second

| Deployment | Request CPU (Min) | Request RAM (Min) | Limit CPU (Max) | Limit RAM (Max) | Minimum Replicas |
| --- | --- | --- | --- | --- | --- |
| connection-manager | 200m | 300Mi | 200m | 300Mi | 6 |
| event-partitioner | 700m | 1Gi | 700m | 1Gi | 7 |
| span-factory | 1000m | 2Gi | 1000m | 2Gi | 15 |
| span-collector | 500m | 1Gi | 500m | 1Gi | 5 |
| transaction-factory | 500m | 1Gi | 500m | 1Gi | 5 |
| kafka | 8000m | 23Gi | 8000m | 23Gi | 1 |
| redis | 4000m | 7Gi | 4000m | 7Gi | 1 |
| otel-exporter | 600m | 2Gi | 1600m | 2Gi | 3 |

Kubernetes requests specify the minimum resources guaranteed to a pod, while Kubernetes limits define the maximum resources that a pod can consume. CPU resources are measured in millicores, where 1000m is equivalent to 1 CPU core. Memory is measured in bytes and expressed here as mebibyte (Mi) or gibibyte (Gi) values.
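
For illustration, the span-factory row of Table 6 expressed as a standard Kubernetes resource specification. The keys under which the Helm chart exposes these values depend on the chart schema, so treat this as a sketch of the Kubernetes semantics rather than of the values.yaml layout:

resources:
  requests:
    cpu: 1000m     # minimum guaranteed: 1 CPU core
    memory: 2Gi    # minimum guaranteed: 2 gibibytes
  limits:
    cpu: 1000m     # maximum allowed CPU
    memory: 2Gi    # maximum allowed memory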

Attention: For scaling purposes, you need to have at least one Z APM Connect Base Courier for each Connection Manager replica. For example, if you scale connection-manager to 4 replicas, you would need 4 Z APM Connect Base Couriers in order to fully utilize all 4 connection-manager replicas.
Attention: If you enable internal Kafka TLS two-way authentication for your Distributed Gateway deployment, performance overhead increases, which results in higher resource requirements.
Tip: To update replicas, modify the "replicas" parameter in the values.yaml file for Helm.