Creating a Kafka data store on Linux on Power (ppc64le)
Install the Kafka operator and set up the data store.
- Before you begin
- Kafka operator versions and image tags for deployment
- Installing Kafka online
- Installing Kafka offline
- Deploying and verifying Kafka (online and offline)
- Post-installation tasks for Kafka (online and offline)
Before you begin
Make sure that you prepared your online or offline host to pull images from the external repository. Also, ensure that the correct Helm repo is added.
For more information, see Preparing for data store installation.
Installing Kafka online
Complete these steps to install the Kafka data store.
-
Create the instana-kafka namespace.
kubectl create namespace instana-kafka
-
Create image pull secrets for the Kafka operator and image, and update the <download_key> value with your download key.
kubectl create secret docker-registry instana-registry \
  --namespace=instana-kafka \
  --docker-username=_ \
  --docker-password=<download_key> \
  --docker-server=artifact-public.instana.io
-
Install the Strimzi operator.
helm install strimzi-kafka-operator instana/strimzi-kafka-operator --version 0.38.0 -n instana-kafka --set image.registry=artifact-public.instana.io --set image.repository=self-hosted-images/3rd-party/strimzi --set image.name=operator --set image.tag=0.38.0_v0.7.0 --set image.imagePullSecrets[0].name="instana-registry" --set kafka.image.registry=artifact-public.instana.io --set kafka.image.repository=self-hosted-images/3rd-party/strimzi --set kafka.image.name=kafka --set kafka.image.tag=3.6.0_v0.7.0
-
Create a YAML file, for example kafka.yaml, with the Kafka configuration.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: instana
  labels:
    strimzi.io/cluster: instana
spec:
  kafka:
    version: 3.6.0
    replicas: 3
    listeners:
      - name: scram
        port: 9092
        type: internal
        tls: false
        authentication:
          type: scram-sha-512
        configuration:
          useServiceDnsDomain: true
    authorization:
      type: simple
      superUsers:
        - strimzi-kafka-user
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 200Gi
          deleteClaim: true
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 50Gi
      deleteClaim: true
  entityOperator:
    template:
      pod:
        tmpDirSizeLimit: 10Mi
    userOperator:
      image: artifact-public.instana.io/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
-
Complete the steps in Deploying and verifying Kafka (online and offline).
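Before you deploy Kafka, it can help to confirm that the Strimzi operator is running. The following is a minimal sketch, not part of the official procedure. It assumes that the Helm chart creates a deployment named strimzi-cluster-operator (the name that appears in the verification output later in this topic). The function name, the NAMESPACE variable, and the 120-second default timeout are illustrative.

```shell
#!/bin/sh
# Sketch: wait for the Strimzi operator deployment to finish rolling out.
# Assumption: the deployment is named strimzi-cluster-operator.
NAMESPACE="${NAMESPACE:-instana-kafka}"

wait_for_operator() {
  # $1: optional timeout (default 120s).
  # Blocks until the rollout completes or the timeout expires.
  kubectl rollout status deployment/strimzi-cluster-operator \
    -n "$NAMESPACE" --timeout="${1:-120s}"
}
```

Call wait_for_operator after the helm install command returns. On slower clusters, pass a longer timeout, for example wait_for_operator 300s.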
Installing Kafka offline
If you didn't yet pull the Kafka images from the external registry when you prepared for installation, you can pull them now. Run the following commands on your bastion host. Then, copy the images to your Instana host that is in your air-gapped environment.
docker pull artifact-public.instana.io/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
docker pull artifact-public.instana.io/self-hosted-images/3rd-party/strimzi/kafka:3.6.0_v0.7.0
Complete the following steps on your Instana host.
-
Retag the images to your internal image registry.
docker tag artifact-public.instana.io/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0 <internal-image-registry>/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
docker tag artifact-public.instana.io/self-hosted-images/3rd-party/strimzi/kafka:3.6.0_v0.7.0 <internal-image-registry>/self-hosted-images/3rd-party/strimzi/kafka:3.6.0_v0.7.0
-
Push the images to your internal image registry on your bastion host.
docker push <internal-image-registry>/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
docker push <internal-image-registry>/self-hosted-images/3rd-party/strimzi/kafka:3.6.0_v0.7.0
-
Create the instana-kafka namespace.
kubectl create namespace instana-kafka
-
Optional: Create an image pull secret if your internal image registry needs authentication.
kubectl create secret docker-registry <secret_name> --namespace instana-kafka \
  --docker-username=<registry_username> \
  --docker-password=<registry_password> \
  --docker-server=<internal-image-registry>:<internal-image-registry-port> \
  --docker-email=<registry_email>
-
Install the Strimzi Operator.
helm install strimzi-kafka-operator strimzi-kafka-operator-0.38.0.tgz --version 0.38.0 -n instana-kafka --set image.registry=<internal-image-registry> --set image.repository=self-hosted-images/3rd-party/strimzi --set image.name=operator --set image.tag=0.38.0_v0.7.0 --set kafka.image.registry=<internal-image-registry> --set kafka.image.repository=self-hosted-images/3rd-party/strimzi --set kafka.image.name=kafka --set kafka.image.tag=3.6.0_v0.7.0
-
Create a YAML file, for example kafka.yaml, with the Kafka configuration.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: instana
  labels:
    strimzi.io/cluster: instana
spec:
  kafka:
    version: 3.6.0
    replicas: 3
    listeners:
      - name: scram
        port: 9092
        type: internal
        tls: false
        authentication:
          type: scram-sha-512
        configuration:
          useServiceDnsDomain: true
    authorization:
      type: simple
      superUsers:
        - strimzi-kafka-user
    storage:
      type: jbod
      volumes:
        - id: 0
          type: persistent-claim
          size: 200Gi
          deleteClaim: true
  zookeeper:
    replicas: 3
    storage:
      type: persistent-claim
      size: 50Gi
      deleteClaim: true
  entityOperator:
    template:
      pod:
        tmpDirSizeLimit: 10Mi
    userOperator:
      image: <internal-image-registry>/self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
-
Complete the steps in Deploying and verifying Kafka (online and offline).
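The retag and push steps in the offline procedure can be scripted so that both images are handled the same way. The following is a minimal sketch that assumes the images are already present in the local Docker image store. The function name mirror_images and the registry host in the usage note are hypothetical; substitute your own internal registry.

```shell
#!/bin/sh
# Sketch: retag the Strimzi images and push them to an internal registry.
# Assumption: both images were already pulled or loaded on this host.
SOURCE_REGISTRY="artifact-public.instana.io"
IMAGES="self-hosted-images/3rd-party/strimzi/operator:0.38.0_v0.7.0
self-hosted-images/3rd-party/strimzi/kafka:3.6.0_v0.7.0"

mirror_images() {
  # $1: internal registry host, including the port if one is needed.
  for image in $IMAGES; do
    # Keep the repository path and tag; only the registry prefix changes.
    docker tag "$SOURCE_REGISTRY/$image" "$1/$image" || return 1
    docker push "$1/$image" || return 1
  done
}
```

For example, mirror_images registry.example.internal:5000 tags and pushes both images. The function stops at the first failing docker command so that a partial push is easy to notice.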
Deploying and verifying Kafka (online and offline)
Complete these steps to deploy the Kafka instance and create the data store.
-
Deploy Kafka.
kubectl apply -f kafka.yaml -n instana-kafka
kubectl wait kafka/instana --for=condition=Ready --timeout=300s -n instana-kafka
-
Create a Kafka user who can authenticate with a Kafka cluster by using SASL/SCRAM.
-
Create a YAML file, for example strimzi-kafka-user.yaml, with the Kafka user specification.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: strimzi-kafka-user
  labels:
    strimzi.io/cluster: instana
spec:
  authentication:
    type: scram-sha-512
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: '*'
          patternType: literal
        operation: All
        host: "*"
      - resource:
          type: group
          name: '*'
          patternType: literal
        operation: All
        host: "*"
-
Create the strimzi-kafka-user user.
kubectl apply -f strimzi-kafka-user.yaml -n instana-kafka
kubectl wait kafkauser/strimzi-kafka-user --for=condition=Ready --timeout=60s -n instana-kafka
-
Retrieve the password of the strimzi-kafka-user user.
kubectl get secret strimzi-kafka-user -n instana-kafka --template='{{index .data.password | base64decode}}' && echo
-
Store the retrieved password in the config.yaml file. Replace the <RETRIEVED_FROM_SECRET> value with the password that you retrieved in the previous step.
datastoreConfigs:
  ...
  kafkaConfig:
    adminUser: strimzi-kafka-user
    adminPassword: <RETRIEVED_FROM_SECRET>
    consumerUser: strimzi-kafka-user
    consumerPassword: <RETRIEVED_FROM_SECRET>
    producerUser: strimzi-kafka-user
    producerPassword: <RETRIEVED_FROM_SECRET>
  ...
-
Verify the Instana Kafka installation.
kubectl get all -n instana-kafka
The output might look like the following example.
NAME                                           READY   STATUS    RESTARTS   AGE
pod/instana-entity-operator-8564975588-mrz2f   3/3     Running   0          15m
pod/instana-kafka-0                            1/1     Running   0          16m
pod/instana-kafka-1                            1/1     Running   0          16m
pod/instana-kafka-2                            1/1     Running   0          16m
pod/instana-zookeeper-0                        1/1     Running   0          17m
pod/instana-zookeeper-1                        1/1     Running   0          17m
pod/instana-zookeeper-2                        1/1     Running   0          17m
pod/strimzi-cluster-operator-cf49c75d8-qm8jq   1/1     Running   0          34m

NAME                               TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/instana-kafka-bootstrap    ClusterIP   192.168.1.239   <none>        9091/TCP,9092/TCP            16m
service/instana-kafka-brokers      ClusterIP   None            <none>        9090/TCP,9091/TCP,9092/TCP   16m
service/instana-zookeeper-client   ClusterIP   192.168.1.122   <none>        2181/TCP                     17m
service/instana-zookeeper-nodes    ClusterIP   None            <none>        2181/TCP,2888/TCP,3888/TCP   17m

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/instana-entity-operator    1/1     1            1           15m
deployment.apps/strimzi-cluster-operator   1/1     1            1           34m

NAME                                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/instana-entity-operator-8564975588   1         1         1       15m
replicaset.apps/strimzi-cluster-operator-cf49c75d8   1         1         1       34m
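The password step in this procedure can be scripted so that the kafkaConfig entries are generated instead of typed by hand. The following is a minimal sketch that assumes the same user serves as the admin, consumer, and producer user, as in the config.yaml example. The function names are illustrative, and the sketch reads the secret with a jsonpath expression and base64 -d rather than the Go template that the procedure uses.

```shell
#!/bin/sh
# Sketch: print the kafkaConfig entries for config.yaml.
# Assumption: one Kafka user is used for the admin, consumer, and producer roles.
render_kafka_config() {
  # $1: user name, $2: password
  for role in admin consumer producer; do
    printf '    %sUser: %s\n' "$role" "$1"
    printf '    %sPassword: %s\n' "$role" "$2"
  done
}

fetch_kafka_password() {
  # $1: KafkaUser name. Reads the generated password from the secret
  # of the same name and decodes it.
  kubectl get secret "$1" -n instana-kafka \
    -o jsonpath='{.data.password}' | base64 -d
}
```

For example, render_kafka_config strimzi-kafka-user "$(fetch_kafka_password strimzi-kafka-user)" prints six indented lines that you can paste under kafkaConfig in the config.yaml file.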
Post-installation tasks for Kafka (online and offline)
After you successfully install the Kafka operator, complete the following tasks if required.
(Optional) Balancing the cluster by using Cruise Control
Cruise Control balances workloads in a Kafka cluster. For more information, see Cluster balancing with Cruise Control.
To use Cruise Control in your Kafka deployment, complete the following steps:
-
Install a metrics reporter in your cluster’s brokers. For more information, see Monitoring Kafka.
-
Create a Cruise Control server deployment.
-
Open the Kafka custom resource for editing.
kubectl edit kafkas.kafka.strimzi.io <your-kafka-cluster-name> -n instana-kafka
-
Add the following configuration to your Kafka custom resource.
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: instana
spec:
  # ...
  cruiseControl: {}
-
Apply the configuration.
kubectl apply -f kafka.yaml -n instana-kafka
When you apply the configuration, Kafka uses the default Cruise Control settings to deploy the Cruise Control server. If you add Cruise Control to an existing cluster rather than a new one, all Kafka broker pods roll so that the metrics reporter is installed.
-
Verify that Cruise Control is deployed in your cluster.
kubectl get pods -n instana-kafka
The command output shows the Cruise Control pod, as in the following example:
NAME                                        READY   STATUS    RESTARTS   AGE
instana-cruise-control-6fcbd54b48-ndzrh     2/2     Running   0          8m48s
instana-entity-operator-78d7995695-8tt2v    3/3     Running   0          12m
instana-kafka-0                             2/2     Running   0          9m12s
instana-kafka-1                             2/2     Running   0          10m
instana-kafka-2                             2/2     Running   0          9m46s
instana-zookeeper-0                         1/1     Running   0          13m
instana-zookeeper-1                         1/1     Running   0          13m
instana-zookeeper-2                         1/1     Running   0          13m
strimzi-cluster-operator-54565f8c56-rmdb8   1/1     Running   0          14m
After Cruise Control is deployed, it starts to pull metrics from the topics that the metrics reporters create in each broker.
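If you prefer not to open an editor, the same cruiseControl setting can be added with a patch. The following is a minimal sketch, not part of the official procedure. It assumes that the Kafka resource is named instana, as in the kafka.yaml example; the function name is illustrative.

```shell
#!/bin/sh
# Sketch: enable Cruise Control with its default settings, without an editor.
# Assumption: the Kafka custom resource is named instana unless $1 says otherwise.
enable_cruise_control() {
  # A JSON merge patch adds an empty cruiseControl section to the spec
  # and leaves every other field of the resource unchanged.
  kubectl patch kafka "${1:-instana}" -n instana-kafka \
    --type merge -p '{"spec":{"cruiseControl":{}}}'
}
```

The patch changes only the live resource. If you keep kafka.yaml as the source of truth, add the same cruiseControl entry to that file so that a later kubectl apply does not remove it.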
Getting an optimization proposal
After Cruise Control is deployed, you can generate an optimization proposal. To do this, apply a KafkaRebalance custom resource. The definition for this resource is installed when you install Strimzi. A basic KafkaRebalance resource is shown in the following example:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaRebalance
metadata:
  name: my-rebalance
  labels:
    strimzi.io/cluster: instana
# no goals specified, using the default goals from the Cruise Control configuration
spec: {}
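You can deploy the KafkaRebalance custom resource like any other resource. For example, if you saved the definition in a file named kafka-rebalance.yaml, run the following command: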
kubectl apply -f kafka-rebalance.yaml -n instana-kafka
When you apply the KafkaRebalance resource to your Kafka cluster, the Cluster operator issues the relevant requests to Cruise Control to fetch the optimization proposal. The Cluster operator stores the status of your KafkaRebalance and the progress updates from Cruise Control. You can check the updates by using the following command:
kubectl describe kafkarebalance my-rebalance -n instana-kafka
The output might look like the following example:
Name:         my-rebalance
Namespace:    instana-kafka
Labels:       strimzi.io/cluster=instana
Annotations:  API Version:  kafka.strimzi.io/v1beta2
Kind:         KafkaRebalance
Metadata:
  # ...
Status:
  Conditions:
    Last Transition Time:  2023-04-04T11:13:35.749066535Z
    Status:                ProposalReady
    Type:                  State
  Observed Generation:     1
  Optimization Result:
    Data To Move MB:  0
    Excluded Brokers For Leadership:
    Excluded Brokers For Replica Move:
    Excluded Topics:
    Intra Broker Data To Move MB:         12
    Monitored Partitions Percentage:      100
    Num Intra Broker Replica Movements:   0
    Num Leader Movements:                 24
    Num Replica Movements:                55
    On Demand Balancedness Score After:   84.23162054159118
    On Demand Balancedness Score Before:  80.87946050436929
    Recent Windows:                       5
  Session Id:  42f7cc24-b295-4406-80c9-c3c7a3625180
The following items are the key parts of the KafkaRebalance request:
-
The Status.Conditions.Status field shows the progress of your rebalance request. If the proposal is not ready yet, you might see ProposalPending as the Status.
-
After the proposal is ready, the Status.Optimization Result field shows a summary of the optimization proposal. The meaning of each of these values is described in the Strimzi documentation.
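Instead of rerunning the describe command by hand, you can poll it until the rebalance reaches a given state. The following is a minimal sketch under several assumptions: it parses the text output of kubectl describe, it accepts a state name on either a Status: or a Type: line because the layout of the condition can differ between Strimzi versions, and the list of state names in the pattern is an assumption that you might need to adjust. The function names and the POLL_SECONDS variable are illustrative.

```shell
#!/bin/sh
# Sketch: extract the rebalance state from `kubectl describe` output.
rebalance_state() {
  # Prints the first known state that appears as the value of a
  # Status: or Type: line. Both spellings of the pending state are accepted.
  awk 'NF == 2 && $1 ~ /^(Status|Type):$/ &&
       $2 ~ /^(ProposalPending|PendingProposal|ProposalReady|Rebalancing|Ready|NotReady|Stopped)$/ {
         print $2; exit
       }'
}

wait_for_state() {
  # $1: KafkaRebalance name, $2: state to wait for,
  # $3: maximum number of attempts (default 30).
  attempts="${3:-30}"
  while [ "$attempts" -gt 0 ]; do
    state="$(kubectl describe kafkarebalance "$1" -n instana-kafka | rebalance_state)"
    [ "$state" = "$2" ] && return 0
    attempts=$((attempts - 1))
    # Pause between attempts; 10 seconds unless POLL_SECONDS is set.
    sleep "${POLL_SECONDS:-10}"
  done
  return 1
}
```

For example, wait_for_state my-rebalance ProposalReady returns 0 as soon as the proposal is ready and returns 1 if the state is not reached within the allowed attempts. The same function can be used later with Ready to wait for an approved rebalance to finish.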
Starting a cluster rebalance
If the KafkaRebalance proposal matches your requirement, then you can apply an approve annotation to the KafkaRebalance resource for Cruise Control to start applying the changes:
kubectl annotate kafkarebalance my-rebalance strimzi.io/rebalance=approve -n instana-kafka
When you apply an approve annotation, Cruise Control applies the changes from the KafkaRebalance proposal. The Cluster operator keeps the status of the KafkaRebalance resource up to date with the progress of the rebalance while it is ongoing. You can check the progress of the rebalance by running the following command:
kubectl describe kafkarebalance my-rebalance -n instana-kafka
When the rebalance is in progress, the Status shows Rebalancing. When the rebalance is finished, the Status shows Ready.
See the following example:
Name:         my-rebalance
Namespace:    instana-kafka
Labels:       strimzi.io/cluster=instana
Annotations:  API Version:  kafka.strimzi.io/v1beta2
Kind:         KafkaRebalance
Metadata:
  # ...
Status:
  Conditions:
    Last Transition Time:  2023-04-04T11:16:06.230479824Z
    Status:                Rebalancing
    Type:                  State
# ...
Stopping a cluster rebalance
Rebalances can take a long time and might impact the performance of your cluster. Therefore, you might want to stop a rebalance if it causes an issue. To stop an ongoing rebalance, you can apply the stop annotation to the KafkaRebalance resource at any time by using the following command:
kubectl annotate kafkarebalance my-rebalance strimzi.io/rebalance=stop -n instana-kafka