Log Analysis - Basic Kafka Troubleshooting
Kafka is an important component of the new Log Analysis scalable data collection architecture.
However, for those who are unfamiliar with it, Kafka can be a challenge to troubleshoot, especially when it comes to the question of whether data has actually been loaded into it.
If you want to familiarize yourself with the technology, you can always visit its official website.
However, if you just want a quick way to check whether data from the Logstash Receiver has been loaded into Kafka, this blog entry should fulfill your needs.
***
First of all, let's inspect what's available in the "bin" directory after installing Kafka:
[danielyeap@kafka1 bin]$ pwd
/opt/kafka_2.11-0.10.0.0/bin
[danielyeap@kafka1 bin]$ ls -l
total 112
-rwxr-xr-x 1 danielyeap danielyeap 1052 May 18 2016 connect-distributed.sh
-rwxr-xr-x 1 danielyeap danielyeap 1051 May 18 2016 connect-standalone.sh
-rwxr-xr-x 1 danielyeap danielyeap 861 May 18 2016 kafka-acls.sh
-rwxr-xr-x 1 danielyeap danielyeap 864 May 18 2016 kafka-configs.sh
-rwxr-xr-x 1 danielyeap danielyeap 945 May 18 2016 kafka-console-consumer.sh
-rwxr-xr-x 1 danielyeap danielyeap 944 May 18 2016 kafka-console-producer.sh
-rwxr-xr-x 1 danielyeap danielyeap 871 May 18 2016 kafka-consumer-groups.sh
-rwxr-xr-x 1 danielyeap danielyeap 872 May 18 2016 kafka-consumer-offset-checker.sh
-rwxr-xr-x 1 danielyeap danielyeap 948 May 18 2016 kafka-consumer-perf-test.sh
-rwxr-xr-x 1 danielyeap danielyeap 862 May 18 2016 kafka-mirror-maker.sh
-rwxr-xr-x 1 danielyeap danielyeap 886 May 18 2016 kafka-preferred-replica-election.sh
-rwxr-xr-x 1 danielyeap danielyeap 959 May 18 2016 kafka-producer-perf-test.sh
-rwxr-xr-x 1 danielyeap danielyeap 874 May 18 2016 kafka-reassign-partitions.sh
-rwxr-xr-x 1 danielyeap danielyeap 868 May 18 2016 kafka-replay-log-producer.sh
-rwxr-xr-x 1 danielyeap danielyeap 874 May 18 2016 kafka-replica-verification.sh
-rwxr-xr-x 1 danielyeap danielyeap 6358 May 18 2016 kafka-run-class.sh
-rwxr-xr-x 1 danielyeap danielyeap 1364 May 18 2016 kafka-server-start.sh
-rwxr-xr-x 1 danielyeap danielyeap 986 Mar 25 13:42 kafka-server-stop.sh
-rwxr-xr-x 1 danielyeap danielyeap 870 May 18 2016 kafka-simple-consumer-shell.sh
-rwxr-xr-x 1 danielyeap danielyeap 863 May 18 2016 kafka-topics.sh
-rwxr-xr-x 1 danielyeap danielyeap 958 May 18 2016 kafka-verifiable-consumer.sh
-rwxr-xr-x 1 danielyeap danielyeap 958 May 18 2016 kafka-verifiable-producer.sh
drwxr-xr-x 2 danielyeap danielyeap 4096 May 18 2016 windows
-rwxr-xr-x 1 danielyeap danielyeap 867 May 18 2016 zookeeper-security-migration.sh
-rwxr-xr-x 1 danielyeap danielyeap 1381 May 18 2016 zookeeper-server-start.sh
-rwxr-xr-x 1 danielyeap danielyeap 978 May 18 2016 zookeeper-server-stop.sh
-rwxr-xr-x 1 danielyeap danielyeap 968 May 18 2016 zookeeper-shell.sh
[danielyeap@kafka1 bin]$
There are three important tools provided here:
(1) zookeeper-shell.sh
(2) kafka-topics.sh
(3) kafka-console-consumer.sh
Now, let's explore the tools...
(1) You can use "zookeeper-shell.sh" to check whether all the Kafka brokers are running and registered properly.
[danielyeap@kafka1 bin]$ ./zookeeper-shell.sh kafka1:2181
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
Connecting to kafka1:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
ls /
[controller_epoch, controller, brokers, zookeeper, admin, isr_change_notification, consumers, config]
ls /brokers
[ids, topics, seqid]
ls /brokers/ids
[0, 1]
The IDs returned by the command "ls /brokers/ids" represent the Kafka brokers that are currently registered with ZooKeeper; each ID corresponds to the broker.id value in that broker's configuration file.
For example:
[danielyeap@kafka1 config]$ pwd
/opt/kafka_2.11-0.10.0.0/config
[danielyeap@kafka1 config]$ grep -i broker.id server_0.properties
broker.id=0
[danielyeap@kafka1 config]$
[danielyeap@kafka2 config]$ pwd
/opt/kafka_2.11-0.10.0.0/config
[danielyeap@kafka2 config]$ grep -i broker.id server_1.properties
broker.id=1
[danielyeap@kafka2 config]$
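If you prefer to script this check rather than type into the interactive shell, the same ZooKeeper path shown above can be queried programmatically. The sketch below uses the third-party kazoo client (an assumption on my part; it is not part of the Log Analysis stack), plus a small helper that flags any configured broker missing from ZooKeeper:

```python
def fetch_broker_ids(zk_hosts):
    """Return the broker IDs registered under /brokers/ids.

    Requires the third-party 'kazoo' ZooKeeper client (an assumption);
    imported lazily so the pure helper below works without it.
    """
    from kazoo.client import KazooClient
    zk = KazooClient(hosts=zk_hosts)
    zk.start()
    try:
        # Same path as "ls /brokers/ids" in zookeeper-shell.sh
        return sorted(zk.get_children("/brokers/ids"))
    finally:
        zk.stop()

def missing_brokers(registered_ids, expected_ids):
    """Compare IDs found in ZooKeeper against the broker.id values
    you configured (0 and 1 in this post's setup)."""
    return sorted(set(expected_ids) - set(registered_ids))
```

With both brokers up, `missing_brokers(fetch_broker_ids("kafka1:2181"), ["0", "1"])` should return an empty list.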
(2) Now that you have confirmed that all your Kafka brokers are running properly, let's check whether the topic you created is available.
[danielyeap@kafka1 bin]$ pwd
/opt/kafka_2.11-0.10.0.0/bin
[danielyeap@kafka1 bin]$ ./kafka-topics.sh --zookeeper kafka1:2181 --list
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
test
[danielyeap@kafka1 bin]$
Another way to verify whether the topic is available is through "zookeeper-shell.sh":
[danielyeap@kafka1 bin]$ pwd
/opt/kafka_2.11-0.10.0.0/bin
[danielyeap@kafka1 bin]$ ./zookeeper-shell.sh kafka1:2181
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
Connecting to kafka1:2181
Welcome to ZooKeeper!
JLine support is disabled
WATCHER::
WatchedEvent state:SyncConnected type:None path:null
ls /brokers/topics
[test]
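This topic check can also be scripted against the brokers themselves. As a sketch, assuming the third-party kafka-python package (not mentioned in this post), a throwaway consumer can report the topics the cluster knows about:

```python
def broker_topics(bootstrap_servers):
    """Return the set of topic names known to the cluster.

    Uses the third-party 'kafka-python' package (an assumption);
    imported lazily so the pure helper below works without it.
    """
    from kafka import KafkaConsumer
    consumer = KafkaConsumer(bootstrap_servers=bootstrap_servers)
    try:
        return consumer.topics()
    finally:
        consumer.close()

def topic_available(topics, name):
    """True when the topic you created (e.g. 'test') is in the list."""
    return name in set(topics)
```

Note that this connects to a broker, not to ZooKeeper, so you would call something like `broker_topics("kafka1:9092")` (9092 being Kafka's default broker port) and then `topic_available(..., "test")`.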
(3) Lastly, let's check whether there is any data in the topic.
[danielyeap@kafka1 bin]$ pwd
/opt/kafka_2.11-0.10.0.0/bin
[danielyeap@kafka1 bin]$ ./kafka-console-consumer.sh --topic test --from-beginning --zookeeper kafka1:2181
OpenJDK 64-Bit Server VM warning: If the number of processors is expected to increase from one, then you should configure the number of parallel GC threads appropriately using -XX:ParallelGCThreads=N
...
{"message":"2017-06-24T21:50:01.747018+08:00 host=rec1, relayHost=rec1, tag=systemd:, programName=systemd, procid=-, facility=daemon, sev=info, appName=systemd, msg=Starting Session 318 of user root.","@version":"1","@timestamp":"2017-06-24T13:50:01.870Z","path":"/var/log/syslog-la.log","host":"rec1.ibmtest.com","type":"syslog","datasource":"test","resourceID":"%{app-name}_1"}
{"message":"2017-06-24T21:50:01.760178+08:00 host=rec1, relayHost=rec1, tag=CROND[69846]:, programName=CROND, procid=69846, facility=cron, sev=info, appName=CROND, msg=(root) CMD (/usr/lib64/sa/sa1 1 1)","@version":"1","@timestamp":"2017-06-24T13:50:01.870Z","path":"/var/log/syslog-la.log","host":"rec1.ibmtest.com","type":"syslog","datasource":"test","resourceID":"%{app-name}_1"}
{"message":"2017-06-24T21:50:01.807267+08:00 host=rec1, relayHost=rec1, tag=systemd:, programName=systemd, procid=-, facility=daemon, sev=info, appName=systemd, msg=Removed slice user-0.slice.","@version":"1","@timestamp":"2017-06-24T13:50:01.870Z","path":"/var/log/syslog-la.log","host":"rec1.ibmtest.com","type":"syslog","datasource":"test","resourceID":"%{app-name}_1"}
{"message":"2017-06-24T21:50:01.811853+08:00 host=rec1, relayHost=rec1, tag=systemd:, programName=systemd, procid=-, facility=daemon, sev=info, appName=systemd, msg=Stopping user-0.slice.","@version":"1","@timestamp":"2017-06-24T13:50:01.870Z","path":"/var/log/syslog-la.log","host":"rec1.ibmtest.com","type":"syslog","datasource":"test","resourceID":"%{app-name}_1"}
...
If the command returns data, the topic contains data. Otherwise, you will have to check the logs of the Logstash Receiver to determine whether it actually sent data to Kafka.
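Each record the console consumer prints is a JSON document, so a quick way to sanity-check what the Logstash Receiver loaded is to parse one of those lines. A minimal sketch using only the standard library (the sample record below is abbreviated from the consumer output above):

```python
import json

def summarize_record(line):
    """Pull the fields most useful for troubleshooting out of one
    consumed record: where it came from and when it was indexed."""
    record = json.loads(line)
    return {
        "host": record.get("host"),
        "path": record.get("path"),
        "datasource": record.get("datasource"),
        "timestamp": record.get("@timestamp"),
    }

# Abbreviated sample in the same shape as the consumer output above
sample = ('{"message":"...","@version":"1",'
          '"@timestamp":"2017-06-24T13:50:01.870Z",'
          '"path":"/var/log/syslog-la.log","host":"rec1.ibmtest.com",'
          '"type":"syslog","datasource":"test","resourceID":"%{app-name}_1"}')
print(summarize_record(sample))
```

Piping the consumer's output through a script like this makes it easy to confirm that the datasource and host fields match the source you expect.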
Hope that helps!