IBM Cloud Kubernetes Service Diagnostics and Debug Tool
By: Arpad Kun and Cale Rath
Introducing the Diagnostics and Debug Tool for IBM Cloud Kubernetes Service
At the IBM Cloud Kubernetes Service, we thrive on customer feedback because you help us become better every day. While interacting with our customers, it struck us that support response times and overall time to resolution are much higher than we want them to be. We understand the frustration that comes when something is not working as expected and it is not clear what is wrong, especially when you don’t have visibility into, or a full understanding of, each component of the underlying platform and infrastructure. Our goal is to get you faster resolution and turnaround, ideally without even opening a ticket.
With these goals in mind, we are excited to introduce the IBM Cloud Kubernetes Service Diagnostics and Debug Tool (Beta)—a tool that customers can run in order to collect information and identify potential issues with their IBM Cloud-managed Kubernetes cluster.
At this stage, the Diagnostics and Debug Tool is more of an experiment to determine if it is beneficial for you, the customer. Feedback is more than welcome—both positive and negative—because it will help us guide this project in the right direction.
Why is this tool even needed?
With the IBM Cloud Kubernetes Service, your security is paramount. Although we provide a managed Kubernetes service on IBM Cloud infrastructure, we do not access your clusters, your environment, or your applications. It is fair for customers to ask IBM to just make it work at all times, but the same customers also ask that IBM employees not have access to their environment.
Continuously improving our compliance posture and working toward more certifications has taught us that when customers need help, the fastest way to provide it is through self-serve tools that can run at any time and that let the customer see and control the diagnostic data shared with IBM Cloud support.
How does it work?
The Diagnostics and Debug Tool is installed into an IBM Cloud Kubernetes Service cluster via a Helm chart. It has two components: a DaemonSet that runs on each worker node, and a Deployment that acts as the server side, which you talk to in order to request test runs. The client side is the browser on your laptop. You reach the server side by running the kubectl proxy … command, which creates a private tunnel between your laptop and the server side running in your cluster. The tool cannot be reached from outside the cluster other than through kubectl proxy.
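To make the access path more concrete, here is a minimal Python sketch of how a URL through kubectl proxy is typically formed. It relies on the standard Kubernetes API server service-proxy path and on kubectl proxy listening on 127.0.0.1:8001 by default. The namespace, service name, and port used below are hypothetical placeholders, not the tool's actual values; take the real ones from the chart's installation notes.

```python
from urllib.parse import quote


def service_proxy_url(namespace: str, service: str, port: int,
                      local_port: int = 8001, path: str = "") -> str:
    """Build the URL of a cluster Service as seen through `kubectl proxy`."""
    base = f"http://127.0.0.1:{local_port}"
    return (
        f"{base}/api/v1/namespaces/{quote(namespace, safe='')}"
        f"/services/{quote(service, safe='')}:{port}/proxy/{path.lstrip('/')}"
    )


if __name__ == "__main__":
    # Hypothetical namespace, service name, and port, for illustration only.
    print(service_proxy_url("default", "debug-tool", 8080))
```

Because the proxy binds to the loopback address of your laptop, a URL built this way is only reachable from the machine where kubectl proxy is running.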
You can remove the tool at any time by deleting the Helm chart. Doing so removes everything the tool installed from the cluster.
What does this tool do?
The main purpose of the Diagnostics and Debug Tool is two-fold:
Collect basic information about the cluster's status and details, which can be useful to you and which you can share with IBM Cloud support if necessary. You can review all of the data that is gathered. The tool compresses this data into a zip file in case you wish to send it to IBM Cloud support or use it at a later time. The archive can then be loaded back into the tool or opened directly on the operating system to which it was downloaded.
Run pre-defined tests that we created to catch and identify issues in an automated way.
Figure: The tool running tasks that gather data about the network and Kubernetes, and running some tests.
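The idea of packing collected data into a zip file that can be opened again later can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not describe the tool's real archive layout, so the file names and contents below are hypothetical.

```python
import io
import zipfile
from typing import Dict


def bundle(reports: Dict[str, str]) -> bytes:
    """Compress named text reports into an in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, text in sorted(reports.items()):
            archive.writestr(name, text)
    return buffer.getvalue()


def unbundle(data: bytes) -> Dict[str, str]:
    """Read every member of a zip archive back into a name -> text mapping."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return {name: archive.read(name).decode("utf-8")
                for name in archive.namelist()}


if __name__ == "__main__":
    # Hypothetical report names and contents.
    collected = {"nodes.txt": "node-1 Ready\n", "events.txt": "no events\n"}
    restored = unbundle(bundle(collected))
    print(sorted(restored))
```

A plain zip archive has the property the article relies on: the same file can be attached to a support ticket, reloaded later, or opened with the operating system's own tools.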
Examples of collecting information
Gather information about the Kubernetes cluster’s resources: Pods, Deployments, Services, Ingress, Nodes, Events.
Collect information about the present network settings:
Print the routes on the worker nodes
Collect Calico pods, policies, global network policies, profiles, etc.
Display actual iptables rules deployed to the worker nodes
Show the VLAN, Ingress, and other ConfigMaps
Display information about the ALB and related authentication container versions.
Collect VPN status, routes, rules, NAT information, logs, resources, etc. for strongSwan.
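For a rough sense of what the first item in this list amounts to, the Python sketch below builds one `kubectl get` command per resource kind and optionally runs it against the current kubectl context. It is an approximation and not the tool's implementation: the choice of `-o wide` output and the helper names are assumptions made for illustration, and the network, ALB, and strongSwan items are not covered.

```python
import subprocess
from typing import Dict, Iterable, List

# Resource kinds named in the list above, written as kubectl resource names.
RESOURCES = ["pods", "deployments", "services", "ingress", "nodes", "events"]
# Nodes are cluster-scoped, so a namespace flag does not apply to them.
CLUSTER_SCOPED = {"nodes"}


def kubectl_get_command(resource: str) -> List[str]:
    """Return the argv for listing one resource kind across the cluster."""
    cmd = ["kubectl", "get", resource, "-o", "wide"]
    if resource not in CLUSTER_SCOPED:
        cmd.append("--all-namespaces")
    return cmd


def collect(resources: Iterable[str] = RESOURCES) -> Dict[str, str]:
    """Run each command and keep stdout, or stderr if the command failed."""
    output = {}
    for resource in resources:
        result = subprocess.run(kubectl_get_command(resource),
                                capture_output=True, text=True, check=False)
        output[resource] = result.stdout if result.returncode == 0 else result.stderr
    return output


if __name__ == "__main__":
    for resource in RESOURCES:
        print(" ".join(kubectl_get_command(resource)))
```

Calling `collect()` requires kubectl on the PATH and a working cluster context; the command builder alone can be inspected without either.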
Examples of pre-defined tests
Ping all worker nodes and pods. This test pings all known IPs within a cluster (as well as an external public IP) from a DaemonSet running on each node in the cluster.
Deploy a basic coffee and tea Ingress resource (deploys the complete example in a unique namespace and validates that Ingress works correctly).
Validate Ingress annotation syntax (which kubectl apply accepts without checking).
Validate whether the VLAN configuration associated with the ALB is matching the environment.
Verify the existence of necessary secrets (accidental deletes happen surprisingly often).
Scan for errors in the ALB logs.
Test the strongSwan VPN connection and transport, check the availability of the IPsec ports, etc. These tests only produce results if the strongSwan VPN Helm chart is installed.
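As one example of what such a pre-defined test can look like, the Python sketch below checks for missing secrets by comparing a list of required names against the output of `kubectl get secrets -o json`. It is a simplified illustration, not the tool's own check, and the secret names in it are hypothetical.

```python
import json
from typing import Iterable, List, Set


def secret_names(kubectl_json: str) -> Set[str]:
    """Extract secret names from `kubectl get secrets -o json` output."""
    document = json.loads(kubectl_json)
    return {item["metadata"]["name"] for item in document.get("items", [])}


def missing_secrets(required: Iterable[str], kubectl_json: str) -> List[str]:
    """Return the required secret names that are absent, sorted by name."""
    return sorted(set(required) - secret_names(kubectl_json))


if __name__ == "__main__":
    # Hypothetical kubectl output containing a single secret.
    sample = json.dumps({"items": [{"metadata": {"name": "registry-credentials"}}]})
    # Hypothetical list of secrets the cluster is expected to have.
    print(missing_secrets(["registry-credentials", "tls-certificate"], sample))
```

A check like this reports an accidental delete by name, which is quicker than tracing the failure back from a symptom such as a pod that cannot pull its image.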
How can I use it?
Installing the Diagnostics and Debug Tool is straightforward. In its current form, it is packaged as a Helm chart that you can install from the official IBM Cloud Helm repository. Please visit the following link to install the IBM Cloud Kubernetes Service Diagnostics and Debug Tool.
We hope this tool will accelerate and simplify the process of finding issues and get you to faster resolution of problems. Please give us feedback and tell us what tests you would like to see in the future.
If you have questions, engage our team via Slack by registering here and joining the discussion in the #general channel on our public IBM Cloud Kubernetes Service Slack.