Debugging and troubleshooting
Collect cluster information and debugging logs to troubleshoot issues with Standard Edition.
Note: In Self-Hosted Standard Edition 1.10.3 and earlier versions:
- For online (non-air-gapped) installations, most stanctl lifecycle commands, such as stanctl up, fail to run.
- For air-gapped installations, the stanctl commands continue to work.

Required action: Upgrade stanctl to 1.10.4 or later before you perform a lifecycle operation.

In stanctl 1.10.3 or earlier versions, any workflow that stops services before a backup, such as a stanctl down command, cannot complete because the subsequent stanctl up command fails. Upgrade stanctl to 1.10.4 or later before you start these steps.

Collect information
Create an archive file with information about your cluster. You can use the information in the file to troubleshoot issues, or share the file with the support team.
The archive file collects the following information:
- Container logs
- Resource manifests (in YAML format)
- stanctl logs
- System information, including memory and CPU usage
- Disk mounts and their usage
- Open files (allocated, free, and maximum)
- Backend logs
Use the following command to create the archive file:
stanctl debug
After you run the command, output similar to the following messages is displayed. When Done! appears, the archive file is ready.
./stanctl debug
⠼ Streaming container logs [26s] ✓
⠸ Gathering resource manifests [27s] ✓
⠋ Gathering stanctl config files [0s] ✓
⠋ Gathering system information [0s] ✓
⠹ Creating tar file [0s] ✓
----------------------
Done!
Debug package -> debug_20231027111737
Compressed debug package -> debug_20231027111737.tar.gz
----------------------
Adjust log level for Instana components
To adjust the log level for Instana components, complete the following steps:
- Edit the Core custom values file, for example, $HOME/.stanctl/values/instana-core/custom-values.yaml.
- Configure the component's log level in the Core or Unit custom resource (CR). In the following example, the log level is changed to DEBUG for the butler component:

componentConfigs:
  - name: butler
    env:
      - name: COMPONENT_LOGLEVEL
        # Possible values are DEBUG, INFO, WARN, ERROR (not case-sensitive)
        value: DEBUG

- Apply the custom values by running the following command:

stanctl backend apply

- View the logs by running the following command:

kubectl logs <component name> -n instana-core

<component name> is the name of the component that you want to troubleshoot.
Troubleshoot
Resolve these issues.
Instana agent is not displayed in the UI
After you delete the Instana agent that was configured for remote monitoring and install the Instana agent for self monitoring, the agent might not be displayed on the Instana UI.
The agent might be trying to connect to the remote Instana backend instead of the local Instana backend.
To resolve this issue, install the agent and specify the backend endpoint host and an agent key:
stanctl agent apply --agent-cluster-name <cluster-name> --agent-endpoint-host acceptor.instana-core --agent-endpoint-port 8600 --agent-zone-name <zone-name> --agent-key <agent-key-of-local-backend>
Instana backend becomes non‑functional when the Elasticsearch data disk exceeds 85% usage
Elasticsearch automatically switches its data store to read-only mode when the disk that it uses exceeds 85% usage, which causes the Instana backend to stop functioning. Note: Other Instana disks do not trigger read-only behavior at similar usage levels (even above 95%), which can make this issue confusing to diagnose.

To restore normal operation, complete one of the following actions:
- Free up space on the Elasticsearch data disk.
- Increase the disk size that is allocated to Elasticsearch.
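As a quick preventive check, you can compare the data disk's usage with the 85% threshold before the backend degrades. The following sketch assumes /var/lib/elasticsearch as the mount point; substitute the path that backs your Elasticsearch data volume.

```shell
# Hypothetical mount point for the Elasticsearch data volume; adjust as needed.
ES_DATA_DIR=/var/lib/elasticsearch

# Extract the usage percentage of the filesystem that backs the data directory.
usage=$(df --output=pcent "$ES_DATA_DIR" 2>/dev/null | tail -1 | tr -dc '0-9')

if [ "${usage:-0}" -ge 85 ]; then
  echo "WARNING: Elasticsearch data disk at ${usage}% usage"
else
  echo "OK: Elasticsearch data disk at ${usage}% usage"
fi
```

Note that after you free up space, some Elasticsearch versions keep affected indices read-only until the index.blocks.read_only_allow_delete index setting is cleared; check the behavior of the Elasticsearch version that ships with your backend.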
Instana backend upgrade fails due to corrupt Helm chart installation
The Instana backend upgrade fails after you run the stanctl backend apply command. You might see the following error:
Error: another operation (install/upgrade/rollback) is in progress
In the console.log file, you might see information similar to the following entries:
ts=2025-05-26T12:26:09Z level=INFO msg="upgrading Helm chart" name=instana-core release=instana-core version=1.8.1 namespace=instana-core
ts=2025-05-26T12:26:09Z level=DEBUG msg="preparing upgrade for instana-core"
This issue indicates a corrupt Helm chart installation of the current core chart. To reset it, complete the following steps:
- Delete the old Helm chart secret from the instana-core namespace:

kubectl delete secret -n instana-core -l owner=helm

- Upgrade the backend:

stanctl up
Host agent cannot connect to the Instana backend on SLES hosts
After you install the host agent on the local host on SUSE Linux Enterprise Server (SLES) 15 SP5 hosts for self monitoring, the agent does not automatically connect to the Instana backend.
You must use the agent external URL to connect to the backend as a remote host.
Use the following command:
stanctl agent apply --agent-endpoint-host agent-acceptor.<base_domain> --agent-endpoint-port 8443
Kafka pods show CrashLoopBackOff status
Kafka pods do not restart after a shutdown of the Instana backend host. You might see a CrashLoopBackOff status of the Kafka pods.
To resolve the issue, restart the Instana backend.
- Shut down the backend:

stanctl down

- Start the backend:

stanctl up
After the backend is restarted, check the status of Kafka pods.
kubectl get pods --all-namespaces | grep kafka
The Kafka pod status should show as Running.
Scheduled Synthetic tests are not running after Instana backup and restore
After Instana backend and agent data are restored, the scheduled Synthetic tests are not running.
To resolve this issue, restart the synthetic-pop-controller pod on the cluster where it is installed.
Standard Edition installation on RHEL 9.3 fails
Red Hat® Enterprise Linux® (RHEL) 9.3 uses iptables 1.8.8. If you are installing Standard Edition on RHEL 9.3, the installation might fail because of iptables 1.8.8. To work around the issue, upgrade your host to RHEL 9.4, which also upgrades iptables to version 1.8.10.
Upgrade fails on Standard Edition 1.9.x
When you upgrade Standard Edition 1.9.x to a later version, you might encounter the following error:
Error: installation failed for prerequisite app coredns: Unable to continue with install: ConfigMap "coredns" in namespace "kube-system" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "coredns"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kube-system"
To resolve this issue, run the stanctl up command again.
Systemd does not set a default working directory
When stanctl is started by a systemd service, systemd does not set a working directory on its own. If you do not provide a working directory, systemd runs the service from /. This behavior can cause stanctl to create files, such as cluster data, .stanctl, or Kubernetes configs, in the wrong place (often / or /root/), even if the service runs as a non-root user.
To mitigate the issue, add a WorkingDirectory= line to the systemd service so that files are created in the user's home directory. For example, WorkingDirectory=/home/instana.
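A minimal illustration of the fix, assuming a service that runs stanctl as a hypothetical instana user (the user name, binary path, and home directory are examples, not values from your system):

```ini
# Illustrative fragment of a systemd unit that runs stanctl.
[Service]
User=instana
ExecStart=/usr/local/bin/stanctl up
# Without this line, systemd runs the service from / and stanctl
# creates cluster data and configs relative to that directory.
WorkingDirectory=/home/instana
```

After you edit the unit file, run systemctl daemon-reload and restart the service so that the new working directory takes effect.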
Unable to update the license
When you run the stanctl license update command, the command might fail with the following error message:
...
no dependency found: 'instana-core'
...
Run the following commands to update the license:
stanctl license download --sales-key=<your-key>
stanctl backend apply
Instana backend upgrade fails due to node disk pressure
An Instana backend installation or upgrade might fail when the node experiences disk pressure.
Symptoms
- The backend installation or upgrade fails.
- Pods remain pending.
- Some workloads show ContainerStatusUnknown.
Cause
During installation, upgrade, or air‑gapped package import, disk usage increases temporarily as container images and artifacts are processed. If the node runs out of disk space, Kubernetes sets a DiskPressure condition and prevents new pods from starting.
Verification
- Run the following command to check the node condition:

kubectl describe node <node-name>

In the Conditions section, check the DiskPressure condition.
- Run the following command to check disk usage:

df -h

Verify whether disk usage is close to or at capacity.
Solution
- Remove unused container images or unnecessary files to free disk space.
- Increase the storage capacity of the node.
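Before you delete anything, it can help to see where the space is going. The following generic sketch uses /var/lib only as an assumed starting point; container images commonly live under paths such as /var/lib/containerd or /var/lib/rancher, depending on your container runtime.

```shell
# List the largest directories under /var/lib to identify cleanup candidates.
# -x stays on one filesystem so that network mounts are not scanned.
du -x -h --max-depth=2 /var/lib 2>/dev/null | sort -rh | head -10
```

Review the output carefully and remove only files that you are certain the backend does not need.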
Recovery
If workloads remain in the ContainerStatusUnknown state after you recover disk space, reboot the node. After the reboot completes, retry the installation or upgrade.
License is invalid or missing
If the license is invalid or missing, the backend prevents agents from connecting.
When this occurs
- The imported license is invalid.
- The Instana Operator cannot apply the license to the Groundskeeper backend.
How to troubleshoot
- Verify that the Sales Key in the core secret matches the license strings in the unit secret. If they differ, download the license again by using the correct Sales Key.
- Check the Instana Operator logs for license import errors:

kubectl logs -n instana-operator deployment/instana-operator --tail=100

- Check the Groundskeeper backend component, pod status, and logs:

kubectl get pods -n instana-core | grep groundskeeper

- If the license still shows an invalid state, contact IBM Support.
Contact support
If you are unable to resolve the issue, contact IBM Support and provide the archive file that you created.