Debugging and troubleshooting
Collect cluster information and debugging logs to troubleshoot issues with Standard Edition.
Self-Hosted Standard Edition 1.10.3 and earlier versions:
- For online (non–air‑gapped) installations, most stanctl lifecycle commands, such as stanctl up, fail to run.
- For air‑gapped installations, the stanctl commands continue to work.
Required action: Upgrade stanctl to 1.10.4 or later versions before you perform a lifecycle operation.
With stanctl 1.10.3 or earlier versions, any workflow that stops services before a backup (for example, by running stanctl down) cannot complete because the subsequent stanctl up command fails. Upgrade stanctl to 1.10.4 or later before you start these steps.
Collect information
Create an archive file with information about your cluster. You can use the information in the file to troubleshoot issues, or share the file with the support team.
The archive file collects the following information:
- Container logs
- Resource manifests (in YAML format)
- stanctl logs
- System information that includes memory, CPU, and CPU usage
- Disk mounts and their usage
- Open files (allocated, free, and maximum)
- Backend logs
Use the following command to create the archive file:
stanctl debug
After you run the command, you see the following messages. When you see Done! in the messages, it means that your archive file is ready.
./stanctl debug
⠼ Streaming container logs [26s] ✓
⠸ Gathering resource manifests [27s] ✓
⠋ Gathering stanctl config files [0s] ✓
⠋ Gathering system information [0s] ✓
⠹ Creating tar file [0s] ✓
----------------------
Done!
Debug package -> debug_20231027111737
Compressed debug package -> debug_20231027111737.tar.gz
----------------------
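If you run stanctl debug more than once, several archives can accumulate in the same directory. The following sketch picks the most recent one by name. It assumes that the number in the file name is a fixed-width timestamp, as the sample output suggests; latest_debug_archive is a hypothetical helper name, not part of stanctl.

```shell
# Print the newest debug archive in a directory (default: current directory).
# Assumption: archive names embed a fixed-width timestamp
# (debug_<timestamp>.tar.gz), so a lexical sort also orders them by time.
latest_debug_archive() {
  dir=${1:-.}
  for f in "$dir"/debug_*.tar.gz; do
    # Skip the unexpanded glob when no archive exists.
    [ -e "$f" ] && printf '%s\n' "$f"
  done | sort | tail -n 1
}

# Usage:
#   archive=$(latest_debug_archive .)
#   tar -tzf "$archive" | head -n 20    # list the first entries before sharing
```

Listing the contents with tar -tzf before you share the file is a quick way to confirm that the archive is complete and readable.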
Adjust log level for Instana components
To adjust the log level for Instana components, complete the following steps:
- Edit the Core Config file, for example, $HOME/.stanctl/values/instana-core/custom-values.yaml.
- Configure a component’s log level in the Core or Unit CR. In the following example, the log level is changed to DEBUG for the butler component:
  componentConfigs:
    - name: butler
      env:
        - name: COMPONENT_LOGLEVEL
          # Possible values are DEBUG, INFO, WARN, ERROR (not case-sensitive)
          value: DEBUG
- Apply the custom values by running the following command:
  stanctl backend apply
- View the logs by running the following command:
  kubectl logs <component name> -n instana-core
  <component name> is the name of the component that you want to troubleshoot.
Secure Sockets Layer (SSL) certificates
Understand supported configurations, limitations, and troubleshooting steps related to SSL certificates.
Wildcard SSL certificates
You can use wildcard SSL certificates with Instana Standard Edition. A few wildcard configurations are unsupported, and specific deployment patterns require more consideration.
Example Scenario
Assume the following DNS structure:
- Wildcard certificate: *.company.com
- Instana backend: instana.company.com
- Agent acceptor endpoint: agent-acceptor.instana.company.com
- Instana UI: unit-tenant.instana.company.com
Limitations of single-level wildcards
You cannot use a single-level wildcard certificate, such as *.company.com, in this scenario, because the agent acceptor and UI hostnames sit two levels below company.com.
Reason: By design, an asterisk (*) can replace only one label in a DNS name. It cannot span multiple subdomain levels.
Certificate matching rules
Certificate *.company.com matches:
- www.company.com
- api.company.com
- mail.company.com
Certificate *.company.com does not match:
- a.b.company.com
- dev.api.company.com
- company.com
A dot (.) separates DNS labels. The wildcard replaces exactly one label.
To support the Instana deployment, use one of the following options:
- Create a wildcard certificate for the full Instana base domain: *.instana.company.com
- Use a SAN certificate that lists all required hostnames.
  DNS: api.example.com
  DNS: dev.api.example.com
  DNS: prod.api.example.com
- Combine multiple wildcard entries within a SAN certificate:
  DNS: *.example.com
  DNS: *.api.example.com
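The matching rules above can be sketched as a small shell function for checking a planned hostname against a certificate name before you request the certificate. This is an illustrative sketch, not the validation logic of any TLS library: cert_name_matches is a hypothetical helper, and it assumes lowercase input and a wildcard only in the leftmost label.

```shell
# Return 0 if HOST is covered by PATTERN, where PATTERN is either an exact
# DNS name or a single-label wildcard such as *.company.com.
# Assumption: both arguments are lowercase.
cert_name_matches() {
  pattern=$1
  host=$2
  case $pattern in
    '*.'*)
      suffix=${pattern#'*'}            # "*.company.com" -> ".company.com"
      case $host in
        *"$suffix") label=${host%"$suffix"} ;;
        *) return 1 ;;
      esac
      # The wildcard must stand for exactly one non-empty label (no dots).
      case $label in
        ''|*.*) return 1 ;;
        *) return 0 ;;
      esac
      ;;
    *)
      [ "$pattern" = "$host" ]
      ;;
  esac
}

# Usage:
#   cert_name_matches '*.company.com' agent-acceptor.instana.company.com \
#     || echo "not covered"
```

Under these rules, *.instana.company.com covers agent-acceptor.instana.company.com and unit-tenant.instana.company.com but not instana.company.com itself, so the base domain likely needs its own SAN entry. To see which names an existing certificate carries, openssl x509 -in <cert-file> -noout -ext subjectAltName prints the SAN list on OpenSSL 1.1.1 or later.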
Restore the self-signed certificate
You can restore a previously removed self-signed TLS certificate.
- Delete the existing TLS secret:
  kubectl delete secret instana-tls -n instana-core
- Generate a new self-signed certificate:
  stanctl be apply --core-tls-generate-cert
Result:
- A new self-signed SSL certificate is generated and applied.
- TLS encryption is enabled for Instana endpoints.
- Modern browsers (such as Chrome or Firefox) display security warnings because the certificate is not issued by a trusted certificate authority.
Important: Although browsers mark the connection as untrusted, all communication remains encrypted.
Summary
- Single-level wildcard certificates (for example, *.company.com) do not support multi-level subdomains.
- Instana Standard commonly requires SAN certificates or base-domain wildcards.
- You can safely regenerate self-signed certificates when needed, with expected browser warnings.
Troubleshoot
Resolve these issues.
Instana agent is not displayed in the UI
After you delete the Instana agent that was configured for remote monitoring and install the Instana agent for self monitoring, the agent might not be displayed on the Instana UI.
The agent might be trying to connect to the remote Instana backend instead of the local Instana backend.
To resolve this issue, install the agent and specify the backend endpoint host and an agent key:
stanctl agent apply --agent-cluster-name <cluster-name> --agent-endpoint-host acceptor.instana-core --agent-endpoint-port 8600 --agent-zone-name <zone-name> --agent-key <agent-key-of-local-backend>
Instana backend becomes non‑functional when the Elasticsearch data disk exceeds 85% usage
Elasticsearch automatically switches its data store to read‑only mode when the disk that it uses exceeds 85% usage. This causes the Instana backend to stop functioning.
Note: Other Instana disks do not trigger read‑only behavior at similar usage levels (even above 95%), which can make this issue confusing to diagnose.
To restore normal operations, complete one of the following actions:
- Free up space on the Elasticsearch data disk
- Increase the disk size allocated to Elasticsearch
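A quick way to confirm this condition is to compare the usage that df reports for the Elasticsearch data disk against the 85% limit. The following is a minimal sketch; df_use_percent and over_threshold are hypothetical helper names, and ES_DATA_DIR stands in for the Elasticsearch data mount on your host, which this page does not specify.

```shell
# Read `df -P` output on stdin and print the usage percentage (digits only)
# of the first listed file system.
df_use_percent() {
  awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Read a percentage on stdin and return 0 if it exceeds the threshold
# (default 85, the limit described above).
over_threshold() {
  threshold=${1:-85}
  read -r used
  [ "$used" -gt "$threshold" ]
}

# Usage (ES_DATA_DIR is a placeholder that you must set yourself):
#   df -P "$ES_DATA_DIR" | df_use_percent | over_threshold 85 \
#     && echo "Elasticsearch data disk is above 85%"
```

The -P option keeps each file system on a single line, which makes the fifth column safe to parse even when device names are long.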
Instana backend upgrade fails due to corrupt Helm chart installation
The Instana backend upgrade fails after you run the stanctl backend apply command. You might see the following error:
Error: another operation (install/upgrade/rollback) is in progress
In the console.log file, you might see information similar to the following entries:
ts=2025-05-26T12:26:09Z level=INFO msg="upgrading Helm chart" name=instana-core release=instana-core version=1.8.1 namespace=instana-core
ts=2025-05-26T12:26:09Z level=DEBUG msg="preparing upgrade for instana-core"
This issue indicates a corrupt Helm chart installation of the current core chart. To reset the installation, complete the following steps:
- Delete the old Helm chart secret from the instana-core namespace.
  kubectl delete secret -n instana-core -l owner=helm
- Upgrade the backend.
  stanctl up
Host agent cannot connect to the Instana backend on SLES hosts
After you install the host agent on the local host on SUSE Linux Enterprise Server (SLES) 15 SP5 hosts for self monitoring, the agent does not automatically connect to the Instana backend.
You must use the agent external URL to connect to the backend as a remote host.
Use the following command:
stanctl agent apply --agent-endpoint-host agent-acceptor.<base_domain> --agent-endpoint-port 8443
Kafka pods show CrashLoopBackOff status
Kafka pods do not restart after a shutdown of the Instana backend host. You might see a CrashLoopBackOff status of the Kafka pods.
To resolve the issue, restart the Instana backend.
- Shut down the backend.
  stanctl down
- Start the backend.
  stanctl up
After the backend is restarted, check the status of Kafka pods.
kubectl get pods --all-namespaces | grep kafka
The Kafka pod status should show as Running.
Scheduled Synthetic tests are not running after Instana backup and restore
After Instana backend and agent data are restored, the scheduled Synthetic tests are not running.
To resolve this issue, restart the synthetic-pop-controller pod on the cluster where it is installed.
Standard Edition installation on RHEL 9.3 fails
Red Hat® Enterprise Linux® 9.3 uses iptables 1.8.8.
If you are installing Standard Edition on RHEL 9.3, the installation might fail due to iptables 1.8.8.
To work around the issue, upgrade your host to RHEL 9.4, which also upgrades iptables to version 1.8.10.
Upgrade fails on Standard Edition 1.9.x
When you upgrade Standard Edition 1.9.x to a later version, you might encounter the following error:
Error: installation failed for prerequisite app coredns: Unable to continue with install: ConfigMap "coredns" in namespace "kube-system" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "coredns"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "kube-system"
To resolve this issue, run the stanctl up command again.
Upgrade fails with "Insufficient CPU" or "Insufficient memory" errors
You might experience any of the following issues during upgrades on single-node K3s deployments with limited resources or in environments where adding temporary capacity is not feasible:
- Errors such as "Insufficient CPU" or "Insufficient memory"
- Pods that remain in a Pending state
The default RollingUpdate strategy requires temporary additional capacity to run both old and new pods simultaneously during the upgrade. On systems with limited resources, this requirement can exceed the available CPU or memory, even when overall utilization appears low.
To work around this issue, run the upgrade with the Recreate update strategy:
stanctl up --core-update-strategy=Recreate
For more information on update strategy, see Configuring update strategy for upgrades.
The Recreate strategy causes brief downtime (several minutes) during the upgrade process. This tradeoff is typically acceptable for nonproduction environments or systems where adding hardware capacity is not feasible.
Systemd does not set a default working directory
When stanctl is started by a systemd service, systemd does not set a working directory on its own. If you do not provide a working directory, systemd runs the service from /. As a result, stanctl can create files, such as cluster data, .stanctl, or Kubernetes configs, in the wrong place (often / or /root/), even if the service uses a non‑root user.
To mitigate the issue, add a WorkingDirectory= line to the [Service] section of the systemd service so that files are created in the home directory of the service user. For example, WorkingDirectory=/home/instana.
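One way to add the setting without editing the original unit file is a systemd drop-in. The following sketch writes such a drop-in; write_workdir_dropin is a hypothetical helper, <service> is a placeholder for the name of your own unit, and /home/instana is the example directory from the text above.

```shell
# Write a drop-in file that sets WorkingDirectory= for a service.
#   $1: drop-in directory, for example /etc/systemd/system/<service>.service.d
#   $2: working directory, for example /home/instana
write_workdir_dropin() {
  dropin_dir=$1
  workdir=$2
  mkdir -p "$dropin_dir" &&
    printf '[Service]\nWorkingDirectory=%s\n' "$workdir" \
      > "$dropin_dir/working-directory.conf"
}

# Usage (run as root; <service> is a placeholder):
#   write_workdir_dropin /etc/systemd/system/<service>.service.d /home/instana
#   systemctl daemon-reload
#   systemctl restart <service>
#   systemctl show -p WorkingDirectory <service>   # confirm the new value
```

A drop-in survives package updates that replace the main unit file, which is why it is used here instead of an in-place edit.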
Unable to update the license
When you run the stanctl license update command, the command might fail with the following error message:
...
no dependency found: 'instana-core'
...
Run the following commands to update the license:
stanctl license download --sales-key=<your-key>
stanctl backend apply
Instana backend upgrade fails due to node disk pressure
An Instana backend installation or upgrade might fail when the node experiences disk pressure.
Symptoms
- The backend installation or upgrade fails.
- Pods remain pending.
- Some workloads show ContainerStatusUnknown.
Cause
During installation, upgrade, or air‑gapped package import, disk usage increases temporarily as container images and artifacts are processed. If the node runs out of disk space, Kubernetes sets a DiskPressure condition and prevents new pods from starting.
Verification
- Run the following command to check the node condition:
  kubectl describe node <node-name>
  In the Conditions section, check the DiskPressure condition.
- Run the following command to check disk usage:
  df -h
  Verify whether disk usage is close to or at capacity.
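If you prefer a scriptable check over reading the describe output, the node condition can also be queried directly with a kubectl JSONPath expression. The following is a sketch under the assumption that kubectl is configured for the cluster; node_disk_pressure and disk_pressure_cleared are hypothetical helper names.

```shell
# Print the DiskPressure condition status ("True", "False", or "Unknown")
# of the node named in $1.
node_disk_pressure() {
  kubectl get node "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
}

# Return 0 once the node no longer reports disk pressure.
disk_pressure_cleared() {
  [ "$(node_disk_pressure "$1")" = "False" ]
}

# Usage:
#   disk_pressure_cleared <node-name> || echo "node still reports DiskPressure"
```

Running the check again after you free disk space shows whether Kubernetes has cleared the condition before you retry the installation or upgrade.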
Solution
- Remove unused container images or unnecessary files to free disk space.
- Increase the storage capacity of the node.
Recovery
If workloads remain in the ContainerStatusUnknown state after you recover disk space, reboot the node. After the reboot completes, retry the installation or upgrade.
License is invalid or missing
If the license is invalid or missing, the backend prevents agents from connecting.
When this occurs
- The imported license is invalid.
- The Instana Operator cannot apply the license to the Groundskeeper backend.
How to troubleshoot
- Verify that the Sales Key in the core secret matches the license strings in the unit secret. If they differ, re-download the license using the correct Sales Key.
- Check the Instana Operator logs for license import errors:
  kubectl logs -n instana-operator deployment/instana-operator --tail=100
- Check the Groundskeeper backend component, pod status, and logs:
  kubectl get pods -n instana-core | grep groundskeeper
- If the license still shows an invalid state, contact IBM Support.
Installation fails with “Fatal glibc error: CPU does not support x86‑64‑v2”
Symptom
The installation fails during stanctl install, with an error similar to:
Fatal glibc error: CPU does not support x86-64-v2
This failure typically occurs before all components are installed and can prevent services, such as Cassandra and Kafka, from starting.
Cause
This error indicates that the operating system cannot detect the x86‑64‑v2 instruction set. The most common cause is a virtual machine CPU configuration that does not expose the required CPU flags, often due to legacy compatibility settings.
Resolution
- Avoid legacy CPU compatibility profiles when you configure the virtual machine.
- Make sure that CPU masking is not enabled.
- If Enhanced vMotion Compatibility (EVC) is enabled, verify that it is set to a level that supports x86‑64‑v2 instructions.
- Power off the virtual machine and update the CPU configuration.
- If necessary, recreate the virtual machine with updated CPU settings.
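To see which CPU flags the virtual machine actually exposes, you can compare the flags line in /proc/cpuinfo with the features that make up the x86‑64‑v2 level. The following sketch does that comparison; missing_v2_flags is a hypothetical helper, and the flag names are the spellings that the Linux kernel uses in /proc/cpuinfo.

```shell
# Flags that /proc/cpuinfo reports for the x86-64-v2 feature level
# (SSE3 appears there as "pni").
X86_64_V2_FLAGS="cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3"

# Print each x86-64-v2 flag that is missing from the space-separated
# flag list passed as $1. No output means that all flags are present.
missing_v2_flags() {
  have=" $1 "
  for flag in $X86_64_V2_FLAGS; do
    case $have in
      *" $flag "*) ;;
      *) printf '%s\n' "$flag" ;;
    esac
  done
}

# Usage on the affected host:
#   flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   missing_v2_flags "$flags"
```

Any flag that the function prints points to a feature that the hypervisor CPU profile is hiding, which helps narrow down which compatibility setting to change.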
Contact support
If you are unable to resolve the issue, contact IBM Support. Provide the archive file that you created to the support team.