Troubleshooting the Instana Distribution of OpenTelemetry Collector
See the solutions for common issues that you might encounter when using the Instana Distribution of OpenTelemetry Collector.
Collector fails to connect
The Collector fails to connect to the Instana backend.
Solution: Verify your network configuration and ensure that the collector can reach the Instana backend. Review firewall rules, proxy settings, and port configurations and verify that the correct Instana key is in use.
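When checking the key, it can help to look at the exporter section of the collector configuration, which typically carries the backend endpoint and the Instana agent key as a header. The following sketch uses placeholder values; substitute your own endpoint and key:

```yaml
# Sketch of an exporter section; the endpoint URL and key are placeholders.
exporters:
  otlp:
    endpoint: "https://otlp-example.instana.io:4317"   # hypothetical backend OTLP endpoint
    headers:
      x-instana-key: "<your-agent-key>"                # verify this matches your Instana key
```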
Collector fails to connect in Red Hat OpenShift platform with Instana Self-Hosted Custom Edition
If you are using the Red Hat OpenShift platform with Instana Self-Hosted Custom Edition and encounter connection errors or OTLP endpoint failures (as indicated in the otlp-collector pod logs), you must manually create the routes on Red Hat OpenShift. For more information on route creation, see Configuring endpoints on the Instana backend on Red Hat OpenShift. Route creation is not done automatically.
Collector fails to start
The Collector fails to start due to configuration errors.
Solution: Validate your configuration file by using the --config-check flag before starting the collector. For example, ./otelcol --config config.yaml --config-check
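A common source of startup failures is an incomplete pipeline definition. As a baseline for comparison, a minimal configuration that a check should accept looks roughly like the following sketch (receiver and exporter choices here are illustrative, not required):

```yaml
# Minimal illustrative config.yaml: one OTLP receiver wired to a debug exporter.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  debug: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug]
```

Every component referenced under service.pipelines must also be defined in its own top-level section; a mismatch between the two is a frequent cause of validation errors.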
Collector service fails to start
The Collector service fails to start after installation.
Solution: Check the service logs for errors and verify that the configuration parameters in config.env are correct.
Collector fails to show up in Instana UI
The Collector is running but not appearing in the Instana UI.
Solution: The Instana UI's entities page lists components based on the entity.type resource attribute. If your collector is not visible, verify that the entity.type attribute is correctly configured in your configuration file. Ensure that the collector is properly connected to the Instana backend, and check for any authentication or connectivity issues.
Example resource attribute configuration:
telemetry:
  resource:
    entity.type: otel-collector
Supervisor service issues
- Collector restarts repeatedly even though the supervisor service is running.
Solution: Check the supervisor logs for errors and verify that the collector configuration is valid.
- Supervisor service fails to start.
Solution: Verify that the supervisor configuration in config.env is correct and check the system logs for any errors.
Log locations for Linux
- Collector logs: By default, /opt/instana/collector/logs/collector.log.
- Supervisor logs: By default, /opt/instana/collector/logs/supervisor.log.
Collector cannot access system metrics or log files
The Collector cannot access system metrics or log files.
Solution: Ensure that the collector process has the appropriate permissions. You might need to run it with elevated privileges or add its user to the groups that own the files it reads.
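One way to diagnose this is to compare the groups of the collector's runtime user with the group ownership of the files it must read. A rough sketch (group and user names vary by distribution; the usermod example uses a hypothetical user name):

```shell
# List the groups of the current user, then inspect the owners and groups
# of typical system log files to spot a mismatch.
id -nG
ls -l /var/log 2>/dev/null | head -n 5
# If a needed group (for example, adm) is missing from the id output, add
# the collector's user to it. "instana-collector" below is hypothetical:
#   sudo usermod -aG adm instana-collector
```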
Odd Collector behavior
The Collector logs show abnormal telemetry data.
Solution: Restart the collector service by using ./instana_collector_service.sh restart in your installation path to clear any potential issues. If the problem persists, check the collector logs for any anomalies.
Self-signed certificate issues in self-hosted environments
Collector cannot connect to Instana backend due to certificate validation failures in self-hosted environments with self-signed certificates.
Solution: Export the certificate from your Instana server and add it to your system's trusted certificate store:
- Export the PEM file from the Instana server.
- Convert the file to a .crt file if necessary.
- Add the certificate to your system's trusted certificate store. Copy your .crt file to the following location based on your operating system:
  - RHEL, CentOS, or Fedora: Copy the .crt file to the /usr/share/pki/ca-trust-source/anchors/ location, and then run update-ca-trust.
  - Debian or Ubuntu: Copy the .crt file to the /usr/local/share/ca-certificates/ location, and then run update-ca-certificates.
- Restart the collector service.
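The conversion step can be sketched with openssl. In this example a throwaway self-signed certificate stands in for the certificate exported from your Instana server, and the trust-store commands are shown as comments because they require root:

```shell
# Generate a throwaway self-signed cert as a stand-in for the exported PEM.
openssl req -x509 -newkey rsa:2048 -nodes -days 1 \
  -subj "/CN=instana.example" \
  -keyout /tmp/instana-key.pem -out /tmp/instana.pem

# PEM and .crt commonly use the same encoding; re-emitting via openssl
# normalizes the file under the .crt name.
openssl x509 -in /tmp/instana.pem -out /tmp/instana.crt

# Inspect the converted certificate.
openssl x509 -in /tmp/instana.crt -noout -subject

# Then, as root, place it in the trust store and refresh:
#   RHEL/CentOS/Fedora: cp /tmp/instana.crt /usr/share/pki/ca-trust-source/anchors/ && update-ca-trust
#   Debian/Ubuntu:      cp /tmp/instana.crt /usr/local/share/ca-certificates/ && update-ca-certificates
```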
Span status issues: HTTP 4xx status codes marked as errors
Issue: Spans with HTTP 4xx status codes (for example, 400 Bad Request) are marked as errors, but these are expected behavior in your application.
Solution: The OpenTelemetry specification allows instrumentations to set span status more precisely based on context. If you want to filter out specific 4xx responses that are not actual errors in your use case, add the following transform processor block to your collector configuration, and then add the processor to your traces pipeline.
transform/span_parse:
  error_mode: ignore
  trace_statements:
    - context: span
      statements:
        - set(status.code, STATUS_CODE_OK) where attributes["http.status_code"] >= 400 and attributes["http.status_code"] < 500
With this configuration, spans in the specified HTTP status code range are set to OK instead of being marked as errors.
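Wiring the processor into the pipeline can be sketched as follows; the receiver and exporter names are placeholders from a typical configuration, not required values:

```yaml
# Sketch: referencing transform/span_parse in the traces pipeline.
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform/span_parse]
      exporters: [otlp]
```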