Troubleshooting the Instana Distribution of OpenTelemetry Collector

See the solutions for common issues that you might encounter when using the Instana Distribution of OpenTelemetry Collector.

Collector fails to connect

The Collector fails to connect to the Instana backend.

Solution: Verify your network configuration and ensure that the collector can reach the Instana backend. Review firewall rules, proxy settings, and port configurations, and verify that the correct Instana key is in use.
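A quick way to check basic reachability is a TCP probe against the backend's OTLP ports. This is a sketch: the host name below is a placeholder for your Instana backend endpoint, and 4317 (OTLP/gRPC) and 4318 (OTLP/HTTP) are the standard OTLP ports.

```shell
HOST="otlp.example.instana.io"   # placeholder; replace with your Instana backend host
for PORT in 4317 4318; do
  if timeout 3 bash -c "exec 3<>/dev/tcp/${HOST}/${PORT}" 2>/dev/null; then
    echo "${HOST}:${PORT} reachable"
  else
    echo "${HOST}:${PORT} not reachable"
  fi
done
```

If a port reports not reachable, check firewall and proxy rules for that host and port before you look at collector-side configuration.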

Collector fails to connect in Red Hat OpenShift platform with Instana Self-Hosted Custom Edition

If you use the Red Hat OpenShift platform with Instana Self-Hosted Custom Edition and encounter connection errors or OTLP endpoint failures (as indicated in the otlp-collector pod logs), you must create the routes on Red Hat OpenShift manually, because route creation is not done automatically. For more information about route creation, see Configuring endpoints on the Instana backend on Red Hat OpenShift.

Collector fails to start

The Collector fails to start due to configuration errors.

Solution: Validate your configuration file by using the --config-check flag before you start the collector. For example:

  ./otelcol --config config.yaml --config-check

Collector service fails to start

The Collector service fails to start after installation.

Solution: Check the service logs for errors and verify that the configuration parameters in config.env are correct.

Collector fails to show up in Instana UI

The Collector is running but not appearing in the Instana UI.

Solution: The Instana UI's entities page lists components based on the entity.type resource attribute. If your collector is not visible, verify that the entity.type attribute is correctly configured in your configuration file. Ensure that the collector is properly connected to the Instana backend, and check for any authentication or connectivity issues.

Example resource attribute configuration:

  service:
    telemetry:
      resource:
        entity.type: otel-collector

Supervisor service issues

  • Collector restarts repeatedly even though the supervisor service is running.

Solution: Check the supervisor logs for errors and verify that the collector configuration is valid.

  • Supervisor service fails to start.

Solution: Verify that the supervisor configuration in config.env is correct and check the system logs for any errors.

Log locations for Linux

  • Collector logs: By default, /opt/instana/collector/logs/collector.log.

  • Supervisor logs: By default, /opt/instana/collector/logs/supervisor.log.
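To inspect both logs quickly, assuming the default installation path (adjust LOG_DIR if you installed to a custom location):

```shell
LOG_DIR="/opt/instana/collector/logs"   # default install path
for f in collector.log supervisor.log; do
  if [ -f "${LOG_DIR}/${f}" ]; then
    tail -n 50 "${LOG_DIR}/${f}"
  else
    echo "not found: ${LOG_DIR}/${f}"
  fi
done
```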

Collector cannot access system metrics or log files

The Collector cannot access system metrics or log files.

Solution: Ensure that the collector process has the appropriate permissions. You might need to run it with elevated privileges or add its user to the groups that own the files it reads.
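As a quick check, you can test whether a given file is readable by the user that runs the collector. This is a sketch: the user name otelcol and the file path are examples, not names defined by the product.

```shell
FILE="/var/log/syslog"     # example file the collector might need to read
COLLECTOR_USER="otelcol"   # example; substitute the user that runs the collector
if sudo -u "${COLLECTOR_USER}" test -r "${FILE}" 2>/dev/null; then
  echo "readable by ${COLLECTOR_USER}"
else
  echo "not readable; add ${COLLECTOR_USER} to the file's owning group or run with elevated privileges"
fi
```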

Odd Collector behavior

The Collector logs show abnormal telemetry data.

Solution: Restart the collector service by running ./instana_collector_service.sh restart from your installation path to clear any potential issues. If the problem persists, check the collector logs for any anomalies.

Self-signed certificate issues in self-hosted environments

The Collector cannot connect to the Instana backend because of certificate validation failures in self-hosted environments that use self-signed certificates.

Solution: Export the certificate from your Instana server and add it to your system's trusted certificate store:

  1. Export the PEM file from the Instana server.
  2. Convert the file to a .crt file if necessary.
  3. Add the certificate to your system's trusted certificate store. Copy your .crt file to the following location based on your operating system, and then refresh the trust store:
    • RHEL, CentOS, or Fedora: Copy the .crt file to /usr/share/pki/ca-trust-source/anchors/, and then run update-ca-trust.
    • Debian or Ubuntu: Copy the .crt file to /usr/local/share/ca-certificates/, and then run update-ca-certificates.
  4. Restart the collector service.
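The steps above can be sketched as follows for a Debian or Ubuntu system. The host name is a placeholder and the certificate file name is an example; substitute your own values.

```shell
HOST="instana.example.com"   # placeholder: your Instana backend host
# Steps 1-2: fetch the server certificate and save it in PEM format as a .crt file.
openssl s_client -connect "${HOST}:443" -showcerts </dev/null 2>/dev/null \
  | openssl x509 -outform PEM > instana-backend.crt
# Step 3: add it to the system trust store (Debian/Ubuntu paths) and refresh.
sudo cp instana-backend.crt /usr/local/share/ca-certificates/
sudo update-ca-certificates
# Step 4: restart the collector service from the installation path.
sudo /opt/instana/collector/instana_collector_service.sh restart
```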

Span status issues: HTTP 4xx status codes marked as errors

Issue: Spans with HTTP 4xx status codes (for example, 400 Bad Request) are marked as errors, but these are expected behavior in your application.

Solution: The OpenTelemetry specification allows instrumentations to set the span status more precisely based on context. If you want to stop specific 4xx responses from being marked as errors in your use case, add the following transform processor block to your collector configuration, and then reference it in your traces pipeline.

  transform/span_parse:
    error_mode: ignore
    trace_statements:
      - context: span
        statements:
          - set(status.code, STATUS_CODE_OK) where attributes["http.status_code"] >= 400 and attributes["http.status_code"] < 500

With this configuration, the span status is set to OK for all responses with HTTP status codes in the 400-499 range. Adjust the where clause if you want to target only specific status codes.
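The processor takes effect only after it is referenced in a traces pipeline. A sketch of the pipeline wiring, assuming an otlp receiver and an otlp exporter are already defined in your configuration (the receiver and exporter names are illustrative; keep the ones your configuration uses):

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform/span_parse]
      exporters: [otlp]
```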