Best practices for Instana OpenTelemetry logging

To apply general log ingestion for the OpenTelemetry (OTEL) Collector, use the following best practices based on known capabilities and limitations.

All the best practices and recommendations are tested by using the OpenTelemetry Collector contrib package. Because open source projects change quickly, always use the latest stable release for the newest features and bug fixes.

Application logs

The value of any log monitoring and analytics platform depends on the quality of the logs that are ingested. Therefore, the structure of logs can heavily affect the usefulness of log messages in diagnosing issues.

Use well-defined logging semantics that provide a consistent format for timestamps, log levels, error codes, and contextual information. This approach provides deeper insight into application behavior and supports root cause analysis.

Standard logging facilities can provide structured logging. Examples include the various Java logging frameworks that use the Simple Logging Facade for Java (SLF4J) API and the slog package in Go, which supports key-value structured logging.
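As an illustration of key-value structured logging, the following minimal Go sketch emits one JSON record with the standard log/slog package and then reads a field back out of it. The message text and the field names order_id, error_code, and attempt are hypothetical and chosen only for this example.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log/slog"
)

// logPaymentFailure emits one structured record and returns it as a JSON line.
// The field names order_id, error_code, and attempt are hypothetical.
func logPaymentFailure() string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewJSONHandler(&buf, nil))
	logger.Error("payment failed",
		"order_id", "A-1001",
		"error_code", "E042",
		"attempt", 3,
	)
	return buf.String()
}

// fieldOf parses a JSON log line and returns the named field as a string.
// It returns "" if the line is not valid JSON or the field is absent.
func fieldOf(line, key string) string {
	var rec map[string]any
	if err := json.Unmarshal([]byte(line), &rec); err != nil {
		return ""
	}
	v, ok := rec[key]
	if !ok {
		return ""
	}
	return fmt.Sprint(v)
}

func main() {
	line := logPaymentFailure()
	fmt.Print(line)
	fmt.Println("level:", fieldOf(line, "level"), "error_code:", fieldOf(line, "error_code"))
}
```

Because every record has the same shape, a downstream tool can extract the level or error code by key instead of parsing free-form text.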

Log only essential information and avoid generating excessive logs that add no value to applications. Too many logs can lead to performance impacts, increase memory and network pressure, and cause unnecessary retention and analysis of log data by Instana.

OpenTelemetry Collector log collection

Before you install and configure the OTEL Collector, get familiar with the OTEL Collector documentation that is published by the OTEL community.

Once you understand OpenTelemetry logging, get familiar with the built-in receiver, processor, and exporter plug-ins, and explore the diverse third-party ones. Each plug-in has its own best practices and usage guidelines that describe its capabilities and limitations.

Attribute extraction

The OTEL Collector supports general log attributes that capture basic log-related information such as log.file.name and log.iostream.

To enrich log contexts, add Resource and Log Record attributes. These attributes belong to one of two contexts:

  • Resource Log context: contains metadata about the log, such as the hostname.
  • Log Record context: contains specific data for each log entry, such as the log file path.

Use the resource and attributes processors to add hardcoded fields. Use the transform processor to set dynamic attributes with variable values through a configuration-based approach.

If you are not sure whether a particular attribute needs to be added or modified, leave it unchanged. Unnecessary attributes increase the processing time for each log message in the OTEL Collector.

Log message correlation

When a payload carries sufficient contextual resource and log record attributes, Instana uses them to correlate log messages with the entities that emit them and links the two. Instana follows a best-effort strategy to build this linkage from the attributes that the payload provides. The most direct linkage uses the Process ID (PID) and the least direct linkage uses the host machine.

Configure the OTEL Collector to send log data directly to the Instana agent that is running on the same host as the OTEL Collector. This setup automatically sets the host of the Instana agent as the host where the monitored application is located, which provides the least direct linkage for the logs.

Do not configure OTEL Collectors from multiple hosts to send logs to the same Instana agent. Doing so can cause incorrect host attribution for the logs. As a result, Instana cannot correlate log messages as expected.

Container and cluster correlation

For containerized applications, entity linkage is directly supported through Kubernetes attributes that the k8sattributes processor captures, such as container.id. The container.id attribute is used to match logs with Container entities that are monitored by Instana's native DockerLog and ContainerD sensors. Similarly, the k8s.pod.uid attribute is used to link logs to their corresponding Kubernetes pod entities that are registered by the Kubernetes sensor. Capturing the PID is not useful in containerized applications: each container gets its own PID namespace, and within that namespace, processes are assigned PIDs starting from 1, so a PID does not identify a process uniquely on the host. As a result, container.id provides the most direct linkage for containerized applications.

For more information about clustered environments, see the Cluster monitoring documentation.

Instana supports the following container or cluster correlation fields that can be captured as attributes:

  • container.id
  • k8s.pod.uid
  • k8s.job.uid
  • k8s.cronjob.uid
  • k8s.node.uid
  • service.instance.id

For AWS deployments, the following attributes can be captured:

  • aws.ecs.container.arn
  • faas.id

Exporting log data

An example OTEL logging payload is available in the official OTEL documentation, where attributes are divided into two separate categories: Resource attributes and Log Record attributes. For more information, see the Attribute extraction section.

Payloads of this kind are generated through HTTP and gRPC exporters. The Instana agent and Instana backend (OTLP-Acceptor) support these input types through the otlphttp (HTTP) and otlp (gRPC) exporters. Sending logs directly to the Instana agent improves correlation of log messages with their emitting entities, such as containers, pods, and hosts. For more information, see the Container and cluster correlation section.

Filtering log payloads

Applications can accidentally expose sensitive data through logs, such as Personally Identifiable Information (PII), credit card numbers, and credentials. To keep sensitive information out of the logging payloads that are sent to Instana, use the filter processor. Place this processor early in the OTEL logging pipeline configuration so that other pipeline processors do not unnecessarily process log messages that are dropped and never reach Instana.
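The effect of dropping sensitive records before any other processing can be sketched in a few lines of Go. This is only an illustration of the idea, not the filter processor's implementation or configuration syntax. The cardPattern expression is a deliberately simple, hypothetical pattern for 16-digit card numbers, and the sample records are made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// cardPattern is a hypothetical, deliberately simple pattern for 16-digit
// card numbers written with optional spaces or dashes between groups.
// Real sensitive-data detection needs more careful rules.
var cardPattern = regexp.MustCompile(`\b(?:\d{4}[ -]?){3}\d{4}\b`)

// dropSensitive returns only the records that do not match cardPattern.
// Later pipeline stages then never see the dropped records.
func dropSensitive(records []string) []string {
	kept := make([]string, 0, len(records))
	for _, r := range records {
		if !cardPattern.MatchString(r) {
			kept = append(kept, r)
		}
	}
	return kept
}

func main() {
	records := []string{
		"user login ok",
		"card=4111 1111 1111 1111 declined",
		"order 1234 shipped",
	}
	for _, r := range dropSensitive(records) {
		fmt.Println(r)
	}
}
```

Running the drop step first means that every later step handles two records instead of three, which is the reason for placing the filter early.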

Log message batching

The batch processor queues log messages and sends them in bulk to Instana. Use the batch processor to reduce network load: it reduces the number of outgoing connections that are necessary to transmit data and supports better compression of the outgoing data. Place this processor at the end of the logging processor pipeline. This placement helps ensure that processors that modify log record contents, filter, or sample run before the batch payload is constructed.

The performance of the batch processor depends on the configured batch size. If the batch size is too low, it increases the number of outgoing requests and adds to network traffic. Conversely, if the batch size is too high, it increases memory pressure and risks out-of-memory (OOM) errors.
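The tradeoff can be made concrete with some simple arithmetic. The following Go sketch computes the number of export requests and the approximate size of one full batch for a few batch sizes. The record volume, average record size, and batch sizes are hypothetical values, and the estimate ignores protocol overhead and compression.

```go
package main

import "fmt"

// requestsNeeded returns how many export requests are needed to send
// `records` log records when each request carries at most `batchSize` records.
func requestsNeeded(records, batchSize int) int {
	if records <= 0 || batchSize <= 0 {
		return 0
	}
	return (records + batchSize - 1) / batchSize // ceiling division
}

// bufferedBytes estimates how much log data one full batch holds in memory
// before it is sent, ignoring per-record overhead.
func bufferedBytes(batchSize, avgRecordBytes int) int {
	return batchSize * avgRecordBytes
}

func main() {
	const records = 1_000_000  // hypothetical number of log records
	const avgRecordBytes = 500 // hypothetical average record size
	for _, size := range []int{100, 8192, 100_000} {
		fmt.Printf("batch=%d requests=%d buffered=%d bytes\n",
			size, requestsNeeded(records, size), bufferedBytes(size, avgRecordBytes))
	}
}
```

Small batches multiply the request count, while large batches multiply the amount of data that is held in memory per batch.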

The default batch processor configuration works well for many client-side scenarios. However, familiarize yourself with the configuration options so that you can account for scenario-specific memory and network usage, especially in high-throughput scenarios. You can also use the exporterhelper plug-in to fine-tune the performance of your exporters.

Reducing unnecessary log capture

If your application generates redundant or low-value logs, use the filter processor to remove them before they reach Instana. You can remove entire categories of logs completely or use the probabilisticsampler processor to capture a sample of the generated logs. This action reduces network load, improves ingested log quality by reducing noise, and saves on processing and storage costs.
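The general idea of percentage-based sampling can be sketched as follows. This Go example hashes each record and keeps it only when the hash falls below a threshold, so the same record always gets the same decision. It illustrates the concept only and is not the probabilisticsampler processor's implementation. The function names and the sample log text are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// keep reports whether a log record is kept at the given sampling
// percentage (0-100). Hashing the record makes the decision deterministic.
func keep(record string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(record)) // Write on an FNV hash never returns an error
	return h.Sum32()%100 < percent
}

// sampleCount generates n hypothetical low-value log lines and returns
// how many of them are kept at the given percentage.
func sampleCount(n int, percent uint32) int {
	kept := 0
	for i := 0; i < n; i++ {
		if keep(fmt.Sprintf("debug: cache refresh %d", i), percent) {
			kept++
		}
	}
	return kept
}

func main() {
	fmt.Println("kept at 10%:", sampleCount(10000, 10), "of 10000")
}
```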

Data compression

Consider payload compression based on your log collection scenario. Not all log collection scenarios benefit significantly from compression.

If your OTEL Collector is CPU-bound (CPU is the limiting factor rather than disk, memory, or network) and runs on a fast network, disable data compression for improved performance. In such cases, compression provides limited benefits. For slower networks, enable compression to reduce the amount of data that must be transmitted. Smaller log batches give compression algorithms fewer chances to identify patterns. As a result, the compression ratio drops relative to larger batch payloads and CPU usage increases, because the compression algorithm works harder but finds less compressible structure.
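The effect of batch size on compression can be observed with a small experiment. The following Go sketch compresses the same set of log lines with gzip, once in many small payloads and once in a single large payload, and compares the total compressed size. The log line format and the batch sizes are hypothetical, and the sketch measures output size only, not CPU time.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipSize returns the gzip-compressed size of data in bytes.
func gzipSize(data []byte) int {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data) // writes to a bytes.Buffer cannot fail
	zw.Close()
	return buf.Len()
}

// compressedTotal splits lines into batches of batchSize lines, compresses
// each batch separately, and returns the total compressed size in bytes.
func compressedTotal(lines []string, batchSize int) int {
	if batchSize <= 0 {
		return 0
	}
	total := 0
	for start := 0; start < len(lines); start += batchSize {
		end := start + batchSize
		if end > len(lines) {
			end = len(lines)
		}
		var payload bytes.Buffer
		for _, l := range lines[start:end] {
			payload.WriteString(l)
			payload.WriteByte('\n')
		}
		total += gzipSize(payload.Bytes())
	}
	return total
}

// sampleLines builds n hypothetical log lines that share a common structure.
func sampleLines(n int) []string {
	lines := make([]string, n)
	for i := range lines {
		lines[i] = fmt.Sprintf("2024-01-01T00:00:%02dZ INFO request handled status=200 id=%d", i%60, i)
	}
	return lines
}

func main() {
	lines := sampleLines(1000)
	for _, size := range []int{1, 10, 1000} {
		fmt.Printf("batch=%d total compressed bytes=%d\n", size, compressedTotal(lines, size))
	}
}
```

Each separately compressed payload carries its own gzip header and starts without any previously seen patterns, which is why many small payloads add up to more bytes than one large payload.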

The compression ratio depends on the log data entropy, which is a measure of the amount of information or randomness in the data. Logs with lower entropy contain more repetition and compress better. Logs with higher entropy contain fewer patterns and compress less effectively.
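The relationship between entropy and compression ratio can be checked with a short experiment. The following Go sketch compresses a highly repetitive payload and a pseudo-random payload of the same size with gzip and reports the ratio of original size to compressed size. The payload contents and sizes are hypothetical, and pseudo-random bytes stand in for high-entropy log data.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"math/rand"
)

// gzipRatio returns original size divided by gzip-compressed size.
// A larger value means better compression.
func gzipRatio(data []byte) float64 {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data) // writes to a bytes.Buffer cannot fail
	zw.Close()
	return float64(len(data)) / float64(buf.Len())
}

// repetitiveLogs returns n copies of the same hypothetical log line (low entropy).
func repetitiveLogs(n int) []byte {
	return bytes.Repeat([]byte("INFO request handled status=200\n"), n)
}

// randomBytes returns n pseudo-random bytes from a fixed seed (high entropy).
func randomBytes(n int) []byte {
	b := make([]byte, n)
	rand.New(rand.NewSource(1)).Read(b)
	return b
}

func main() {
	low := repetitiveLogs(1000)
	high := randomBytes(len(low))
	fmt.Printf("low entropy ratio:  %.1f\n", gzipRatio(low))
	fmt.Printf("high entropy ratio: %.2f\n", gzipRatio(high))
}
```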

For more information about compression options, see HTTP payload compression and gRPC payload compression depending on the preferred data transmission method.

For sending OTEL payloads from the OTEL Collector to Instana, only the gzip compression algorithm is supported.

Data encryption

Always enable encryption in production environments. If the OTEL Collector sends data to a local Instana agent, follow these steps to enable TLS-encrypted communications between the Instana agent and backend.

Memory usage

The memory limiter processor sets the memory usage boundaries for the OTEL Collector. Follow the best practices and make sure to add this processor as the first processor in the pipeline. Also, do not use this processor as a replacement for properly sizing and configuring the OTEL Collector's resource footprint.