Filtering OpenTelemetry logs

The OpenTelemetry Collector allows the log data that it collects to be filtered in many different ways. This document provides examples of how to filter logs in certain common scenarios. These features are included only in the OpenTelemetry Collector Contrib distribution, so these examples require that distribution. For more information, see the OpenTelemetry Collector's contrib documentation.

Filtering logs by their contents

The filter processor accepts regular expressions that are applied to the contents of log messages. Any log message that matches the given regular expression is dropped and is never forwarded to the backend that the collector exports to.

Examples

All these examples follow a similar pattern. A filter section must be added to the opentelemetry-collector's configuration file, and that filter must be included in the logs/processors pipeline in the same file. If you are using Helm to install the collector, then the configuration goes in the config: section of the values.yaml file on installation.

Excluding log messages that contain a particular substring

Consider a log file that contains a timestamp, a log level, and a message, such as this example:

2024-02-09 13:00:51 ERROR This is a test error message
2024-02-09 13:02:49 ERROR This is a test error message containing a secret

These logs can be parsed by using a filelog receiver configuration like this:

receivers:
  filelog/simple:
    include: [ /tmp/foo.log ]
    operators:
      - type: regex_parser
        regex: '^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$'
        timestamp:
          parse_from: attributes.time
          layout: '%Y-%m-%d %H:%M:%S'
        severity:
          parse_from: attributes.sev

The given regular expression assigns the timestamp and the severity to the named capture groups. The timestamp layout is defined so that the collector understands the format, which allows the correct timestamp from the log message to be used as the timestamp of the record the collector sends to the server. The severity is sent through unchanged.
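
To check the pattern before deploying it, the following minimal Python sketch applies the same regular expression and timestamp layout to one of the sample lines. It assumes that Python's re module and datetime.strptime behave like the collector's regex engine and strptime-style layout for this simple pattern. The helper name parse_line is hypothetical and is not part of the collector.

```python
import re
from datetime import datetime

# Same pattern as in the filelog/simple receiver configuration.
LINE_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<sev>[A-Z]*) (?P<msg>.*)$"
)


def parse_line(line):
    """Hypothetical helper: return (timestamp, severity, message), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%Y-%m-%d %H:%M:%S")
    return timestamp, match.group("sev"), match.group("msg")


sample = "2024-02-09 13:00:51 ERROR This is a test error message"
print(parse_line(sample))
```

A line that does not start with a timestamp in this format returns None in the sketch, which is a quick way to spot log lines that the receiver's pattern would not parse.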

This receiver must be added to the logs/receivers pipeline:

  pipelines:
    logs:
      receivers:
        - filelog/simple

This configuration reports all correctly formatted messages from the log back to the server.

However, one of the log messages contains a "secret". To exclude any messages that contain secrets, add a filter to the processors section of the config. The filter must contain a regular expression, which matches any string that includes the word "secret". For example:

processors:
  filter/remove_secret:
    error_mode: ignore
    logs:
      log_record:
        - 'IsMatch(body, ".*secret.*")'

Add this filter to the processors pipeline:

  pipelines:
    logs:
      receivers:
        - filelog/simple
      processors:
        - filter/remove_secret

With this configuration in place, only the first message from the example log lines is reported. The second one, which contains the word "secret", is dropped.
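
The effect of the filter can be previewed with a short Python sketch. It is only an approximation: it assumes that an unanchored search with Python's re module matches the same lines as the IsMatch condition, and the helper name keep_record is hypothetical.

```python
import re

# Pattern from the filter/remove_secret processor.
SECRET_RE = re.compile(r".*secret.*")


def keep_record(body):
    """Hypothetical helper: True if the record would be kept, False if it would be dropped."""
    return SECRET_RE.search(body) is None


lines = [
    "2024-02-09 13:00:51 ERROR This is a test error message",
    "2024-02-09 13:02:49 ERROR This is a test error message containing a secret",
]
kept = [line for line in lines if keep_record(line)]
print(kept)
```

The pattern is case-sensitive, so in this sketch a message that contains only "SECRET" in uppercase is kept. If such messages must also be dropped, the pattern needs to allow for both cases.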

Excluding syslog messages from a particular service

Consider a syslog on a Linux® system. If someone uses the GNOME desktop on this system, it can create noisy logs:

Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 31 with keysym 31 (keycode a).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 32 with keysym 32 (keycode b).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 33 with keysym 33 (keycode c).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 34 with keysym 34 (keycode d).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 35 with keysym 35 (keycode e).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 36 with keysym 36 (keycode f).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 37 with keysym 37 (keycode 10).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 38 with keysym 38 (keycode 11).
Feb  9 09:10:00 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: Window manager warning: Overwriting existing binding of keysym 39 with keysym 39 (keycode 12).
Feb  9 09:10:26 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b systemd[1]: fprintd.service: Succeeded.

As with the first example, the filelog receiver can monitor this log and map the timestamp by using a regular expression:

receivers:
  filelog/syslog:
    include: [ /var/log/syslog ]
    operators:
      - type: regex_parser
        regex: '^(?P<time>[A-Za-z]{3}[ ]+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<msg>.*)$'
        timestamp:
          parse_from: attributes.time
          layout: '%b %e %H:%M:%S'

The timestamp looks different from the first example, but with the right regular expression and layout definition, it can be understood. This log contains no severity. The receiver, as always, must be included in the logs/receivers pipeline:

  pipelines:
    logs:
      receivers:
        - filelog/simple
        - filelog/syslog
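
The syslog pattern can be checked in the same way. The following Python sketch, which assumes that Python's re module is a reasonable stand-in for the collector's regex engine, shows how one of the sample lines is split into the time and msg capture groups. Note the two spaces before the single-digit day, which the [ ]+ part of the pattern absorbs.

```python
import re

# Same pattern as in the filelog/syslog receiver configuration.
SYSLOG_RE = re.compile(
    r"^(?P<time>[A-Za-z]{3}[ ]+\d{1,2} \d{2}:\d{2}:\d{2}) (?P<msg>.*)$"
)

line = (
    "Feb  9 09:10:26 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b "
    "systemd[1]: fprintd.service: Succeeded."
)
match = SYSLOG_RE.match(line)
time_text = match.group("time")
msg_text = match.group("msg")
print(time_text)
print(msg_text)
```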

On a server system, one might want to monitor the system processes but exclude logs that the desktop environment generates. A regex filter can block all messages from a particular service:

processors:
  filter/remove_gnomeshell:
    error_mode: ignore
    logs:
      log_record:
        # message body will have the timestamp stripped off by the regex_parser, so it looks like:
        #  "hostname service[pid]: message"
        # Regex matches this as:
        #   [^ ]+                       <hostname> One or more characters that are not spaces, followed by
        #                               A space, followed by
        #   org.gnome.Shell.desktop     <service name> followed by
        #   [                           followed by
        #   [0-9]+                      <pid> one or more numeric digits, followed by
        #   ]:                          followed by
        #   .*                          <message> the rest of the log message
        - 'IsMatch(body, "[^ ]+ org.gnome.Shell.desktop\\[[0-9]+\\]:.*")'

Add the filter to the logs/processors pipeline:

    logs:
      receivers:
        - filelog/simple
        - filelog/syslog
      processors:
        - filter/remove_secret
        - filter/remove_gnomeshell

Now, any log records where the service field is org.gnome.Shell.desktop are dropped.
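
As a rough check of the pattern, the following Python sketch applies it to two message bodies taken from the sample syslog, with the timestamp already removed. It assumes that Python's re module approximates the collector's regex engine for this pattern, and the helper name is_dropped is hypothetical.

```python
import re

# Pattern from the filter/remove_gnomeshell processor, with the doubled
# backslashes of the config written as single backslashes.
GNOME_RE = re.compile(r"[^ ]+ org.gnome.Shell.desktop\[[0-9]+\]:.*")


def is_dropped(body):
    """Hypothetical helper: True if the filter condition matches the body."""
    return GNOME_RE.search(body) is not None


bodies = [
    "li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b org.gnome.Shell.desktop[4771]: "
    "Window manager warning: Overwriting existing binding of keysym 31 with keysym 31 (keycode a).",
    "li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b systemd[1]: fprintd.service: Succeeded.",
]
for body in bodies:
    print(is_dropped(body), body[:60])
```

The dots in the service name are not escaped, so each one matches any single character. In this sketch a hypothetical service named orgXgnomeXShellXdesktop would also be dropped. That is unlikely to matter in practice, but writing the dots as \\. in the configuration makes the match exact.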

Filtering logs by infrastructure data

The filter processor can also filter based on information supplied in the "resource attributes" section of the payload. This capability makes it possible to exclude all messages from, for example, a particular Kubernetes pod, an entire Kubernetes namespace, or a particular host. Some examples of resource attributes include:

  • Kubernetes
    • k8s.pod.name: The name of the Kubernetes pod that generated the log message
    • k8s.container.name: The name of the underlying container that generated the log message
    • k8s.namespace.name: The name of the namespace that contains the pod that generated the log message
    • k8s-app: The name of the Kubernetes application that generated the log message
    • k8s.deployment.name: The name of the Kubernetes deployment object that controls the pod that generated the log message
    • k8s.node.name: The name of the Kubernetes node where the pod that generated the log message runs
  • Host
    • host.name: The name of the host where the process that generated the log message is running. If the collector is running inside a container, the hostname is typically the container name, and not the name of the actual underlying host.
    • os.type: The operating system type of the host where the process that generated the log message is running

Examples

All these examples follow a similar pattern. A filter section must be added to the opentelemetry-collector's configuration file, and that filter must be included in the logs/processors pipeline in the same file. If you are using Helm to install the collector, then the configuration goes in the config: section of the values.yaml file on installation.

Excluding log messages from a particular Kubernetes container

Suppose you want to filter out all log messages from a Kubernetes container called calico-node. In the processors section of the config, add a block like this:

    filter/remove_calico:
      error_mode: ignore
      logs:
        log_record:
          - resource.attributes["k8s.container.name"] == "calico-node"

Then in the service/pipelines/logs/processors section, include the new filter:

        processors:
          - resourcedetection
          - transform/severity_parse
          - batch
          - filter/remove_calico

The new filter is listed last. The other processors that are shown are just for context, and are not needed for the filter processor to work.

Excluding log messages from all Linux® systems

If you want to block all log messages from any Linux®-based system, then in the processors section of the config you can add:

    filter/remove_linux:
      error_mode: ignore
      logs:
        log_record:
          - resource.attributes["os.type"] == "linux"

Likewise, in the service/pipelines/logs/processors section, building on the previous example, include the new filter:

        processors:
          - resourcedetection
          - transform/severity_parse
          - batch
          - filter/remove_calico
          - filter/remove_linux
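
The combined effect of the two filters can be modeled with a few lines of Python. This is only an illustrative sketch: resource attributes are represented as plain dictionaries, the sample records and the helper name is_dropped are hypothetical, and the real evaluation is done by the collector.

```python
def is_dropped(resource_attributes):
    """Hypothetical model: a record is dropped if either filter condition is true."""
    return (
        resource_attributes.get("k8s.container.name") == "calico-node"
        or resource_attributes.get("os.type") == "linux"
    )


# Hypothetical resource attributes for three log records.
records = [
    {"k8s.container.name": "calico-node", "os.type": "linux"},
    {"k8s.container.name": "web", "os.type": "windows"},
    {"host.name": "db01"},
]
kept = [record for record in records if not is_dropped(record)]
print(kept)
```

In this model, a record that does not carry the attribute at all is kept, because the comparison with a missing value is false. Because the two filters run one after the other in the pipeline, a record is dropped as soon as either condition matches.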

Redacting sensitive information from logs

Log messages sometimes contain Personally Identifiable Information (PII) or other sensitive data that must be kept private and must not be sent to the server or saved. This information might include passwords, credit card numbers, or similar data. The transform processor can detect such information by using a regular expression and replace it with something else.

Examples

These examples follow a similar pattern. A transform section must be added to the opentelemetry-collector's configuration file, and that transform must be included in the logs/processors pipeline in the same file. If you are using Helm to install the collector, then the configuration goes in the config: section of the values.yaml file on installation.

Removing passwords from log messages

Suppose that an application logs the supplied password whenever an authentication failure occurs. The log record might look like this:

2024-02-14 19:40:31 WARNING failed login for user bob, password=bobo

It is important to know that a failed login attempt occurred, but it is inappropriate to log the password that was used.

The application logs the password with a known format. The pattern is always password=xyz. A regular expression can detect that pattern and replace it with something else:

  transform/redact_password:
    log_statements:
      - context: log
        statements:
          # Any log messages containing "password=xxx" or "passwd=yyy" will be matched.
          # Regex matches these as:
          #   passw                     The literal string "passw", followed by
          #   (?:or)??                  The literal string "or" 0 or 1 times (this allows either password or passwd), followed by
          #   d=                        The literal string "d=", followed by
          #   [^\s]*                    Any non-whitespace characters (the password).
          # The match stops at the first whitespace character, so any text after the password is left intact.
          - replace_pattern(body, "passw(?:or)??d\\=[^\\s]*", "password=REDACTED")
          - replace_pattern(attributes["msg"], "passw(?:or)??d\\=[^\\s]*", "password=REDACTED")

The replace_pattern function is called once for the message body and once for the msg attribute. The OpenTelemetry Collector puts the contents of the message in both places, so both need to be updated.

Add the transform to the logs/processors pipeline:

    logs:
      receivers:
        - otlp
        - filelog/simple
      processors:
        - transform/redact_password

The transform turns the initial log message into:

2024-02-14 19:40:31 WARNING failed login for user bob, password=REDACTED
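
The redaction can be tried out with a minimal Python sketch. It uses a simplified form of the pattern that stops at the first whitespace character, and it assumes that re.sub in Python behaves like replace_pattern for this case. The helper name redact is hypothetical.

```python
import re

# Simplified form of the redaction pattern: "password=" or "passwd=",
# followed by any run of non-whitespace characters.
PASSWORD_RE = re.compile(r"passw(?:or)??d=\S*")


def redact(text):
    """Hypothetical helper: replace any logged password with a fixed marker."""
    return PASSWORD_RE.sub("password=REDACTED", text)


print(redact("2024-02-14 19:40:31 WARNING failed login for user bob, password=bobo"))
print(redact("passwd=abc retry=2"))
```

Because the pattern stops at the first whitespace character, text that follows the password on the same line, such as the hypothetical retry=2 field, is preserved. Both spellings are normalized to password=REDACTED.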

Removing hostnames from log messages

Suppose that another application writes hostnames into the syslog:

Feb 15 15:52:36 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b service[45568]: This is a test error message from where.ever.ibm.com

If the hostnames are considered confidential, it is possible to remove them from the log records before they are sent. If all the hosts are in the same domain, in this example .ibm.com, then a regular expression can recognize the pattern and obscure the names:

  transform/remove_hostnames:
    log_statements:
      - context: log
        statements:
          # Any log message containing a hostname ending in ".ibm.com" will have the hostname removed.
          # Regex matches as:
          #   ([a-zA-Z0-9-_\.]+)       One or more letters, digits, dashes, underscores, or dots, followed by
          #   \.ibm\.com              The literal string ".ibm.com"
          - replace_pattern(body, "([a-zA-Z0-9-_\\.]+)\\.ibm\\.com", "<hidden hostname>")
          - replace_pattern(attributes["msg"], "([a-zA-Z0-9-_\\.]+)\\.ibm\\.com", "<hidden hostname>")

Again, both the body and the msg attribute are rewritten since the log message text occurs in both places.

And again, add the transform to the pipeline:

    logs:
      receivers:
        - filelog/syslog
      processors:
        - transform/redact_password
        - transform/remove_hostnames

The message is then rewritten as follows:

Feb 15 15:52:36 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b service[45568]: This is a test error message from <hidden hostname>
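
The hostname pattern can be exercised in the same way with a short Python sketch. It assumes that re.sub in Python approximates replace_pattern for this pattern, and the helper name hide_hostnames is hypothetical.

```python
import re

# Pattern from the transform/remove_hostnames processor, with the doubled
# backslashes of the config written as single backslashes.
HOSTNAME_RE = re.compile(r"([a-zA-Z0-9-_\.]+)\.ibm\.com")


def hide_hostnames(text):
    """Hypothetical helper: replace any hostname ending in .ibm.com with a marker."""
    return HOSTNAME_RE.sub("<hidden hostname>", text)


line = (
    "Feb 15 15:52:36 li-8dc514cc-2e0d-11b2-a85c-f1d7ce42b83b service[45568]: "
    "This is a test error message from where.ever.ibm.com"
)
print(hide_hostnames(line))
```

The pattern has no boundary after .ibm.com, so in this sketch a hypothetical name such as x.ibm.community is partly rewritten as well. If names like that can occur in the logs, the pattern needs a stricter ending.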