UDP Source

The UDP Source source reads messages from one or more UDP ports. For information about supported versions, see Supported systems and versions.

To use multiple threads for flow processing, use the UDP Multithreaded Source. For a discussion about the differences between the two sources, see Comparing UDP Source sources.

UDP Source generates a record for every message. UDP Source can process collectd messages, NetFlow 5 and NetFlow 9 messages, and the following types of syslog messages:
  • RFC 5424
  • RFC 3164
  • Non-standard common messages, such as RFC 3339 dates with no version digit
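As a rough illustration of the three syslog shapes listed above, the following Python sketch builds one message of each kind and sends it over UDP. The host, port 9995, field values, and helper names are hypothetical and chosen only for the example. The priority value follows the standard syslog calculation of facility times 8 plus severity.

```python
import socket

def pri(facility: int, severity: int) -> int:
    # Syslog priority value: facility * 8 + severity.
    return facility * 8 + severity

def rfc5424(facility, severity, timestamp, host, app, msg):
    # RFC 5424 layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG
    # PROCID, MSGID, and structured data are left as the nil value "-".
    return f"<{pri(facility, severity)}>1 {timestamp} {host} {app} - - - {msg}"

def rfc3164(facility, severity, timestamp, host, tag, msg):
    # RFC 3164 (BSD syslog) layout: <PRI>Mmm dd hh:mm:ss HOSTNAME TAG: MSG
    return f"<{pri(facility, severity)}>{timestamp} {host} {tag}: {msg}"

def nonstandard(facility, severity, timestamp, host, tag, msg):
    # Non-standard variant: an RFC 3339 date but no version digit after the PRI.
    return f"<{pri(facility, severity)}>{timestamp} {host} {tag}: {msg}"

def send_udp(message: str, host: str = "127.0.0.1", port: int = 9995) -> None:
    # Hypothetical target: a UDP Source assumed to listen on this host and port.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode("utf-8"), (host, port))
```

For example, calling send_udp(rfc5424(4, 2, "2003-10-11T22:14:15.003Z", "host1", "su", "login failed")) would send a single RFC 5424 message with priority 34 to the assumed listener.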

When processing NetFlow messages, the stage generates different records based on the NetFlow version. When processing NetFlow 9 messages, the stage generates records based on the NetFlow 9 configuration properties. For more information, see NetFlow data processing.
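To show how a receiver can tell the NetFlow versions apart, here is a minimal Python sketch. It relies on the fact that both NetFlow 5 and NetFlow 9 packets begin with a 16-bit version field followed by a 16-bit count, in network byte order. The function names are hypothetical, and this is not how the stage itself is implemented.

```python
import struct

def netflow_header(packet: bytes):
    # Both NetFlow 5 and NetFlow 9 headers start with two unsigned 16-bit
    # big-endian fields: the version and a count.
    if len(packet) < 4:
        raise ValueError("packet too short to contain a NetFlow header")
    version, count = struct.unpack("!HH", packet[:4])
    return version, count

def describe(packet: bytes) -> str:
    version, _count = netflow_header(packet)
    if version == 5:
        return "NetFlow 5"
    if version == 9:
        return "NetFlow 9"
    return f"unsupported version {version}"
```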

The source can also read binary or character-based raw data.

When a flow stops, the source notes where it stops reading. When the flow starts again, the source continues processing from where it stopped by default. You can reset the offset to process all requested data.

When you configure UDP Source, you specify the ports to use and the batch size and wait time. When epoll is available, you can specify the number of receiver threads to use to increase the throughput of packets to the flow.

You also specify the data format and configure any related properties.
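The interaction between batch size and wait time can be sketched in Python as follows. This is an illustration under stated assumptions, not the stage's actual implementation: read_batch is a hypothetical helper that returns as soon as the batch is full, or returns a partial or empty batch once the wait time elapses. The defaults of 1000 messages and 1000 ms in the signature are example values only.

```python
import socket
import time

def read_batch(sock, max_batch_size=1000, batch_wait_ms=1000, bufsize=65535):
    # Collect datagrams until the batch is full or the wait time runs out.
    batch = []
    deadline = time.monotonic() + batch_wait_ms / 1000.0
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # wait time elapsed: return a partial or empty batch
        sock.settimeout(remaining)
        try:
            data, _addr = sock.recvfrom(bufsize)
        except socket.timeout:
            break  # nothing arrived before the deadline
        batch.append(data)
    return batch
```

A caller would typically create a socket.SOCK_DGRAM socket, bind it to the listening port, and call read_batch in a loop, passing each returned batch downstream.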

Processing raw data

Use the Raw/Separated Data data format to enable the UDP Source source to generate records from binary or character-based raw data.

When processing raw data, the source can generate one record for each UDP packet that it receives. If you specify a separator character, the source can instead generate multiple records from each UDP packet.

When the separator produces multiple values from a packet, you specify the multiple values behavior: one record with only the first value, one record with all values as a list, or multiple records with one record for each value.
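The three behaviors can be sketched in Python as follows. The function name and the behavior keywords "first", "list", and "multiple" are hypothetical stand-ins for the stage options, and the handling of empty values is an assumption.

```python
from typing import List, Optional

def records_from_packet(payload: bytes,
                        separator: Optional[bytes] = None,
                        behavior: str = "multiple") -> List[object]:
    # Without a separator, the whole packet becomes a single record.
    if not separator:
        return [payload]
    # Assumption: empty values are kept; the real stage may treat them differently.
    values = payload.split(separator)
    if behavior == "first":
        return [values[0]]            # one record with only the first value
    if behavior == "list":
        return [values]               # one record holding all values as a list
    if behavior == "multiple":
        return list(values)           # one record for each value
    raise ValueError(f"unknown behavior: {behavior}")
```

With a line feed separator, a packet containing b"a\nb\nc" yields one record, one list record, or three records depending on the behavior chosen.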

You can optionally specify an output field to use for the data. When not specified, the source writes the raw data to the root field.

You might use the Raw/Separated Data data format to write raw data to a field that you later process using the Data Parser processor. This allows you to retain the raw data for another use.

Receiver threads

Receiver threads are used to pass data from the UDP source system to the source. By default, the source uses a single receiver thread.

You can configure the UDP Source source to use additional receiver threads when Data Collector runs on a machine enabled for epoll. Epoll requires native libraries and is only available when Data Collector runs on recent versions of 64-bit Linux. When you enable multiple receiver threads, you increase the volume of data that can be passed to the source at one time.

To use additional receiver threads, select the Use Native Transports (epoll) property, and then configure Number of Receiver Threads.
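As a planning aid, the following Python sketch estimates how many receiver threads a configuration implies, counting the configured number of threads for each port. The helper names are hypothetical. The platform check is only a rough preflight for 64-bit Linux: it does not verify that the native libraries epoll needs are present. Rejecting more than one thread without epoll is an assumption about how to model the requirement.

```python
import platform
import sys

def looks_like_64bit_linux() -> bool:
    # Rough check only: says nothing about the native libraries epoll needs.
    return platform.system() == "Linux" and sys.maxsize > 2**32

def total_receiver_threads(ports, threads_per_port=1, use_epoll=False) -> int:
    if threads_per_port < 1:
        raise ValueError("threads_per_port must be at least 1")
    if threads_per_port > 1 and not use_epoll:
        # Assumption: additional receiver threads are rejected without epoll.
        raise ValueError("additional receiver threads require epoll")
    # The configured thread count applies to each port.
    return len(ports) * threads_per_port
```

For instance, three ports with two threads per port and epoll enabled gives a total of six receiver threads.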

Configuring a UDP Source

About this task

Configure a UDP Source source to process messages from a UDP source.

Procedure

  1. In the Properties panel, on the General tab, configure the following properties:
    General Property - Description
    Name - Stage name.
    Description - Optional description.
    On Record Error - Error record handling for the stage:
    • Discard - Discards the record.
    • Send to Error - Sends the record to the flow for error handling.
    • Stop Flow - Stops the flow.
  2. On the UDP tab, configure the following properties:
    UDP Property - Description
    Port - Port to listen to for data. Using simple or bulk edit mode, click the Add icon to list additional ports.

    To listen to a port below 1024, Data Collector must be run by a user with root privileges. Otherwise, the operating system does not allow Data Collector to bind to the port.

    Note: No other flow or process can be bound to the listening port. Each listening port can be used by only one flow.
    Data Format - Data format passed by UDP:
    • collectd
    • NetFlow
    • syslog
    • Raw/Separated Data
    Use Native Transports (epoll) - Specifies whether to use multiple receiver threads for each port. Using multiple receiver threads can improve performance.

    Multiple receiver threads require epoll, which requires native libraries and is available only when Data Collector runs on recent versions of 64-bit Linux.

    Number of Receiver Threads - Number of receiver threads to use for each port. For example, if you configure two threads per port and configure the source to use three ports, the source uses a total of six threads.

    Use to increase the number of threads passing data to the source when epoll is available on the Data Collector machine.

    Default is 1.

    Max Batch Size (messages) - Maximum number of messages to include in a batch and pass through the flow at one time. Honors values up to the Data Collector maximum batch size.

    Default is 1000. The Data Collector maximum batch size also defaults to 1000.

    Batch Wait Time (ms) - Milliseconds to wait before sending a partial or empty batch.
  3. On the syslog tab, define the character set for the data.
  4. On the collectd tab, define the following collectd properties:
    collectd Property - Description
    Convert Hi-Res Time & Interval - Converts the collectd high resolution time format interval and timestamp to UNIX time, in milliseconds.
    Exclude Interval - Excludes the interval field from the output record.
    Auth File - Path to an optional authentication file. Use an authentication file to accept signed and encrypted data.
    TypesDB File Path - Path to a user-provided types.db file. Overrides the default types.db file.
    Charset - Character set of the data.
  5. For raw data, on the Raw/Separated Data tab, define the following properties:
    Raw/Separated Data Property - Description
    Raw Data Mode - Type of raw data to process: binary or string data.
    Output Field Path - Optional output field for the raw data. When not used, the source writes the raw data to the root field.
    Multiple Values Behavior - Action to take when the data separator generates multiple values from a UDP packet:
    • First Value Only - Returns one record with the first value.
    • All Values as a List - Returns one record with all values in a List.
    • Split into Multiple Records - Returns multiple records, one record for each value.
    Data Separator - Optional separator used to split the data in a UDP packet into multiple values. Specify byte literals using Java Unicode syntax, \u<character code>.

    For example, the default line feed character is expressed as follows: \u000A.

    Charset - Character set used by string data.
  6. For NetFlow 9 data, on the NetFlow 9 tab, configure the following properties:
    When processing earlier versions of NetFlow data, these properties are ignored.
    NetFlow 9 Property - Description
    Record Generation Mode - Determines the type of values to include in the record. Select one of the following options:
    • Raw Only
    • Interpreted Only
    • Both Raw and Interpreted
    Max Templates in Cache - Maximum number of templates to store in the template cache. For more information about templates, see Caching NetFlow 9 templates.

    Default is -1 for an unlimited cache size.

    Template Cache Timeout (ms) - Maximum number of milliseconds to cache an idle template. Templates unused for more than the specified time are evicted from the cache. For more information about templates, see Caching NetFlow 9 templates.

    Default is -1 for caching templates indefinitely.
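To make the two NetFlow 9 template cache settings concrete, the following Python sketch models a cache with a maximum size and an idle timeout, where -1 disables each limit. It is an illustration under assumptions, not the stage's implementation: the TemplateCache class is hypothetical, and evicting the least recently used template when the cache is full is an assumed policy.

```python
import time
from collections import OrderedDict

class TemplateCache:
    def __init__(self, max_templates=-1, timeout_ms=-1, clock=time.monotonic):
        self.max_templates = max_templates   # -1 means unlimited size
        self.timeout_ms = timeout_ms         # -1 means cache indefinitely
        self._clock = clock
        self._entries = OrderedDict()        # key -> (template, last_used_seconds)

    def _evict_idle(self):
        if self.timeout_ms < 0:
            return
        cutoff = self._clock() - self.timeout_ms / 1000.0
        idle = [k for k, (_, used) in self._entries.items() if used < cutoff]
        for key in idle:
            del self._entries[key]

    def put(self, key, template):
        self._evict_idle()
        self._entries[key] = (template, self._clock())
        self._entries.move_to_end(key)
        # Assumed policy: drop the least recently used entry when over the limit.
        while 0 <= self.max_templates < len(self._entries):
            self._entries.popitem(last=False)

    def get(self, key):
        self._evict_idle()
        entry = self._entries.get(key)
        if entry is None:
            return None
        template, _ = entry
        # Using a template refreshes its idle timer.
        self._entries[key] = (template, self._clock())
        self._entries.move_to_end(key)
        return template
```

In practice a key would need to identify both the exporter and the template ID, for example a tuple of exporter address, source ID, and template ID, because template IDs are only meaningful per exporter.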