IBM Streams 4.2.1

Operator PacketFileSource

PacketFileSource is an operator for the IBM InfoSphere Streams product that reads prerecorded network packets from 'packet capture (PCAP)' files, parses the network headers, and emits tuples containing packet data. The operator may be configured with one or more output ports, and each port may be configured to emit different tuples, as specified by output filters. The tuples may contain the entire packet, the payload portion of the packet, or individual fields from the network headers, as specified by output attribute assignments.

The PacketFileSource operator expects PCAP files to contain complete ethernet packets, starting with the ethernet header, including all protocol-specific headers and the packet payload.

The PacketFileSource operator selects packets to process with input filters, parses individual fields in each packet's network headers, selects packets to emit as output tuples with output filter expressions, and assigns values to the tuples' attributes with output attribute assignment expressions.

Input filters are PCAP filter expressions, as described here:

Output filters and attribute assignments are SPL expressions. They may use any of the built-in SPL functions, and any of these functions, which are specific to the PacketFileSource operator:

The PacketFileSource operator recognizes these types of packet encapsulation, and steps quietly over their headers:

  • Juniper Networks 'jmirror' encapsulation
  • Cisco Systems 'Encapsulated Remote Switch Port Analyzer (ERSPAN)' encapsulation

Files containing complete ethernet packets can be created in PCAP format by a variety of network diagnostic tools, such as the Linux tcpdump command and the Wireshark open-source tools.
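As an illustration of the tcpdump route, the following is a minimal sketch of a shell helper that records packets into a PCAP file. The function name 'record_pcap', the interface name, and the packet count are hypothetical placeholders. The '-s 0' option is used on the assumption that the installed tcpdump treats it as 'do not truncate packets', so that complete ethernet packets are written.

```shell
# Hypothetical helper: record COUNT packets from interface IFACE into a PCAP file.
# Arguments: IFACE OUTFILE [COUNT]   (COUNT defaults to 1000)
record_pcap() {
    iface=$1
    outfile=$2
    count=${3:-1000}
    # -i: capture interface
    # -s 0: request untruncated packets (assumed behavior of the installed tcpdump)
    # -c: stop after COUNT packets
    # -w: write raw packets to OUTFILE in PCAP format
    sudo tcpdump -i "$iface" -s 0 -c "$count" -w "$outfile"
}
```

For example, 'record_pcap eth0 sample.pcap 500' would record 500 packets from a hypothetical interface named 'eth0' into 'sample.pcap'. Capturing normally requires root privileges, which is why the sketch calls 'sudo'.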

The PacketFileSource operator is part of the network toolkit. To use it in an application, include this statement in the SPL source file:

use com.ibm.streamsx.network.source::*;

Dependencies

The PacketFileSource operator depends on the Linux 'packet capture library (libpcap)', which must be installed on the machine where the operator executes. The library is available as an installable RPM package from the 'base' RHEL and CentOS repositories, and can be installed with administrator tools such as 'yum'. Installation requires root privileges, which can be acquired temporarily with administrator tools such as 'sudo'.

To install libpcap, enter this command at a Linux command prompt:

sudo yum install libpcap-devel

Alternatively, you can download the source code for a newer version of libpcap and build the library yourself. The new library can then be installed in system directories, or used where built without being installed.

To do this, download the distribution package for the latest version of libpcap from this address:

To build libpcap from source code, open a 'terminal' window and type this at a command prompt:

cd .../directory
tar -xvf .../libpcap-X.Y.Z.tar.gz
cd .../directory/libpcap-X.Y.Z
./configure
make

To instruct the Streams compiler (that is, the 'sc' command) to use your version of libpcap instead of the system version, set these environment variables before compiling an application that contains the PacketFileSource operator:

export STREAMS_ADAPTERS_LIBPCAP_INCLUDEPATH=.../directory/libpcap-X.Y.Z
export STREAMS_ADAPTERS_LIBPCAP_LIBPATH=.../directory/libpcap-X.Y.Z
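A mistyped directory here would presumably surface only later, as a compile or link error. The following is a minimal sketch of a helper that checks the build directory before exporting the two variables. The function name 'use_local_libpcap' is hypothetical, and the file names it looks for ('pcap.h', 'libpcap.a', 'libpcap.so*') are assumptions about what a libpcap source build leaves in its top-level directory.

```shell
# Hypothetical helper: point the Streams compiler at a locally built libpcap.
# Argument: the libpcap build directory, for example .../directory/libpcap-X.Y.Z
use_local_libpcap() {
    dir=$1
    # The directory should contain the public header (assumed name: pcap.h) ...
    if [ ! -f "$dir/pcap.h" ]; then
        echo "pcap.h not found in $dir" >&2
        return 1
    fi
    # ... and a library produced by 'make' (assumed names: libpcap.a or libpcap.so*).
    if [ ! -f "$dir/libpcap.a" ] && ! ls "$dir"/libpcap.so* >/dev/null 2>&1; then
        echo "libpcap library not found in $dir (was 'make' run?)" >&2
        return 1
    fi
    export STREAMS_ADAPTERS_LIBPCAP_INCLUDEPATH="$dir"
    export STREAMS_ADAPTERS_LIBPCAP_LIBPATH="$dir"
}
```

The helper returns a non-zero status and leaves the environment untouched when either file is missing, so it can be called ahead of 'sc' in a build script.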

For more information on configuring, building, and installing libpcap, refer to its 'INSTALL.txt' file.

This operator has been tested with these versions of libpcap:

  • libpcap 0.9.4, included in RHEL/CentOS 5.x
  • libpcap 0.9.8, included in SLES 11
  • libpcap 1.0.0, included in RHEL/CentOS 6.2
  • libpcap 1.4.0, included in RHEL/CentOS 6.5
  • libpcap 1.5.3, included in RHEL/CentOS 7.1
  • libpcap 1.6.1
  • libpcap 1.6.2
  • libpcap 1.7.4
  • libpcap 1.8.1

Threads

The PacketFileSource operator contains a separate thread for reporting its metrics to the Streams runtime if the metricsInterval parameter is greater than zero.

When the PacketFileSource operator is configured without an input port, it contains another thread which reads the PCAP file specified by the pcapFilename parameter. When the operator reaches end-of-file, it terminates the thread and terminates the operator.

Exceptions

The PacketFileSource operator will throw an exception and terminate in these situations:

  • The pcapFilename parameter does not specify a valid PCAP recording.
  • An input tuple's first attribute is not of type rstring, or does not specify a valid PCAP recording.
  • The inputFilter parameter does not specify a valid PCAP filter expression.
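Since an invalid recording terminates the operator, a quick check of a file's first bytes before submitting a job can catch obviously wrong inputs. The sketch below is only an approximation and is not the operator's own validation, which is performed by libpcap. It tests whether the file begins with one of the classic PCAP magic numbers, and it does not inspect the rest of the file. The helper name 'looks_like_pcap' is hypothetical.

```shell
# Hypothetical helper: succeed if FILE starts with a classic PCAP magic number.
looks_like_pcap() {
    # Read the first four bytes as a string of hex digits, for example "d4c3b2a1".
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    case $magic in
        d4c3b2a1|a1b2c3d4) return 0 ;;   # microsecond timestamps, either byte order
        4d3cb2a1|a1b23c4d) return 0 ;;   # nanosecond timestamps, either byte order
        *) return 1 ;;                   # anything else, including unreadable files
    esac
}
```

A file that passes this check can still be truncated or corrupt further in, so a successful result is a necessary condition rather than a guarantee.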

Sample Applications

The network toolkit includes several sample applications that illustrate how to use this operator.

References

The ethernet frame format is described here:

The ethernet header and the fields it contains are described here:

The IPv4 header and the fields it contains are described here:

The IPv6 header and the fields it contains are described here:

The UDP header and the fields it contains are described here:

The TCP header and the fields it contains are described here:

The Juniper Networks 'jmirror' encapsulation header is described here:

The Cisco Systems 'Encapsulated Remote Switch Port Analyzer (ERSPAN)' encapsulation headers are described here:

The 'Generic Route Encapsulation (GRE)' headers used by Cisco ERSPAN encapsulation are described here:

The Linux tcpdump command is described here:

The Wireshark tools are described here:

libpcap filter expressions, which are used with the inputFilter parameter, are described here:

The result functions that can be used in boolean expressions for the outputFilters parameter and in output attribute assignment expressions are described here:

Summary

Ports
This operator has 1 input port and 1 or more output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 6 parameters.

Optional: initDelay, inputFilter, metricsInterval, outputFilters, pcapFilename, processorAffinity

Metrics
This operator reports 2 metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single threaded execution context.

Input Ports

Ports (0)

The PacketFileSource operator has one optional input port.

When the PacketFileSource operator is configured with an input port, the first attribute must be of type rstring, and specifies the pathname of an input PCAP file for the operator to read.

When the PacketFileSource operator is configured without an input port, the pcapFilename parameter specifies the pathname of a single input PCAP file for the operator to read.

The packets in a PCAP file may optionally be filtered as they are read from the file with the inputFilter parameter. The tuples produced by the operator may optionally be filtered with the outputFilters parameter.

Output Ports

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.
Ports (0...)

The PacketFileSource operator requires one or more output ports.

Each output port produces one output tuple for each packet read from the PCAP file (and passed by the input filter, if the inputFilter parameter is specified), provided that the corresponding expression in the outputFilters parameter evaluates to true or that no outputFilters parameter is specified.

Output attributes can be assigned values with any SPL expression that evaluates to the proper type, and the expressions may include any of the PacketFileSource result functions. Output attributes that match input attributes in name and type are copied automatically.

The PacketFileSource operator emits a punctuation marker on each output port when it reaches the end of each input file.

Parameters

Optional: initDelay, inputFilter, metricsInterval, outputFilters, pcapFilename, processorAffinity

initDelay

This optional parameter takes an expression of type float64 that specifies the number of seconds the operator will wait before it begins to produce tuples.

This parameter is allowed only when the pcapFilename parameter is also specified.

The default value is '0.0'.

inputFilter

This optional parameter takes an expression of type rstring that specifies which input packets should be processed. The value of this string must be a valid PCAP filter expression, as defined here:

The default value is an empty string, which causes all packets read from the PCAP file to be processed.
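Because tcpdump accepts libpcap filter expressions, one way to try out a candidate inputFilter value before compiling an application is to apply it to the PCAP file with tcpdump and inspect which packets it selects. The sketch below assumes tcpdump is installed. The helper name 'preview_input_filter' is hypothetical, and it is an assumption that tcpdump and the operator are built against libpcap versions that interpret the expression identically.

```shell
# Hypothetical helper: list the packets in PCAPFILE that match a filter expression.
# Arguments: PCAPFILE FILTER   (FILTER is passed to tcpdump as a single argument)
preview_input_filter() {
    pcapfile=$1
    filter=$2
    # -n: do not resolve addresses to names
    # -r: read packets from a file instead of a network interface
    tcpdump -n -r "$pcapfile" "$filter"
}
```

For example, "preview_input_filter sample.pcap 'udp port 53'" would print only the packets that a filter of 'udp port 53' selects from a hypothetical file named 'sample.pcap'. Reading from a file does not normally require root privileges.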

metricsInterval

This optional parameter takes an expression of type float64 that specifies the interval, in seconds, for sending operator metrics to the Streams runtime. If the value is zero or less, the operator will not report metrics to the runtime, and the output assignment functions for libpcap statistics will return zero.

The default value is '10.0'.

outputFilters

This optional parameter takes a list of SPL expressions that specify which packets should be emitted by the corresponding output port. The number of expressions in the list must match the number of output ports, and each expression must evaluate to a boolean value. The output filter expressions may include any of the PacketFileSource result functions.

The default value of the outputFilters parameter is an empty list, which causes all packets processed to be emitted by all output ports.

pcapFilename

This parameter takes an expression of type rstring that specifies the pathname of a single input PCAP file for the operator to read.

If the operator is configured without an input port, this parameter is required; if the operator has an input port, this parameter is not allowed.

processorAffinity

This optional parameter takes an expression of type uint32 that specifies which processor core the operator's thread will run on. The maximum value is P-1, where P is the number of processors on the machine where the operator will run.

Where the operator runs on a thread of its own, this parameter applies to the operator's thread. This is the situation when the operator does not have an input port (and the pcapFilename parameter is specified). It is also the situation when the operator has an input port that is configured as a threaded input port, or when the operator has an @parallel annotation.

Where the operator runs on the thread of an upstream operator, this parameter affects the thread of the operator that sends tuples to it. This is the situation when the operator has an input port, and is fused with its upstream operator.

The default is to dispatch the operator's thread on any available processor.
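The upper bound of P-1 can be computed on the machine where the operator will run. The following is a minimal sketch, assuming that 'getconf' is available and that the number of processors currently online is the relevant value of P. The helper name 'max_processor_affinity' is hypothetical.

```shell
# Hypothetical helper: print the largest valid processorAffinity value (P-1).
max_processor_affinity() {
    # getconf reports the number of processors currently online on this machine.
    p=$(getconf _NPROCESSORS_ONLN)
    echo $((p - 1))
}
```

On a machine with a single processor the helper prints 0, which is then the only valid value for the parameter.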

Metrics

nPacketsProcessedCurrent - Counter

This metric counts the number of packets processed by the operator. When an input filter is specified, this includes only packets that pass the filter.

nBytesProcessedCurrent - Counter

This metric counts the number of bytes of packet data processed by the operator. When an input filter is specified, this includes only packets that pass the filter.

Libraries

Command: ../../impl/bin/libpcapPath.pl
Include Path: ../../impl/include
No description for library.
Library Name: streams_boost_filesystem
No description for library.
Library Name: streams_boost_system