IBM InfoSphere Streams Version 4.1.1

Operator ASN1Parse

The ASN1Parse operator parses a binary data stream that contains ASN.1-encoded data, extracts parts of the data, and sends the data as tuples to downstream operators.

For each ASN.1-encoded data structure that is detected in the binary data stream, one or more SPL tuples are generated, based on the configuration of the operator (triggers).

This operator requires a structure definition document, the ASN.1 specification, which describes the ASN.1 data structures in the binary data stream.

The mapping of the ASN.1 fields and containers to the SPL output port attributes is based on names and supported types and is not configurable. For more information, see the provided samples.

The ASN1Parse operator provides the following features:

  • One root ASN.1 structure type (also called protocol data unit, or PDU)
  • The ability to send zero, one, or multiple tuples for each PDU, depending on the configured triggers
  • Support of the following ASN.1 constructs:
    • Containers
      • CHOICE, SET, and SEQUENCE
    • Primitive types
      • BIT STRING, BMPString, BOOLEAN, ENUMERATED, GeneralizedTime, GeneralString, GraphicString, IA5String, INTEGER, ISO646String, NULL, NumericString, ObjectDescriptor, OBJECT IDENTIFIER, OCTET STRING, PrintableString, REAL, RELATIVE-OID, T61String, TeletexString, UniversalString, UTCTime, UTF8String, VideotexString, and VisibleString
    • Repetition
      • SET OF and SEQUENCE OF
    • Optional
      • OPTIONAL and DEFAULT (limited to the primitive types)
  • Support of basic constraints verification
  • The ability to skip padding bytes between ASN.1 structures
  • Punctuation-based synchronization. If unexpected data that prevents successful ASN.1 decoding is encountered (for example, an incomplete ASN.1 structure because of a lost block of data), the ASN1Parse operator assumes that a valid ASN.1 structure starts after the next window punctuation. All data between the point of failure and the next window punctuation is dropped.
  • The ability to send metrics and statistics to an optional output port.
  • Circular ASN.1 definitions (partially supported)

The operator does not support:

  • ASN.1 list or container defaults
  • Multi-dimensional fields
  • Window configurations

The ASN1Parse operator neither throws exceptions nor catches exceptions that might be thrown, for example, by a fused downstream operator.

You can use the asn1-data-from-xml command line tool to create ASN.1-encoded test data. See also the ASN1Encode operator.

Behavior in a consistent region and checkpointing

The ASN1Parse operator can be an operator within the reachability graph of a consistent region. It cannot be the start of a consistent region.

The ASN1Parse operator also supports periodic checkpointing that is enabled with the checkpoint configuration clause, for example:

config
	checkpoint : periodic(5.0);
	restartable : true;
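
The config clause above is placed inside the operator invocation, after the param clause. The following fragment is a minimal sketch of that placement. The names Record, Records, and DataBlocks and the grammar path are hypothetical and are not part of the operator.

```spl
// Hypothetical names: Record, Records, DataBlocks, and the grammar path.
stream<Record> Records = ASN1Parse(DataBlocks)
{
	param
		structureDocument : "etc/StructureDefinition.asn";
		trigger           : "/";
	config
		checkpoint  : periodic(5.0); // checkpoint the operator state every 5 seconds
		restartable : true;          // allow a restart after a failure
}
```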

Supported ASN.1/SPL Type Combinations
The ASN1Parse operator supports the ASN.1/SPL type combinations that are described in the following list.

Tutorial
This tutorial describes the usage of the ASN1Parse operator by explaining the source code in the teda.sample.ASN1Parse.Assignments sample application. In addition, this tutorial demonstrates how to modify the SPL output schema.

Summary

Ports
This operator has 1 input port and 2 output ports.
Windowing
This operator does not accept any windowing configurations.
Parameters
This operator supports 10 parameters.

Required: structureDocument, trigger

Optional: payloadAttribute, pdu, padding, checkConstraints, metricsMode, metricsModeThreshold, debugCodeGen, debugDecoder

Metrics
This operator reports 13 metrics.

Properties

Implementation
C++
Threading
Always - Operator always provides a single-threaded execution context.

Input Ports

Ports (0)

The ASN1Parse operator is configurable with a single input port.

The input port schema must be a tuple with at least one blob attribute, which holds the payload that will be parsed. If more than one blob attribute exists, use the payloadAttribute parameter to specify the attribute that contains the payload.

Window punctuations can change the operator state:

  • If the operator is in failure mode, window punctuation synchronizes the binary data stream.
  • If the metricsMode parameter is not specified or is set to punctuation, the operator is reset and, if the optional second output port exists, a metrics tuple is generated.

Output Ports

Output Functions
DataAssignmentFunctions
rstring getTrigger()

Returns the trigger that caused the tuple to be sent.

xml getXML()

Returns the decoded ASN.1 structure (relative to the trigger) in XML format.

<any T> T fromInput()

Takes the value from the input tuple. An input attribute of the same name and type as the output attribute must exist.

<any T> T fromInput(T attr)

Takes the value from the input tuple attribute that has the given name. An input attribute with the given name and the same type as the output attribute must exist.

MetricsFunctions
<any T> T fromInput()

Takes the value from the latest input tuple. An input attribute of the same name and type as the output attribute must exist.

<any T> T fromInput(rstring attributeName)

Takes the value from the attribute of the latest input tuple that has the given name. An input attribute with the given name and the same type as the output attribute must exist.

uint64 getRecordCount(rstring triggerName)

Returns the detected number of records for the requested trigger.

map<rstring,uint64> getRecordCounts()

Returns the detected number of records for all specified triggers.

uint64 getRecordFailureCount(rstring triggerName)

Returns the number of failed conversions for the requested trigger.

map<rstring,uint64> getRecordFailureCounts()

Returns the number of failed conversions for all specified triggers.

map<rstring,map<rstring,uint64>> getRecordStats()

Returns all supported statistics for all specified triggers, for example {"/":{"records":17,"failures":0}}.

list<rstring> getErrors()

Returns the collected error messages, which include the numbers of the corresponding ASN.1 records, for example ["record=7,message=\"ASN.1 field 'X' has value '1234', which cannot be mapped to SPL enum Status, supported values are [1, 2, 4, 8, 99, 1000]\""].

list<rstring> getWarnings()

Returns the collected warning messages, which include the numbers of the corresponding ASN.1 records.

uint64 nRecordsDecodedTotal()

Returns the number of successfully decoded ASN.1 entries.

uint64 nBytesDecodedTotal()

Returns the number of successfully decoded bytes.

uint64 nTuplesReceivedTotal()

Returns the number of received tuples.

uint64 nTuplesSentTotal()

Returns the number of sent tuples.

uint64 nBytesReceivedTotal()

Returns the amount of data received (in bytes).

uint64 nBytesDroppedTotal()

Returns the amount of data dropped (in bytes), for example because of window punctuations.

uint64 nRecordsDecoded()

Returns the number of successfully decoded ASN.1 entries since the last sent metrics tuple.

uint64 nBytesDecoded()

Returns the number of successfully decoded bytes since the last sent metrics tuple.

uint64 nTuplesReceived()

Returns the number of received tuples since the last sent metrics tuple.

uint64 nTuplesSent()

Returns the number of sent tuples since the last sent metrics tuple.

uint64 nBytesReceived()

Returns the amount of received data (in bytes) since the last sent metrics tuple.

uint64 nBytesDropped()

Returns the amount of dropped data (in bytes) since the last sent metrics tuple, for example because of the latest window punctuation.

uint64 latestPunctuation()

Returns the time of the latest window punctuation, in seconds since the Epoch (00:00:00 UTC, January 1, 1970).

Ports (0)

The output port generates tuples from the binary data stream.

The ASN1Parse operator supports user-specified assignments to output attributes of this output port, using the custom output functions that are listed under DataAssignmentFunctions.

For output attributes without assignment, the operator:

  • Assigns values from the decoded binary data stream
  • Assigns input values, similar to fromInput(), if there are no fitting ASN.1 elements
  • Applies a type-dependent default value if there are no fitting input attributes, for example 0 for integers, false for Boolean.

The ASN1Parse operator supports only automatic assignment to extract data from the ASN.1 record, so you must specify exactly the same tree of containers and fields in SPL. The automatic assignment requires that the SPL attribute names are identical to the ASN.1 field names. The spl-schema-from-asn1 command line tool supports the creation of SPL type definitions.
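
The following fragment is a minimal sketch of how explicit assignments with the DataAssignmentFunctions can be combined with automatic assignment. It assumes a hypothetical type Record (for example, one created with the spl-schema-from-asn1 tool) whose attribute names match the ASN.1 field names, and a hypothetical input stream DataBlocks that carries an rstring attribute named filename in addition to the blob payload. The attribute names triggerPath and decoded and the grammar path are also illustrative.

```spl
// Hypothetical names: Record, Records, DataBlocks, triggerPath, decoded,
// and the grammar path.
stream<Record, tuple<rstring filename, rstring triggerPath, xml decoded>> Records
	= ASN1Parse(DataBlocks)
{
	param
		structureDocument : "etc/StructureDefinition.asn";
		trigger           : "/";
	output Records:
		filename    = fromInput(),  // copied from the input attribute of the same name and type
		triggerPath = getTrigger(), // the trigger that caused this tuple to be sent
		decoded     = getXML();     // the decoded structure, relative to the trigger, as XML
	// The attributes of Record have no explicit assignment, so the operator
	// fills them from the decoded ASN.1 fields that have identical names.
}
```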

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes.

Ports (1)

The output port, if present, generates metric and statistic tuples.

The ASN1Parse operator allows metric-related assignments to output attributes of this optional output port, using the custom output functions that are listed under MetricsFunctions.

Assignments
This operator allows any SPL expression of the correct type to be assigned to output attributes. Attributes not assigned in the output clause will be automatically assigned from the attributes of the input ports that have the same name and type. If there is no such input attribute, an error is reported at compile-time.
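
The following fragment is a minimal sketch of a metrics port that uses some of the MetricsFunctions. The stream names, the attribute names of the metrics schema, the Record type, and the grammar path are hypothetical. Because the metricsMode parameter is not specified, a metrics tuple is sent when a window punctuation is received.

```spl
// Hypothetical names: Record, Records, ParseMetrics, DataBlocks, the
// metrics attribute names, and the grammar path.
(
	stream<Record> Records;
	stream<uint64 decoded, uint64 dropped,
	       map<rstring, uint64> failures, list<rstring> errors> ParseMetrics
) = ASN1Parse(DataBlocks)
{
	param
		structureDocument : "etc/StructureDefinition.asn";
		trigger           : "/";
	output ParseMetrics:
		decoded  = nRecordsDecoded(),        // records decoded since the last metrics tuple
		dropped  = nBytesDropped(),          // bytes dropped since the last metrics tuple
		failures = getRecordFailureCounts(), // failed conversions for each trigger
		errors   = getErrors();              // collected error messages
}
```

Every attribute of the metrics stream is assigned explicitly in this sketch, so no attribute relies on a same-named input attribute.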

Parameters

This operator supports 10 parameters.
payloadAttribute

Specifies the blob attribute of the input port that holds the payload to be parsed. If more than one blob attribute exists in the input port schema, this parameter must be specified. Otherwise, the single blob attribute is selected automatically.
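
The following fragment is a minimal sketch for an input stream with two blob attributes. It assumes a hypothetical input stream DataBlocks with the schema tuple<blob header, blob payload>. The Record type and the grammar path are also hypothetical.

```spl
// Hypothetical input schema: tuple<blob header, blob payload>.
// Two blob attributes exist, so payloadAttribute must be specified.
stream<Record> Records = ASN1Parse(DataBlocks)
{
	param
		payloadAttribute  : payload; // the blob attribute that is parsed
		structureDocument : "etc/StructureDefinition.asn";
		trigger           : "/";
}
```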

structureDocument

Specifies the path name (or names) of the structure definition document (or documents), that is, the ASN.1 grammar file that describes the ASN.1 data structures in the binary data stream.

The structure definition document is used at SPL compile time. If you modify the document, recompile the SPL application. After the application is recompiled, the structure definition document is not required for job submission.

A relative path is relative to the SPL application directory, which is the current working directory where the sc command is run. For example, if you specify the relative path "etc/StructureDefinition.asn" and run the sc command from the /home/myapp directory, the compiler looks for the StructureDefinition.asn document in the /home/myapp/etc directory.

pdu

Defines the name of the root ASN.1 structure (PDU) that the parser needs to decode. Typically, the operator determines the PDU automatically by selecting the structure that has no parent structure defined. Sometimes several such structures exist, for example when the grammar contains unused ASN.1 structures. In this case, this optional parameter becomes mandatory and must be set to the name of the PDU that the parser will decode. Otherwise, the compilation fails with a message that the PDU is ambiguous.

padding

Specifies that the operator skips padding bytes. The value that you specify is the padding byte value. For example, if you enter 0, the operator skips 0x00 octets. If you enter 255, the operator skips 0xFF octets.
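
The following fragment is a minimal sketch of the 0xFF case. The ub suffix marks a uint8 literal, as in the code templates for this operator. The names Record, Records, and DataBlocks and the grammar path are hypothetical.

```spl
// Hypothetical names: Record, Records, DataBlocks, and the grammar path.
stream<Record> Records = ASN1Parse(DataBlocks)
{
	param
		structureDocument : "etc/StructureDefinition.asn";
		trigger           : "/";
		padding           : 255ub; // skip 0xFF octets between ASN.1 structures
}
```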

trigger

Specifies which parts of the decoded data are sent as output tuples. You can select a complete record or sub-records, and you can receive all data in one output tuple or, if there are repeating elements, in one tuple for each element.

You can specify one or more triggers as a comma-separated string list. Each trigger is an absolute path expression that uses the ASN.1 names and a slash ("/") as a delimiter, with the specified or automatically determined root PDU being excluded. For each trigger in the decoded data, a separate tuple is generated. If a trigger is a repeating element, a separate tuple is generated for each instance of this element.

Example

Assume that you use the following (incomplete) ASN.1 grammar:

Root ::= SEQUENCE
{
	myChoice [0] MyCHOICE,
	myCs     [1] SEQUENCE OF MyC
}
MyCHOICE ::= CHOICE
{
	myA [0] MyA,
	myB [1] MyB
}
MyA ::= SEQUENCE {…}
MyB ::= SEQUENCE {…}
MyC ::= SEQUENCE {…}

The following examples show the resulting behavior for different trigger settings.

Example 1: trigger: "/"

Selects the specified or automatically determined root PDU and sends one tuple for each PDU.

Example 2: trigger: "/myChoice/myA"

Generates an output tuple for each myA ASN.1 element in the data. All myB instances are dropped, and instances of Root and myChoice are ignored.

Example 3: trigger: "/myChoice/myA", "/myChoice/myB"

Generates an output tuple for each instance of either myA or myB.

Example 4: trigger: "/myCs"

Generates an output tuple for each MyC instance.
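
As an illustration, Example 3 might be written as the following operator invocation. This is a sketch under assumptions: Record is a hypothetical SPL type whose attributes match the ASN.1 field names (how it must be shaped for two triggers is covered by the samples, not by this sketch), and the stream names, the triggerPath attribute, and the grammar path are illustrative.

```spl
// Hypothetical names: Record, Records, DataBlocks, triggerPath,
// and the grammar path.
stream<Record, tuple<rstring triggerPath>> Records = ASN1Parse(DataBlocks)
{
	param
		structureDocument : "etc/StructureDefinition.asn";
		pdu               : "Root"; // root PDU, excluded from the trigger paths
		trigger           : "/myChoice/myA", "/myChoice/myB";
	output Records:
		triggerPath = getTrigger(); // the trigger that caused this tuple to be sent
}
```

Assigning getTrigger() to an output attribute gives downstream operators a way to tell tuples that were generated for myA apart from tuples that were generated for myB.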

checkConstraints

Enables or disables ASN.1 constraints verification.

Verified constraints are:

  • Restrictions on the allowed character set for ASN.1 primitive types
  • Constraints in the user-provided ASN.1 grammar (the structure definition document).

By default, this parameter is set to true, which means that the operator verifies these constraints.

metricsMode

Specifies the trigger mode to send a metrics tuple on the optional second output port. The supported values are punctuation, tuples and bytes. The default value is punctuation. If you specify tuples or bytes, set the metricsModeThreshold parameter. After each sent metrics tuple, a subset of the online metrics is reset.

For example, if punctuation is specified, a metrics tuple is sent on receipt of a window punctuation. If tuples is specified and the metricsModeThreshold parameter is set to 100, a metrics tuple is sent for every 100 input tuples.

metricsModeThreshold

Specifies the number of received bytes or tuples after which a metrics tuple on the optional second output port is triggered. After each sent metrics tuple, a subset of the online metrics is reset.

For example, if the metricsMode parameter is set to tuples and the metricsModeThreshold is set to 100, a metrics tuple is sent for every 100 input tuples.

This parameter is only allowed if the metricsMode parameter is set to tuples or bytes.
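
The following fragment is a minimal sketch of the tuples mode with a threshold of 100. The literal forms are assumptions: this page does not state whether metricsMode takes an unquoted literal or which numeric type metricsModeThreshold expects, so adjust the value syntax if the compiler rejects it. The stream names, the attribute names, the Record type, and the grammar path are hypothetical.

```spl
// Hypothetical names: Record, Records, ParseMetrics, DataBlocks, the
// metrics attribute names, and the grammar path.
(
	stream<Record> Records;
	stream<uint64 tuplesReceived, uint64 recordsDecoded> ParseMetrics
) = ASN1Parse(DataBlocks)
{
	param
		structureDocument    : "etc/StructureDefinition.asn";
		trigger              : "/";
		metricsMode          : tuples; // assumed to be an unquoted literal
		metricsModeThreshold : 100;    // assumed literal form; a type suffix might be required
	output ParseMetrics:
		tuplesReceived = nTuplesReceived(), // input tuples since the last metrics tuple
		recordsDecoded = nRecordsDecoded(); // records decoded since the last metrics tuple
}
```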

debugCodeGen

Generates diagnostic output that you can use to troubleshoot a malfunction during compilation. If this parameter is set to true, the code generator of the operator runs in debug mode and produces a large amount of compile-time traces.

debugDecoder

Generates diagnostic output that can help you troubleshoot a runtime malfunction at the ASN.1 decoding level. If you set this parameter to true, the ASN.1 decoder runs in debug mode, which produces a large amount of console traces and significantly reduces performance.

Code Templates

ASN1Parse
stream<${Record}> ${Records} as O = ASN1Parse(${inputStream} as I)
{
	param
		payloadAttribute: ${attributeName};
		structureDocument: "${structureDefinitionFile}";
		pdu: "${rootPDU}";
		padding: ${paddingValue}ub;
		trigger: "/${absolutePath}";
}
      

ASN1Parse with metrics
(
	stream<${Record}> ${Records} as O;
	stream<${Metric}> ${Metrics} as M
) as ${ParsedRecords} = ASN1Parse(${inputStream} as I)
{
	param
		payloadAttribute: ${attributeName};
		structureDocument: "${structureDefinitionFile}";
		pdu: "${rootPDU}";
		padding: ${paddingValue}ub;
		trigger: "/${absolutePath}";
	output M:
		${outputExpression};
}
      

ASN1Parse with DirectoryScan and FileSource
stream<rstring filename> ${Filenames} as O = DirectoryScan()
{
	param
		directory: "${inputDirectory}";
		pattern: "${filenamePattern}";
}

stream<rstring filename, int64 tupleNo, blob payload> ${DataBlocks} as O = FileSource(${Filenames} as I)
{
	param
		format: block;
		blockSize: ${blocksize}u;
	output O:
		filename = FileName(),
		tupleNo = TupleNumber();
}

stream<${Record}> ${Records} as O = ASN1Parse(${DataBlocks} as I)
{
	param
		payloadAttribute: ${attributeName};
		structureDocument: "${structureDefinitionFile}";
		pdu: "${rootPDU}";
		padding: ${paddingValue}ub;
		trigger: "/${absolutePath}";
}
      

ASN1Parse with metrics, DirectoryScan, and FileSource
stream<rstring filename> ${Filenames} as O = DirectoryScan()
{
	param
		directory: "${inputDirectory}";
		pattern: "${filenamePattern}";
}

stream<rstring filename, int64 tupleNo, blob payload> ${DataBlocks} as O = FileSource(${Filenames} as I)
{
	param
		format: block;
		blockSize: ${blocksize}u;
	output O:
		filename = FileName(),
		tupleNo = TupleNumber();
}

(
	stream<${Record}> ${Records} as O;
	stream<${Metric}> ${Metrics} as M
) as ${ParsedRecords} = ASN1Parse(${DataBlocks} as I)
{
	param
		payloadAttribute: ${attributeName};
		structureDocument: "${structureDefinitionFile}";
		pdu: "${rootPDU}";
		padding: ${paddingValue}ub;
		trigger: "/${absolutePath}";
	output M:
		${outputExpression};
}
      

Metrics

nRecordsDecodedTotal - Counter

The number of successfully decoded ASN.1 entries.

nBytesDecodedTotal - Counter

The number of successfully decoded bytes.

nTuplesReceivedTotal - Counter

The number of received tuples.

nTuplesSentTotal - Counter

The number of sent tuples.

nBytesReceivedTotal - Counter

The amount of received data (in bytes).

nBytesDroppedTotal - Counter

The amount of dropped data (in bytes), for example because of window punctuations.

nRecordsDecoded - Gauge

The number of successfully decoded ASN.1 entries since the last sent metrics tuple. This value is reset after a metrics tuple is sent.

nBytesDecoded - Gauge

The number of successfully decoded bytes since the last sent metrics tuple. This value is reset after a metrics tuple is sent.

nTuplesReceived - Gauge

The number of received tuples since the last sent metrics tuple. This value is reset after a metrics tuple is sent.

nTuplesSent - Gauge

The number of sent tuples since the last sent metrics tuple. This value is reset after a metrics tuple is sent.

nBytesReceived - Gauge

The amount of received data (in bytes) since the last sent metrics tuple. This value is reset after a metrics tuple is sent.

nBytesDropped - Gauge

The amount of dropped data (in bytes) since the last sent metrics tuple, for example because of the latest window punctuation. This value is reset after a metrics tuple is sent.

latestPunctuation - Time

The time of the latest window punctuation, in seconds since the Epoch (00:00:00 UTC, January 1, 1970).

Libraries

Common Headers
Include Path: ../../../../impl/include/parser.binary