IBM InfoSphere Streams Version 4.1.0

Documenting operators and toolkits

When you develop an operator, a composite operator, or a toolkit for third-party use, you must document its functionality, expected inputs, and generated outputs.

Operators

Operators can be documented by using the appropriate <description> tags in the operator model. The operator documentation includes the following items:

Description

Describe the function of the operator.

Input ports

Describe each input port of the operator and characterize each port according to the following aspects:

Whether it is required or optional
Whether it is a control or data port
Whether it needs a specific stream type, or a particular attribute name and type
Its punctuation behavior (Expecting, Oblivious, or WindowBound)
Its mutability (mutating or non-mutating)

Output ports

Describe each output port of the operator and characterize each port according to the following aspects:

Whether it is required or optional
Whether it needs a specific stream type, or a particular attribute name and type
Its punctuation behavior (Free, Generating, or Preserving)
Its mutability (mutating or non-mutating)
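
The port characteristics listed above correspond to elements of the operator model. The following fragment is a minimal sketch for a hypothetical C++ primitive operator with one input port and one output port. The description text and the attribute name value are illustrative assumptions, and the element names and their order should be checked against the operator model schema for your release.

```xml
<!-- Sketch only: port definitions for a hypothetical operator,
     nested inside <cppOperatorModel> in the operator model. -->
<inputPorts>
  <inputPortSet>
    <!-- State in prose what the elements below encode, plus any stream type requirement. -->
    <description>Required data port that ingests the tuples to be filtered. The stream must contain a float64 attribute named value. The port is non-mutating, does not support windows, and is oblivious to punctuation.</description>
    <tupleMutationAllowed>false</tupleMutationAllowed>
    <windowingMode>NonWindowed</windowingMode>
    <windowPunctuationInputMode>Oblivious</windowPunctuationInputMode>
    <cardinality>1</cardinality>
    <optional>false</optional>
  </inputPortSet>
</inputPorts>
<outputPorts>
  <outputPortSet>
    <description>Required port that emits the tuples that pass the filter. The stream type must match the input stream type. The port is non-mutating and preserves punctuation.</description>
    <expressionMode>Nonexistent</expressionMode>
    <autoAssignment>true</autoAssignment>
    <completeAssignment>false</completeAssignment>
    <rewriteAllowed>true</rewriteAllowed>
    <windowPunctuationOutputMode>Preserving</windowPunctuationOutputMode>
    <tupleMutationAllowed>false</tupleMutationAllowed>
    <cardinality>1</cardinality>
    <optional>false</optional>
  </outputPortSet>
</outputPorts>
```

Repeating the characteristics in the description text, and not only in the model elements, lets a reader of the generated documentation understand the port without inspecting the model.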

Parameters

Describe the name and meaning of each parameter, and characterize each parameter according to the following aspects:

Whether it is required or optional
Its accepted syntax (for example, an expression, a stream attribute, or a custom literal)
Its accepted data types and value ranges
Its cardinality
Whether it has a default value
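
As an illustration of the aspects above, the following fragment sketches how a single parameter might be documented in the operator model. The parameter name threshold, its range, and its default value are hypothetical; confirm the element names and order against the operator model schema before you reuse the fragment.

```xml
<!-- Sketch only: one documented parameter of a hypothetical operator,
     nested inside <cppOperatorModel> in the operator model. -->
<parameters>
  <allowAny>false</allowAny>
  <parameter>
    <name>threshold</name>
    <!-- The description covers meaning, range, and default, which the elements below cannot express. -->
    <description>Tuples whose value attribute is below this threshold are dropped. This parameter is optional and accepts a single float64 expression in the range 0.0 to 1.0. The default value is 0.5.</description>
    <optional>true</optional>
    <rewriteAllowed>true</rewriteAllowed>
    <expressionMode>Expression</expressionMode>
    <type>float64</type>
    <cardinality>1</cardinality>
  </parameter>
</parameters>
```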

Windowing

For each input port, state whether it allows windowing configurations.

If the port supports windows, describe the window configurations that it accepts:
- Window type (tumbling, sliding)
- Eviction policies (count, time, delta, punct)
- Trigger policies (count, time, delta)
- Partitioning (partitioned)
- Partition eviction policies (partitionAge, partitionCount, tupleCount)

Indicate the conditions under which window punctuation is generated.

Output assignment

Describe whether output assignments are allowed and whether any custom output functions are available. For each custom output function, document its signature and behavior.

Metrics

Describe each metric by its name and what it measures.
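
A metric description might look like the following sketch. The metric name nTuplesDropped is hypothetical, and the placement inside the <context> element reflects the C++ operator model as an assumption to verify against the schema.

```xml
<!-- Sketch only: one documented metric of a hypothetical operator,
     nested inside <context> in the operator model. -->
<metrics>
  <metric>
    <name>nTuplesDropped</name>
    <description>The number of tuples that the operator dropped because their value attribute was below the threshold.</description>
    <!-- Counter is used here because the value only increases. -->
    <kind>Counter</kind>
  </metric>
</metrics>
```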

Exceptions

Describe the conditions under which an operator might terminate (for example, when the operator cannot open a file). Exceptions can be documented in the operator <description> tag.

Example usage

Include an example of an operator invocation by using <codeTemplates> and <codeTemplate> tags in the operator model.
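
The following fragment is a minimal sketch of a code template for a hypothetical operator named ThresholdFilter with a hypothetical threshold parameter. The ${...} placeholders are template variables that the user fills in, and the angle brackets of the SPL stream type are escaped because the template is XML text.

```xml
<!-- Sketch only: an example invocation for a hypothetical operator,
     nested inside <context> in the operator model. -->
<codeTemplates>
  <codeTemplate name="ThresholdFilter">
    <description>Basic ThresholdFilter invocation</description>
    <template>stream&lt;${schema}&gt; ${outputStream} = ThresholdFilter(${inputStream}) {
  param
    threshold : ${threshold};
}</template>
  </codeTemplate>
</codeTemplates>
```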

Composite operators

Composite operators can be documented by using appropriate SPLDOC tags. The composite operator documentation includes the following items:

Description

Describe the function of the composite operator.

Input ports

Describe each input port of the composite operator and characterize each port similarly to a primitive operator input port.

Output ports

Describe each output port of the composite operator and characterize each port similarly to a primitive operator output port.

Parameters

Describe the name and meaning of each parameter. Characterize each parameter according to the following aspects:

Whether it is required or optional
Its expression mode (attribute, expression, function, operator, or type)
Its accepted types and value ranges. For example, a parameter named myOp with expression mode operator might require an operator with one punctuation-expecting input port and two punctuation-free output ports.
Whether it has a default value

Configuration constraints

Describe any configuration constraints of the composite operator that can interact with the application configuration, such as partition and host placement constraints.

Example usage

Include an example of a composite operator use. Choose an example that is a complete SPL program that can be compiled and run by a third party. For structural composite operators, include a figure of the composite operator expansion that results from the composite operator use.

Toolkits

Toolkit authors can document SPL artifacts by adding descriptive information into the toolkit information model, primitive operator models, SPL files, and native function models. For examples of toolkit documentation, refer to the Database Toolkit and the Financial Services Toolkit. The toolkit documentation includes the following items:

Summary of changes: Provide a list of updates to the toolkit from the previous version (if any). As a best practice, start the toolkit at version 1.0.0 and increment the version number according to the VRM (Version, Release, and Modification) format. The toolkit version is specified in the toolkit information model (info.xml). The summary of changes can be placed in a file that is named CHANGES in the toolkit top-level directory.
Overview: Describe briefly the operators, composite operators, and functions that are provided by the toolkit. A short overview of the toolkit is placed in the toolkit information model (info.xml) file by using the <description> tag. A more detailed description can be placed in a README file in the toolkit top-level directory.
Installation instructions: Provide precise instructions on how to download and install the toolkit, and any dependencies that are required by the toolkit implementation (such as libraries).
Stream types: Describe all stream types in the toolkit that are available to developers who use the toolkit.
Native functions: For each native function that is visible to developers, include its signature and a description of the operation that it performs. Functions are documented with the <description> tags in the function.xml file in the native.function directory.
Operators and composite operators: Document all operators and composite operators that are available in the toolkit according to the guidelines in the Operators and Composite operators sections.
Helper artifacts: Document the function and usage of auxiliary tools (such as scripts) that are included in the toolkit.
Example usage: Include an example of an application that uses operators, composite operators, functions, and stream types in the toolkit. Choose an example that is a complete SPL program that can be compiled and run by a third party. The source files can be placed in a directory that is named samples in the toolkit root directory.
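
The overview and version items in the list above can be sketched as a toolkit information model. The toolkit name com.example.thresholds, its description, and the version values are hypothetical; verify the element names against the toolkit information model schema for your release.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: info.xml for a hypothetical toolkit. -->
<info:toolkitInfoModel
    xmlns:common="http://www.ibm.com/xmlns/prod/streams/spl/common"
    xmlns:info="http://www.ibm.com/xmlns/prod/streams/spl/toolkitInfo">
  <info:identity>
    <info:name>com.example.thresholds</info:name>
    <!-- Short overview; a longer description belongs in the README file. -->
    <info:description>Operators and functions that filter tuples against configurable thresholds.</info:description>
    <!-- VRM format: first release of the toolkit. -->
    <info:version>1.0.0</info:version>
    <info:requiredProductVersion>4.1.0</info:requiredProductVersion>
  </info:identity>
  <info:dependencies/>
</info:toolkitInfoModel>
```

In the same spirit, a native function might be documented in function.xml as follows. The function clamp, the header file Clamp.h, and the namespace com_example are hypothetical, and the fragment omits library dependencies that a real function model usually declares.

```xml
<!-- Sketch only: function.xml with one documented native function. -->
<functionModel
    xmlns="http://www.ibm.com/xmlns/prod/streams/spl/function"
    xmlns:cmn="http://www.ibm.com/xmlns/prod/streams/spl/common">
  <functionSet>
    <headerFileName>Clamp.h</headerFileName>
    <cppNamespaceName>com_example</cppNamespaceName>
    <functions>
      <function>
        <description>Returns x limited to the closed range from lo to hi.</description>
        <prototype><![CDATA[ public float64 clamp(float64 x, float64 lo, float64 hi) ]]></prototype>
      </function>
    </functions>
  </functionSet>
</functionModel>
```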