IBM InfoSphere Streams Version 4.1.1

Output assignments

When you write a generic primitive operator, you can allow assignments to output tuple attributes through the output clause. Output assignments can be either plain assignments or assignments with custom output functions.

Follow these best practices when you choose between plain assignments and assignments with custom output functions. In particular, use custom output functions in the output clause instead of using parameters to specify output assignments.

Use plain output assignments when tuple attribute values can be computed exclusively by using valid Streams Processing Language (SPL) expressions. One such example is the Functor operator from the SPL Standard Toolkit, which is used to transform input tuples into output ones. The following example shows an invocation of the Functor operator, where the output tuple attribute id is set to be a concatenation of the input tuple attribute id with the string "_id" (line 10). The output clause uses an SPL expression to perform the concatenation.

01: namespace example;
02:
03: composite Main {
04:   graph
05:     stream<uint32 id> SrcA = Beacon() { 
06:       param period : 1.0; 
07:     } 
08: 
09:     stream<rstring id> ConcatIDs = Functor(SrcA) { 
10:       output ConcatIDs : id = (rstring) SrcA.id + "_id"; 
11:     } 
12: }

Assignments with custom output functions are used when the computation of the output attribute involves an internal function or operator state that is not visible at the SPL level. One such example is the Aggregate operator, which allows aggregate assignments to output attributes (for example, Average, Count, and Sum). These functions operate over the internal operator windows, which are visible only in the primitive operator implementation code.

A common bad practice with primitive operators is to bypass the output clause by specifying output attributes as operator parameters. The following code sample shows an example of this practice in SPL. This example has a primitive operator that is named sample::SimpleCalculator, which performs a sum and a multiplication operation over the parameters paramA and paramB (lines 16 and 17). In addition, it uses the parameters oAttrSum and oAttrMult to define which output attributes receive the sum and multiplication results (lines 18 and 19). The problem with this approach is that the Perl code in the primitive operator implementation must inspect the parameters and verify that the output tuple attributes specified as parameter values actually exist in the output stream. The checks are required because the compiler cannot check the validity of output tuple attribute names that are passed as string parameters ("sumXY" and "multXY").

01: namespace example;
02: 
03: composite Main {
04:   graph
05:     stream <int32 x, int32 y, int32 z> IntNumbers = Beacon() { 
06:       param period : 1.0; 
07:       output 
08:         IntNumbers : x = (int32) (random() * 10000.0), 
09:                      y = (int32) (random() * 10000.0), 
10:                      z = (int32) (random() * 10000.0); 
11:     } 
12:  
13:     stream <int32 sumXY, int32 multXY, int32 z> Results =  
14:       sample::SimpleCalculator(IntNumbers) { 
15:       param 
16:         paramA    : x; 
17:         paramB    : y; 
18:         oAttrSum  : "sumXY"; 
19:         oAttrMult : "multXY"; 
20:     } 
21: }
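
The kind of check that this approach forces on the operator developer can be sketched as follows. This is only an illustration in plain C++: in a real operator the check would live in the Perl code of the implementation template and would use the operator model API. The names Schema, hasAttribute, and checkOutputAttrParams are hypothetical and do not come from the Streams API. The sketch checks attribute names only; a complete check would also have to compare attribute types.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the output stream schema: attribute names only.
using Schema = std::vector<std::string>;

// Returns true if the schema contains an attribute with the given name.
bool hasAttribute(const Schema& schema, const std::string& name) {
  return std::find(schema.begin(), schema.end(), name) != schema.end();
}

// Validates the attribute names that were passed as string parameters
// (oAttrSum and oAttrMult). Throws std::invalid_argument if a name does
// not match any attribute of the output stream.
void checkOutputAttrParams(const Schema& schema,
                           const std::string& oAttrSum,
                           const std::string& oAttrMult) {
  if (!hasAttribute(schema, oAttrSum)) {
    throw std::invalid_argument("oAttrSum: no output attribute named '" +
                                oAttrSum + "'");
  }
  if (!hasAttribute(schema, oAttrMult)) {
    throw std::invalid_argument("oAttrMult: no output attribute named '" +
                                oAttrMult + "'");
  }
}
```

With the schema {"sumXY", "multXY", "z"}, the call checkOutputAttrParams(schema, "sumXY", "multXY") returns normally, while a misspelled name such as "sumXZ" is only caught by this hand-written check, not by the SPL compiler.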

You can avoid this bad practice by using custom output functions. The use of custom output functions allows the compiler to do type checks. In addition, it makes the operator invocation in SPL more flexible, and the C++ implementation template cleaner.

Consider this example that shows the sample::SimpleCalculator operator invocation in SPL with custom output functions. In this specific example, the operator parameters paramA and paramB are replaced by arguments to the custom output functions (lines 15-16). The parameters oAttrSum and oAttrMult are no longer necessary because the output tuple attributes are assigned the function return values (lines 15-16).

01: namespace example;
02:
03: composite Main {
04:   graph
05:     stream <int32 x, int32 y, int32 z> IntNumbers = Beacon() { 
06:       param period : 1.0; 
07:       output 
08:         IntNumbers : x = (int32) (random() * 10000.0), 
09:                      y = (int32) (random() * 10000.0), 
10:                      z = (int32) (random() * 10000.0); 
11:     } 
12:  
13:     stream <int32 sumXY, int32 multXY, int32 z>  Results =  
14:       sample::SimpleCalculator(IntNumbers) { 
15:       output Results : sumXY = Sum(x,y),  
16:                        multXY = Multiply(x,y); 
17:     } 
18: }

This next example shows the operator model for sample::SimpleCalculator, which includes the definition of a set of custom output functions named CalculatorOperations (line 13). This set includes the signatures of the Sum and Multiply functions (lines 19-29). The compiler uses these signatures to do type checking on parameters and return values, which relieves the developer from writing extra checks in the C++ implementation template code. The model also includes a default function that is named AsIs, which allows output attributes to be auto-assigned or to be explicitly assigned with a valid SPL expression (lines 14-18 and 52-55).

01: <?xml version="1.0" ?>
02: <operatorModel
03:   xmlns="http://www.ibm.com/xmlns/prod/streams/spl/operator"
04:   xmlns:cmn="http://www.ibm.com/xmlns/prod/streams/spl/common"
05:   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
06:   xsi:schemaLocation="http://www.ibm.com/xmlns/prod/streams/spl/operator
07: operatorModel.xsd"> 
08:   <cppOperatorModel>  
09:     <context>  
10:       <description>Simple calculator</description>  
11:       <customOutputFunctions>  
12:         <customOutputFunction>  
13:           <name>CalculatorOperations</name>
14:           <function>  
15:             <description> Return the argument 
16:                unchanged</description>
17:             <prototype><![CDATA[<any T> T AsIs(T)]]></prototype>  
18:           </function>
19:           <function>  
20:             <description>Sum parameters a and b</description>
21:             <prototype><![CDATA[int32 Sum(int32 a, int32 b)]]>
22:             </prototype> 
23:           </function>  
24:           <function>
25:             <description>Multiply parameters a and b</description>  
26:             <prototype><![CDATA[int32 Multiply(int32 a,
27:                                                int32 b)]]>  
28:             </prototype>
29:           </function>  
30:         </customOutputFunction> 
31:       </customOutputFunctions>
32:       <providesSingleThreadedContext>Never</providesSingleThreadedContext>
33:     </context>  
34:     <parameters>  
35:       <allowAny>false</allowAny>
36:     </parameters>
37:     <inputPorts>
38:       <inputPortSet>
39:         <tupleMutationAllowed>false</tupleMutationAllowed>
40:         <windowingMode>NonWindowed</windowingMode>
41:         <windowPunctuationInputMode>Oblivious</windowPunctuationInputMode>
42:         <cardinality>1</cardinality>
43:         <optional>false</optional>
44:       </inputPortSet>
45:     </inputPorts>
46:     <outputPorts>
47:       <outputPortSet>
48:         <expressionMode>Expression</expressionMode>
49:         <autoAssignment>true</autoAssignment>
50:         <completeAssignment>true</completeAssignment>
51:         <rewriteAllowed>true</rewriteAllowed>
52:         <outputFunctions>
53:           <default>AsIs</default>
54:           <type>CalculatorOperations</type>
55:         </outputFunctions>
56:         <windowPunctuationOutputMode>Free</windowPunctuationOutputMode>
57:         <tupleMutationAllowed>false</tupleMutationAllowed>
58:         <cardinality>1</cardinality>
59:         <optional>false</optional>
60:       </outputPortSet>
61:     </outputPorts>
62:   </cppOperatorModel>
63: </operatorModel>

The following code shows the header template of the sample::SimpleCalculator primitive operator. In addition to the tuple and punctuation processing method signatures (lines 09-10), the header template includes the signatures of the sum and multiplication methods (lines 13-14).

01: <%SPL::CodeGen::headerPrologue($model);%>
02: 
03: class MY_OPERATOR : public MY_BASE_OPERATOR 
04: {
05: public:
06:   MY_OPERATOR();
07:   ~MY_OPERATOR(); 
08: 
09:   void process(Tuple const & tuple, uint32_t port);
10:   void process(Punctuation const & punct, uint32_t port);
11:
12: private:
13:   int32_t Sum(int32_t a, int32_t b);
14:   int32_t Multiply(int32_t a, int32_t b);
15: }; 
16:
17: <%SPL::CodeGen::headerEpilogue($model);%>

This last example shows the C++ implementation template code with custom output functions. The process function includes a check to test whether the function that is called is the default AsIs or one of the other custom functions (lines 35-38). When one of the other custom functions is used, the template generates code that explicitly invokes the corresponding class method (MY_OPERATOR::Sum or MY_OPERATOR::Multiply) with the parameters that are passed to the custom output function at the SPL level (lines 39-44). Note that the C++ implementation template can contain any valid mixed-mode implementation of a custom output function.

01: <%SPL::CodeGen::implementationPrologue($model);%>
02:
03: <%my $inputPort = $model->getInputPortAt(0); 
04:   my $inTupleName = $inputPort->getCppTupleName();%>
05:
06: MY_OPERATOR::MY_OPERATOR()
07: {
08: }
09: 
10: MY_OPERATOR::~MY_OPERATOR() 
11: {
12: }
13: 
14: int32_t MY_OPERATOR::Sum(int32_t a, int32_t b)
15: {
16:   return a + b;
17: }
18: 
19: int32_t MY_OPERATOR::Multiply(int32_t a, int32_t b)
20: {
21:   return a * b;
22: }
23:
24: void MY_OPERATOR::process(Tuple const & tuple, uint32_t port)
25: {
26:   IPort0Type const & <%=$inTupleName%> = 
27:     static_cast<IPort0Type const&> (tuple);    
28:   OPort0Type otuple; 
29: 
30:   <% 
31:   my $oport = $model->getOutputPortAt(0); 
32:   foreach my $attribute (@{$oport->getAttributes()}) { 
33:     my $name = $attribute->getName(); 
34:     my $operation = $attribute->getAssignmentOutputFunctionName(); 
35:     if ($operation eq "AsIs") { 
36:       my $init = $attribute->getAssignmentOutputFunctionParameterValueAt(0)->getCppExpression();%> 
37:       otuple.set_<%=$name%>(<%=$init%>); 
38:     <%} else { 
39:       my $paramValues = 
40:         $attribute->getAssignmentOutputFunctionParameterValues();%> 
41:       otuple.set_<%=$name%>( 
42:         MY_OPERATOR::<%=$operation%>( 
43:           <%=$paramValues->[0]->getCppExpression()%>, 
44:           <%=$paramValues->[1]->getCppExpression()%>)); 
45:     <%}     
46:   }%> 
47:     
48:   submit(otuple, 0); 
49: } 
50:  
51: void MY_OPERATOR::process(Punctuation const & punct, uint32_t port)
52: { 
53: } 
54: 
55: <%SPL::CodeGen::implementationEpilogue($model);%>
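
To make the effect of the template loop more concrete, the following is a rough sketch of the C++ code that it might produce for the invocation shown earlier (sumXY = Sum(x,y), multXY = Multiply(x,y)), assuming that z is auto-assigned from the input attribute of the same name through AsIs. InTuple, OutTuple, and processTuple are hypothetical stand-ins for the generated tuple types and the process method so that the sketch compiles on its own. The code that the SPL compiler actually generates can differ in naming and detail.

```cpp
#include <cstdint>

// Hypothetical stand-in for the generated input tuple type (IPort0Type).
struct InTuple {
  int32_t x, y, z;
  int32_t get_x() const { return x; }
  int32_t get_y() const { return y; }
  int32_t get_z() const { return z; }
};

// Hypothetical stand-in for the generated output tuple type (OPort0Type).
struct OutTuple {
  int32_t sumXY = 0, multXY = 0, z = 0;
  void set_sumXY(int32_t v) { sumXY = v; }
  void set_multXY(int32_t v) { multXY = v; }
  void set_z(int32_t v) { z = v; }
};

// Same bodies as MY_OPERATOR::Sum and MY_OPERATOR::Multiply in the template.
int32_t Sum(int32_t a, int32_t b) { return a + b; }
int32_t Multiply(int32_t a, int32_t b) { return a * b; }

// One setter call per output attribute, as emitted by the foreach loop.
OutTuple processTuple(const InTuple& in) {
  OutTuple otuple;
  // else branch: custom output function called with its two arguments
  otuple.set_sumXY(Sum(in.get_x(), in.get_y()));
  otuple.set_multXY(Multiply(in.get_x(), in.get_y()));
  // AsIs branch: the single argument expression is assigned directly
  otuple.set_z(in.get_z());
  return otuple;
}
```

For an input tuple with x = 3, y = 4, and z = 5, this sketch yields sumXY = 7, multXY = 12, and z = 5. Because the branch is chosen while the template is processed, the resulting C++ contains no run time test on the function name.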

When you write custom output functions, avoid changing the internal state of the operator from within them. Such changes can cause unexpected side effects and confuse users of the operator. If output functions are stateful, the operator documentation must explicitly say so and describe their side effects.
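
The difference can be illustrated with a small, self-contained C++ sketch. StatelessCalculator and StatefulCalculator are hypothetical classes written for this illustration only; they are not part of the sample::SimpleCalculator operator. The stateful variant shows why hidden state is surprising: the same arguments give a different result on every call.

```cpp
#include <cstdint>

// Stateless output function: the result depends only on the arguments.
struct StatelessCalculator {
  int32_t Sum(int32_t a, int32_t b) const { return a + b; }
};

// Stateful output function (discouraged): every call also updates a
// running total that is kept in the object, and returns that total.
struct StatefulCalculator {
  int32_t total = 0;
  int32_t Sum(int32_t a, int32_t b) {
    total += a + b;  // hidden side effect on internal state
    return total;
  }
};
```

Calling Sum(2, 3) twice on a StatelessCalculator returns 5 both times. On a StatefulCalculator, the first call returns 5 and the second returns 10, so the value of an output attribute would depend on how many times, and in which order, the function was invoked.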