IBM InfoSphere Streams Version 4.1.1

Operator fusion in parallel regions

When you compile an application, you can fuse multiple operator instances into units called partitions. For example, you can fuse operators by using the partitionColocation placement option such that the operator instances always run in the same partition. In general, there is a one-to-one correspondence between partitions and processing elements (PEs) at run time, since each partition runs in exactly one PE. User-defined parallelism, however, affects operator fusion. For example, if a logical PE (also known as a logical partition) contains only operators from inside a parallel region, that logical PE is replicated at submission time and therefore operator instances do not run in the same partition as their sibling replicas. See Example 1: PE replication for fused operators in a composite operator and Example 2: PE replication with fusion on the composite operator invocation. The replicated operators with the same partitionColocation tag might end up in different replicas, depending on the fusion. This behavior is one way in which defining parallel regions is different from invoking an operator multiple times.

Operator fusion is determined at compile time and the parallel transformations occur at submission time. Therefore, the parallel transformations use the processing elements that the fusions produce.

This expansion-after-fusion process determines which semantics the partitionColocation, partitionExlocation, and partitionIsolation config options use in a parallel region.

The behavior of the partitionIsolation and partitionExlocation options is the same inside and outside a parallel region. When the partitionIsolation option is applied to an operator, no additional operators can be in the same PE, therefore each replica of the operator is in a different PE. If the partitionExlocation option forces two operators into different PEs, the parallel transformation cannot alter that placement.