IBM Streams 4.2

Specifying how operators are fused when you submit a job

When you submit a job, you can optionally specify how the operators in the application are fused into processing elements (PEs). The fusion scheme that you use can affect the runtime performance of the application.

Prior to IBM® Streams Version 4.2, operators were fused into processing elements (PEs) when you compiled your application. The only way that you could change the placement of the operators was to change the application source and then recompile the application.

Starting with IBM Streams Version 4.2, the operators are fused when you submit the job. Additionally, you can specify the way in which the operators are fused.

Important: To use the new fusion schemes, you must recompile your applications with IBM Streams Version 4.2 or later. If you do not recompile your applications, they will run using the default behavior for releases prior to Version 4.2. (The operators will be fused using the legacy scheme.)

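You can specify the fusion scheme by using a job configuration overlay file from the IBM Streams Console, IBM Streams Studio, or the streamtool submitjob command. Specify the fusionScheme parameter in the deploymentConfig section. The fusionType parameter, which is defined in the parallelRegionConfig clause in the deploymentConfig section, applies only to parallel regions, as described in Specifying how parallel regions are fused.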

IBM Streams supports the following fusion schemes (fusionScheme):
automatic
If you specify automatic as your fusion scheme, IBM Streams determines the appropriate number of PEs to assign to the job.
Typically, this fusion scheme results in one PE per resource. However, IBM Streams might change the number of PEs produced to avoid the following situations:
  • Creating PEs that contain a very small number of operators, which can have a negative impact on performance
  • Creating PEs that contain too many operators, which can prevent IBM Streams from effectively balancing workloads by moving PEs between hosts
manual
If you specify manual as your fusion scheme, you can then specify the number of PEs to assign to the job (by specifying a value for the fusionTargetPeCount parameter). The actual number of PEs might vary based on other job configuration constraints, the specifications of the application, and the configuration of the instance where you plan to deploy the application. For example, if you specify a large number of partition ex-location constraints, the resulting application might have more PEs than you expect.
legacy
If you specify legacy as your fusion scheme, the operators are fused the same way they were prior to IBM Streams Version 4.2. Typically, each operator is fused into a separate PE if no other placement configuration is specified in the application bundle file.
Restriction: Any fusion scheme that you specify is influenced by the fusion constraints that are specified in the application bundle file and by the fusion constraints that are specified when the job is submitted.

You can specify only one value for the fusion scheme.

If you do not specify a fusion scheme, IBM Streams uses the default fusion scheme from the instance on which you deploy the application. This is also the fusion scheme that is used if you set Default in the console. For more information, see Setting the default fusion scheme at the instance or domain level.

For more information on how to specify fusion schemes from the interactive streamtool interface, see streamtool submitjob command.

The following examples show different ways in which you might specify the fusion scheme from the interactive streamtool interface:
  • To use the default fusion scheme that is set at the instance or domain level:
    streamtool submitjob myBundle.sab
  • To set a target of 5 PEs for the job:
    streamtool submitjob -C fusionScheme=manual -C fusionTargetPeCount=5 myBundle.sab
  • To use the fusion scheme from versions before IBM Streams Version 4.2:
    streamtool submitjob -C fusionScheme=legacy myBundle.sab
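The same settings can also be kept in a job configuration overlay file instead of being passed with -C. The following is a minimal sketch, not a definitive layout: the file name fusion-overlay.json is hypothetical, the jobConfigOverlays wrapper and the --jobConfig option of streamtool submitjob are assumptions to verify against the job configuration overlay and streamtool references for your release, and only fusionScheme, fusionTargetPeCount, and deploymentConfig are taken from this topic.

```shell
# Write a job configuration overlay file that requests the manual
# fusion scheme with a target of 5 PEs.
# Assumption: overlays are listed under "jobConfigOverlays".
cat > fusion-overlay.json <<'EOF'
{
  "jobConfigOverlays": [
    {
      "deploymentConfig": {
        "fusionScheme": "manual",
        "fusionTargetPeCount": 5
      }
    }
  ]
}
EOF

# Submit the job with the overlay file.
# Assumption: streamtool submitjob accepts --jobConfig <file>.
# The call is skipped on machines where streamtool is not installed.
if command -v streamtool >/dev/null 2>&1; then
  streamtool submitjob --jobConfig fusion-overlay.json myBundle.sab
fi
```

As with the -C form, the target of 5 PEs is a request; the actual number of PEs can differ because of other constraints on the job.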

Specifying how parallel regions are fused

If your application includes parallel regions, you can specify how the operators in the channels in the parallel regions are fused into PEs.
Restriction: If you specify legacy as your fusion scheme, you cannot specify how parallel regions are fused.
IBM Streams supports the following values for the fusionType parameter:
noChannelInfluence
IBM Streams treats the operators in a parallel region the same as the other operators in the application. Inclusion in a parallel region does not have any impact on how the operators are fused.
channelIsolation
IBM Streams fuses operators in a channel with the other operators in the same channel. They can be fused into one or more PEs. Additionally, if you explicitly co-locate an operator in a channel with an operator that is in a different channel or outside the parallel region, these operators are fused in their own PE.
channelExlocation
IBM Streams does not fuse operators from different channels in the same PE. However, the operators can be fused with operators outside of the parallel region. Additionally, if you explicitly co-locate operators from different channels in the parallel region, these operators are fused in their own PE.
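As an illustration of where fusionType sits, the following sketch writes a job configuration overlay file that combines the automatic fusion scheme with channelIsolation for parallel regions. The file name parallel-overlay.json is hypothetical, and the jobConfigOverlays wrapper and the --jobConfig option are assumptions to check against the references for your release. The nesting of parallelRegionConfig inside deploymentConfig follows the description earlier in this topic.

```shell
# Write an overlay file that sets fusionType for parallel regions.
# The legacy fusion scheme is avoided here because fusionType cannot
# be specified together with it.
# Assumption: overlays are listed under "jobConfigOverlays".
cat > parallel-overlay.json <<'EOF'
{
  "jobConfigOverlays": [
    {
      "deploymentConfig": {
        "fusionScheme": "automatic",
        "parallelRegionConfig": {
          "fusionType": "channelIsolation"
        }
      }
    }
  ]
}
EOF

# Submit the job with the overlay file.
# Assumption: streamtool submitjob accepts --jobConfig <file>.
# The call is skipped on machines where streamtool is not installed.
if command -v streamtool >/dev/null 2>&1; then
  streamtool submitjob --jobConfig parallel-overlay.json myBundle.sab
fi
```

To compare behaviors, replace channelIsolation with noChannelInfluence or channelExlocation and resubmit the job.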