IBM Streams 4.2

What's New in Version 4.2

Learn about the main new features in IBM® Streams Version 4.2.

For updates that are not associated with new features, see Documentation updates for IBM Streams Version 4.2.

New features for Version 4.2 Fix Pack 1 (Version 4.2.0.1)

Configuring strong encryption for Kerberos
The procedure to enable strong encryption for the streamtool command-line interface, the Domain Manager, and the Streams Console no longer requires that you manually install the Unrestricted SDK Java™ Cryptography Extension (JCE) policy files.

New version management and rolling upgrade options for IBM Streams

IBM Streams Version 4.2 provides the foundation for managed version and rolling upgrade support.

Managed version support enables you to upgrade a domain and its instances independently of each other. Rolling upgrade support enables you to upgrade a domain or instance while it is running.

Version 4.2 provides the foundation for version management and rolling upgrade support because it is the earliest supported version for running an instance at a different version than its domain and the first version from which a rolling upgrade can be performed.

New option for setting up the IBM Streams domain controller service as an unregistered service

A domain controller service runs on every resource in an IBM Streams domain and manages all of the other services on that resource. In previous versions of IBM Streams, the only option for running the domain controller service in a highly available environment was as a registered Linux system service.

Beginning in Version 4.2, the domain controller service can run as an unregistered service, which can be started by a root or non-root user. You can set up the domain controller service as a registered Linux system service or an unregistered service when you configure the resources in an IBM Streams domain.

Support for restricting access to IBM Streams resources

Beginning in Version 4.2, IBM Streams provides tags that you can use to restrict access to a resource. This can be helpful if some resources have special capabilities, such as access to a private data source.

Support for Kerberos authentication

Beginning in Version 4.2, you can customize IBM Streams user authentication by using Kerberos.

Kerberos is a network authentication protocol developed by the Massachusetts Institute of Technology (MIT). The Kerberos protocol uses secret-key cryptography to provide secure communications over a non-secure network. Its primary benefits are strong encryption and single sign-on (SSO).

Support for encrypted PE connections

Beginning in Version 4.2, you can use the instance.transportSecurityType property to enable or disable encrypted connections between processing elements (PEs). By default, connections between PEs are not encrypted.

Support for using Apache Edgent with IBM Streams

Apache Edgent is an open source programming model and runtime environment for performing analytics on edge devices. An Apache Edgent application can perform simple analytics on an edge device without transmitting unnecessary data to your central analytics engine. To perform more complex analytics, you can connect the Apache Edgent application to your IBM Streams environment.

Support for using Hyperstate Accelerator as a checkpoint data store for IBM Streams

Hyperstate Accelerator is a hardware-accelerated key-value store (KVS) that is included with IBM Streams Version 4.2. You can use Hyperstate Accelerator as a checkpoint data store for IBM Streams domains and instances.

Hyperstate Accelerator is aimed at real-time analytics that require high throughput, low latency, and high availability between tasks, and where the state of the system must be preserved across a system restart. To provide superior performance, Hyperstate Accelerator optionally uses remote direct memory access over Converged Ethernet (RoCE) for fast network access to the data, and IBM FlashSystem® to persist data on disk so that it survives system restarts and supports failure recovery.

Support for developing IBM Streams applications with Python

Python is a popular language with a large and comprehensive standard library as well as many third-party libraries. The new IBM Streams Python Application API, which is included in the Topology Toolkit, enables you to create streams processing applications using Python callable classes or functions.

The Python Application API supports common operations, such as source, filter, transform, parallel, union, sink, publish, and subscribe.
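
As a rough illustration of the "callable classes or functions" that such an API accepts, the following sketch defines a source function, a filter function, and a transform callable class in plain Python. It deliberately does not import the Topology Toolkit: the names sensor_readings, is_above_threshold, Scale, and run_locally are hypothetical, and run_locally only imitates a source, filter, transform chain so that the callables can be exercised without an IBM Streams installation.

```python
def sensor_readings():
    """Source callable: returns an iterable; each item becomes one tuple."""
    return [3, 18, 25, 7, 31]


def is_above_threshold(reading):
    """Filter callable: a tuple is kept only when this returns True."""
    return reading > 10


class Scale:
    """Transform callable class: configuration is held as instance state."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, reading):
        return reading * self.factor


def run_locally(source, predicate, transform):
    """Hypothetical local stand-in for a source -> filter -> transform chain."""
    return [transform(item) for item in source() if predicate(item)]


if __name__ == "__main__":
    print(run_locally(sensor_readings, is_above_threshold, Scale(2)))
```

In an actual application, callables like these would be passed to the corresponding operations of the Python Application API instead of to a local helper.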

Support for compiling ODM rules into SPL for use in IBM Streams applications

IBM Operational Decision Manager (ODM) allows developers and business analysts to create business rules, construct rule flows, and create and deploy rules applications in ODM. In previous versions of IBM Streams, the Rules Toolkit allowed running ODM rules in an ODM installation against streaming data in an IBM Streams application.

The new Rules Compiler and Rules Compiler Toolkit enable you to convert business rules that are written in ODM into an SPL composite that can be incorporated into IBM Streams applications. This provides superior performance compared to the existing Rules Toolkit and does not require an ODM installation.

In addition, the new rules development support in Streams Studio enables developers and business analysts to create rules, convert them into SPL, and use them in their IBM Streams application from within a single development environment.

New submission-time fusion option for better control over how jobs run

In previous versions of IBM Streams, the placement or fusion of operators into processing elements (PEs) was determined when you compiled your application. If an application contained many operators and each operator was fused into a separate PE, performance could be affected at run time. The only way to change how operators were fused was to change the application source and recompile the application.

By using submission-time fusion, you can control how operators are fused into PEs when you submit a job, which can improve runtime performance. By using new job configuration options, you can also define how job submissions are performed in your specific environment without having to recompile the application. You can use the Streams Console, Streams Studio, and the streamtool previewsubmitjob command to obtain information about a job submission before you submit the job.

If you are using existing placement configs to control fusion in an application, those placement configs are still supported.
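
Job configuration options of this kind are typically supplied as a JSON document. The sketch below generates one that requests manual fusion into a fixed number of PEs. The key names (jobConfigOverlays, deploymentConfig, fusionScheme, fusionTargetPeCount) are assumptions about the overlay format and should be verified against the job configuration reference for your release. The function names and the PE count are illustrative only.

```python
import json


def build_fusion_overlay(target_pe_count):
    """Return an overlay dict requesting manual fusion into a fixed PE count.

    The key names are assumptions about the overlay schema, not confirmed here.
    """
    if target_pe_count < 1:
        raise ValueError("target_pe_count must be at least 1")
    return {
        "jobConfigOverlays": [
            {
                "deploymentConfig": {
                    "fusionScheme": "manual",
                    "fusionTargetPeCount": target_pe_count,
                }
            }
        ]
    }


def overlay_to_json(overlay):
    """Serialize the overlay in a stable, human-readable form."""
    return json.dumps(overlay, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(overlay_to_json(build_fusion_overlay(4)))
```

Because the overlay is kept separate from the application source, the same compiled application can be submitted with different fusion settings in different environments.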

Improved threading model for better performance

With the improved threading model, you can tune application performance by manually configuring PE threading or by letting the system determine threading behavior automatically.

Support for nested user-defined parallelism

Beginning in Version 4.2, IBM Streams supports nested user-defined parallelism, which allows parallel regions to contain other parallel regions in your IBM Streams applications. Nesting allows toolkit developers to incorporate user-defined parallelism (UDP) into their operators while still allowing those operators to be used inside parallel regions.
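
The effect of nesting on operator replication can be sketched with a small calculation. Assuming that an inner parallel region is replicated once for every channel of each enclosing region, an operator in the innermost region runs as many times as the product of the enclosing widths. The function below is an illustrative model with a hypothetical name, not part of any IBM Streams API.

```python
from itertools import product


def replica_channels(widths):
    """List channel-index tuples for an operator nested inside parallel regions.

    widths is ordered from the outermost region to the innermost one.
    """
    if any(w < 1 for w in widths):
        raise ValueError("every region width must be at least 1")
    return list(product(*(range(w) for w in widths)))


if __name__ == "__main__":
    channels = replica_channels([3, 2])
    print(len(channels))
    print(channels)
```

For an outer region of width 3 that contains an inner region of width 2, the model yields six replicas, each identified by an (outer channel, inner channel) pair.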

Support for asynchronous non-blocking checkpointing of operator states

In IBM Streams Version 4.2, you can implement asynchronous non-blocking checkpointing in stateful primitive operators. Non-blocking checkpointing of operator state data reduces the time that the tuple flow is blocked during checkpointing.
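
The general idea behind non-blocking checkpointing can be shown outside of IBM Streams: the operator blocks only long enough to take an in-memory snapshot of its state, and a background thread then writes the snapshot to storage. The class below is a conceptual sketch with hypothetical names (CountingOperator, checkpoint_async) and a dictionary standing in for the checkpoint data store. It does not use the IBM Streams checkpointing API.

```python
import copy
import threading


class CountingOperator:
    """Toy stateful operator that counts tuples per key."""

    def __init__(self, store):
        self._state = {}
        self._lock = threading.Lock()
        self._store = store  # dict standing in for a checkpoint data store

    def process(self, key):
        with self._lock:
            self._state[key] = self._state.get(key, 0) + 1

    def checkpoint_async(self, checkpoint_id):
        """Snapshot state under the lock, then persist it on another thread."""
        with self._lock:
            snapshot = copy.deepcopy(self._state)  # short blocking phase

        def persist():
            # Slow phase: runs off the tuple-processing path.
            self._store[checkpoint_id] = snapshot

        worker = threading.Thread(target=persist)
        worker.start()
        return worker


if __name__ == "__main__":
    store = {}
    op = CountingOperator(store)
    for key in ["a", "b", "a"]:
        op.process(key)
    pending = op.checkpoint_async(1)
    op.process("a")  # tuple flow continues while the snapshot is written
    pending.join()
    print(store[1])
```

The tuple that arrives after the snapshot is taken updates the live state but not the checkpoint, so the stored checkpoint stays consistent with the moment it was requested.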

New SPL compiler option for building optimized code

Beginning in IBM Streams Version 4.2, the SPL compiler builds optimized code by default. Optimizing the code disables SPL assertions.

To disable code optimization, use the new --no-optimized-code-generation option of the sc command. The -a (--optimized-code-generation) option of the sc command, which enables code optimization, can still be used.

Toolkit updates

The following toolkit updates are included in IBM Streams Version 4.2:

  • Rules Compiler Toolkit: You can use this new toolkit and the Rules Compiler to convert business rules that are written in ODM into SPL that can be used in IBM Streams applications.
  • Topology Toolkit: You can now use this toolkit to develop IBM Streams applications with Python.
  • SPL standard toolkit: The filter parameter on the Import operator supports additional data types: rstring, float64, float32, int64, int32, int16, int8, uint64, uint32, uint16, uint8, and boolean literal value. For more information, see the SPL standard toolkit documentation.
  • Requirements and restrictions for several of the specialized toolkits are updated.
  • Migration requirements for applications that use the DPS, HBase, and TimeSeries toolkits are added. For more information, see the "Migrating applications" section for your version in the migration guidelines.

New and changed streamtool commands

IBM Streams Version 4.2 includes several new streamtool commands, such as previewsubmitjob, getinstancestate, and history. New commands related to devices and application configurations support the Apache Edgent integration. New commands were also added for the rolling upgrade, Kerberos authentication, and running the domain controller service as an unregistered service.

Additional updates to existing streamtool commands now let you specify a job name. Commands that support specifying a job name include canceljob, getapplicationlog, lsjobs, lspes, lsrestartrecs, restartpe, getjobtopology, and updatepe, among others. The --numresources option was updated to support explicitly requested resources.

There are also new and updated commands for the Version 4.2 serviceability enhancements.

For more information about streamtool commands, see the Command reference or enter streamtool man command-name.

Serviceability enhancements

The following table summarizes the serviceability enhancements in Version 4.2.
Table 1. Serviceability enhancements

Enhancement: The minimum Linux user limit (ulimit) value requirements for IBM Streams are increased, and the documentation is improved.
Learn more: Guidelines for configuring Linux ulimit settings for IBM Streams

Enhancement: Additional system compatibility checks are added to improve the detection of configuration issues. For example, name resolution and firewall checks are added for environments with multiple resources.
Learn more: Planning roadmap

Enhancement: The streamtool getlog command collects more information that can help you debug issues with IBM Streams. For example, the command now collects output that is similar to the output provided by the following streamtool commands:
  • lspes
  • getdomainstate
  • getresourcestate
  • getzkstate
The getlog command collects system logs and other information, such as processes that are running and RPMs that are installed. The ability to append the timestamp to the getlog TGZ file name is also added.
Learn more:
  • Log and trace services
  • streamtool getlog command

Enhancement: IBM Streams provides metrics to help evaluate the health of IBM Streams services, to aid in diagnosing performance issues, and to analyze throughput of requests. You can use the Streams Console or the following streamtool commands to view the metrics data:
  • checkdomainmetrics
  • checkinstancemetrics
  • checkresourcemetrics
Learn more:
  • Metrics
  • Administering a domain in the Streams Console
  • streamtool checkdomainmetrics command
  • streamtool checkinstancemetrics command
  • streamtool checkresourcemetrics command

Enhancement: The domain.jvmSizeComputationEnabled property is added and set to true, by default. If you do not explicitly set the maximum JVM size, this property controls whether IBM Streams tries to select a maximum JVM size based on system memory usage.
The default JVM size for all IBM Streams services is increased to 1024 megabytes. You can increase the JVM size by using domain and instance properties. If you are running the domain controller service as a Linux system service, you can also increase the JVM size for the controller by using the streamtool registerdomainhost, chdomainhostconfig, getdomainhostconfig, and rmdomainhostconfig commands.
Learn more: Configuration settings for Java memory issues

Enhancement: You can monitor ZooKeeper performance by using the Streams Console or the streamtool checkzk command.
You can reset the ZooKeeper server and connection statistics for the ZooKeeper ensemble by using the Streams Console or the streamtool resetzkstat command.
Learn more:
  • Administering a domain in the Streams Console
  • streamtool checkzk command
  • streamtool resetzkstat command
  • Metrics

Enhancement: Documentation for configuring audit logging is improved.
Learn more: Configuring audit logging for IBM Streams

Enhancement: Documentation for configuring IBM Streams to log events and messages in the Linux system log is improved.
Learn more: Logging events and messages in the Linux system log

Enhancement: Documentation that you need to gather before contacting IBM Support is improved and includes links to additional IBM Streams resources.
Learn more: A link to the new IBM Streams Problem Must Gather Information Technote is added to the Before contacting IBM Support procedure in the product documentation.