IBM InfoSphere Streams Version 4.1.1

What’s new?

The Telecommunications Event Data Analytics toolkit version 1.0.2 release provides several new features. The highlights of the new release are:

  • Benefit from the improved housekeeping of your ITE application
  • Benefit from the new ITE application configuration parameters for input file archival and storage type
  • Configure the operator fusion for performance improvements or testing purposes of your ITE application
  • Start one ITE application multiple times using the new ite.jobName submission time parameter
  • Benefit from the improved application resiliency and error reports of the Lookup Manager application
  • High availability feature is added to the teda.demoapp sample application
  • Ensure more resilient processing of streaming data by using operators that support consistent regions
  • The CSVParse operator provides new custom output functions to check for presence of source values
  • The BloomFilter operator can evict outdated data on its own

The following defects are resolved with Telecommunications Event Data Analytics toolkit version 1.0.2:

  • Application Framework
    • The Monitoring GUI supports user names with hyphen or underscore characters.
    • The Monitoring GUI settings dialogue suppresses the password in the tooltip.
    • The Monitoring GUI is able to show the jobs even if the instance is not in the running state.
    • The ITE application in multihost configuration assigns a hostPool to all PEs, which was missing for one Processing Element in earlier versions.
    • The teda-shutdown-job command supports the ‘--User’ option, which is required for LDAP users.
    • The filename deduplication component now respects the reprocess flag, which allows you to bypass filename deduplication on a per-file basis.

Benefit from the improved housekeeping of your ITE application

Configure how long the checkpoint files of your ITE application are kept, with finer granularity, by using the following new configuration parameters:
  • ite.businessLogic.group.custom.timeToKeep
  • ite.businessLogic.group.deduplication.timeToKeep
  • ite.ingest.deduplication.timeToKeep

The teda.demoapp sample application demonstrates the usage of the custom context housekeeping. The housekeeping duration for the custom groups is reduced.
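
The general idea of a retention period can be illustrated with a minimal Python sketch. It is not the toolkit's housekeeping implementation: the function name, the use of file modification times, and seconds as the unit are assumptions made for illustration only. Check the parameter documentation for the real semantics and units of the timeToKeep parameters.

```python
import os
import time


def expired_files(directory, time_to_keep_seconds, now=None):
    """Return the files in `directory` that are older than the retention period.

    Hypothetical helper: a file counts as expired when its modification
    time lies more than `time_to_keep_seconds` before `now`.
    """
    now = time.time() if now is None else now
    cutoff = now - time_to_keep_seconds
    expired = []
    for entry in os.scandir(directory):
        # Only regular files are considered; subdirectories are skipped.
        if entry.is_file() and entry.stat().st_mtime < cutoff:
            expired.append(entry.path)
    return sorted(expired)
```

The sketch only lists candidates and deletes nothing, so a retention value can be tried out without side effects.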

Benefit from the new ITE application configuration parameters for input file archival and storage type

Configure the ite.storage.type parameter with the new 'noFile' option if your ITE application does not write any output files per input file. Use this new option for ITE applications that create output files in custom group operators only, for example, aggregated or correlated data.

Configure the input file archival with the new ite.ingest.archiveMode parameter and optimize the input file archiving when you have multiple input directories.

Configure the operator fusion for performance improvements or testing purposes of your ITE application

You can unfuse parts of the application for testing and debugging purposes. Unfusing helps you to identify performance bottlenecks or memory leaks in the application by inspecting the congestion factors or memory usage of individual operators in Streams Studio. For details, see the configuration parameters ite.fuse.group.operators and ite.fuse.chain.operators.

Start one ITE application multiple times using the new ite.jobName submission time parameter

You can now start one ITE application multiple times by specifying a unique job name at submission time. Previously, you needed to maintain two copies of the same ITE application with different namespaces and compile both of them to achieve this. A typical use case is to start the same application with a different set of submission-time values. For details, see the configuration parameter ite.jobName.
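
As a rough illustration, the following Python sketch builds one `streamtool submitjob` command line per job name and passes the name with the `-P` option for submission-time values. The bundle path and the job names are hypothetical, and domain and instance options are omitted. Whether your application needs further submission-time values is described in the ite.jobName parameter documentation.

```python
def build_submit_commands(sab_path, job_names):
    """Build one `streamtool submitjob` argument list per job name.

    `sab_path` is a hypothetical path to the compiled application bundle.
    Each job name must be unique so that the jobs can run side by side.
    """
    if len(set(job_names)) != len(job_names):
        raise ValueError("job names must be unique")
    return [
        # -P passes a submission-time value in the form name=value.
        ["streamtool", "submitjob", "-P", f"ite.jobName={name}", sab_path]
        for name in job_names
    ]


if __name__ == "__main__":
    # Hypothetical bundle and job names, printed instead of executed.
    for command in build_submit_commands("output/ITEMain.sab", ["ite_a", "ite_b"]):
        print(" ".join(command))
```

The commands are only printed here. To submit the jobs, each argument list could be handed to `subprocess.run` on a host where `streamtool` is available.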

High availability feature is added to the teda.demoapp sample application

If Processing Elements are stopped or the job has the state 'unhealthy', for example in case of a host failure, then the application is automatically restarted. After the job restart, the application recovers from the checkpoint files and resumes the file processing. You can easily add this feature to your custom ITE application project. For details, see the 'README' file of the teda.demoapp sample application.

The CSVParse operator provides new custom output functions to check for presence of source values

Use the new functions if you want to know whether certain output attributes were assigned from actual values in the input line or received default values because they were not present in the input. The operator provides the new functions IsPresent() and PresenceMask() for this purpose. For details, see the documentation of the CSVParse operator.
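
The concept behind a presence check can be sketched in Python. This is a conceptual illustration, not the CSVParse operator's implementation: the function names, the bit order of the mask, and the rule that an empty field counts as "not present" are assumptions. The operator documentation defines the actual behavior of IsPresent() and PresenceMask().

```python
def parse_with_presence(line, defaults, separator=","):
    """Split a CSV line and fill missing fields with defaults.

    Returns the attribute values and a bit mask in which bit i is set
    when attribute i was taken from the input line (assumed convention).
    """
    fields = line.split(separator)
    values = []
    mask = 0
    for index, default in enumerate(defaults):
        raw = fields[index] if index < len(fields) else ""
        if raw != "":
            values.append(raw)
            mask |= 1 << index  # mark attribute as present in the input
        else:
            values.append(default)  # fall back to the default value
    return values, mask


def is_present(mask, index):
    """Check whether attribute `index` came from the input line."""
    return bool(mask & (1 << index))
```

With such a mask, downstream logic can tell a default value that was filled in apart from the same value that was actually read from the input.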

Benefit from the improved application resiliency and error reports of the Lookup Manager application

The Lookup Manager application is more robust now. Operation errors no longer force a restart of the application. The Lookup Manager application detects missing data source files, unknown commands, or corrupted data and reports the errors to the <date>_LookupManagerErrorStatistics.txt output file.

Related reference:
  • Reference > Toolkits > Specialized toolkits > com.ibm.streams.teda > Operating applications > Monitoring the Lookup Manager and ITE applications > Lookup Manager error statistics

Operators that support consistent region

The following operators can be in a consistent region with any of their possible parameters.
  • ASN1Encode
  • ASN1Parse
  • CSVParse
  • StructureParse

The following operator can be in a consistent region with a limited set of parameters or application topology configurations.
  • BloomFilter

The following operators are not supported in a consistent region.
  • ScheduledBeacon
  • ExceptionCatcher

What’s new in version 1.0.1?

The Telecommunications Event Data Analytics toolkit version 1.0.1 release provides several new features.