Before you start
InfoSphere Streams is a platform that enables real-time analytics of data in motion. The IBM SPSS family of products provides the ability to build predictive analytic models. This "Integrating SPSS Model Scoring in InfoSphere Streams" series is for Streams developers who need to use those predictive models in a real-time scoring environment.
This tutorial describes how to create an InfoSphere Streams operator that Streams applications can use to execute SPSS predictive models. It provides a sample operator and data that demonstrate this integration, and then describes how the sample can be adjusted for use with any appropriate SPSS model. In Part 2, you learn how this non-generic operator is extended to use the predictive model's XML metadata, so that an SPSS predictive model can be used in Streams without the C++ skills otherwise required to customize the operator.
In this tutorial, learn what a data analyst needs to do in SPSS Modeler to prepare a predictive model for scoring in Streams, see how a Streams component developer can build an operator to execute that model, and learn how a Streams application can use that operator to produce real-time scored results from streaming data.
This tutorial is written for Streams component developers and application programmers who have Streams programming language skills and C++ skills. You can use the tutorial as a reference, or you can examine and execute its samples to see the techniques it describes. To execute the samples, you should have a general familiarity with using a UNIX® command-line shell and a working knowledge of Streams programming.
To run the examples, you need a Red Hat Enterprise Linux® system with InfoSphere Streams V2.0 or later and IBM SPSS Modeler Solution Publisher 14.2 Fix Pack 1, plus the Solution Publisher hot fix, which is scheduled to be available 14 Oct 2011.