Before you start
This tutorial describes how to create a generic operator that InfoSphere Streams applications can use to execute SPSS predictive models. It also provides a sample operator that can be used directly with any appropriate SPSS model, and a sample Streams application that demonstrates its use.
InfoSphere Streams is a platform that enables real-time analytics of data in motion. The IBM SPSS family of products provides the ability to build predictive analytic models. This "Integrating SPSS Model Scoring in InfoSphere Streams" series is for Streams developers who need to use these powerful predictive models in a real-time scoring environment.
This tutorial builds on the non-generic operator produced in Part 1, which presents a technique that is quite flexible but requires some C++ programming skill to customize.
In this tutorial, you learn how the non-generic operator is extended to use the predictive model's XML metadata, so that an SPSS predictive model can be used in Streams without requiring C++ skills.
This tutorial is written for Streams component developers and application programmers who have Streams programming language skills and C++ skills. Use the tutorial as a reference, or examine and execute the samples in it to see the techniques described in action. To execute the samples, you should have a general familiarity with using a UNIX® command-line shell and a working knowledge of Streams programming.
To run the examples, you need a Red Hat Enterprise Linux® system with InfoSphere Streams V2.0 or later and IBM SPSS Modeler Solution Publisher 14.2 fixpack 1, plus the Solution Publisher hot fix, which is scheduled to be available on 14 Oct 2011.