Installing and configuring InfoSphere Streams on a virtual machine

Red Hat Enterprise Linux on VMware

Edward J Pring, Senior Software Engineer, IBM
Edward Pring is a Senior Programmer at the IBM T.J. Watson Research Center. He has contributed to a wide range of IBM products and technologies, including operating systems, publishing applications and terminal emulators for mainframes, virus protection for personal computers, network automation for the Digital Immune System, and visualization and performance analysis for Web Services. He is currently developing streaming applications for financial services. His patent portfolio spans all of these fields. He holds an M.S. degree in computer science from New York University and a B.S. degree in mathematics from Stanford University.

Summary:  IBM® InfoSphere™ Streams is designed for large streaming applications that may span many Linux servers. When developing applications for InfoSphere Streams, or if you are just evaluating the product, you may find it more convenient to install it onto a virtual machine. Installing onto a virtual machine enables you to design and test streaming applications from your regular laptop or workstation computer. This tutorial provides a step-by-step procedure for installing and configuring InfoSphere Streams V1.2 with Red Hat Enterprise Linux and Eclipse on a VMware virtual machine.

Date:  08 Apr 2010
Level:  Intermediate

Introduction

Supported version

This tutorial applies to InfoSphere Streams V1.2, not to later versions of the product.

IBM InfoSphere Streams provides a highly scalable platform for analyzing structured and unstructured data while it is in motion. InfoSphere Streams provides an intuitive and extensible development environment for creating, compiling, and deploying streaming applications.

Streaming applications are composed of streams (reliable, ordered, one-way message flows), operators (configurable functions that filter, aggregate, enrich, or transform the messages in streams) and adapters (specialized operators that continuously ingest data and output analysis results).

InfoSphere Streams provides a rich set of general-purpose operators, plus containers for reusing existing C/C++ and Java® code as streaming operators. InfoSphere Streams can also be extended with toolkits of domain-specific operators.

Streaming applications are declared as a data flow graph with the Stream Processing Language. The flow graph specifies the data types the application's streams will carry, which adapters and operators will process the data as it flows through the application, and how the operators will be interconnected by streams. Figure 1 illustrates the data flow graph for a streaming application.


Figure 1. Streaming application flow graph
Graph depicting the flow of streams through the adapters and operators of an InfoSphere Streams application.
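
The flow graph idea described above can be sketched in a few lines of ordinary code. The example below is only a conceptual illustration written in Python; it does not use the Stream Processing Language or any InfoSphere Streams API. The trade records, the function names (source, filter_op, running_average, sink, run_flow), and the "IBM" filter condition are all hypothetical and were chosen just to show a source adapter, two operators, and a sink adapter connected by ordered, one-way streams.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical message type carried by the streams: (symbol, price).
Trade = Tuple[str, float]


def source(records: Iterable[Trade]) -> Iterator[Trade]:
    """Source adapter: ingests records and emits them as an ordered stream."""
    for record in records:
        yield record


def filter_op(stream: Iterator[Trade],
              predicate: Callable[[Trade], bool]) -> Iterator[Trade]:
    """Filter operator: passes along only the messages that match."""
    for item in stream:
        if predicate(item):
            yield item


def running_average(stream: Iterator[Trade]) -> Iterator[Trade]:
    """Aggregate operator: emits the running mean price per symbol."""
    totals: Dict[str, Tuple[int, float]] = {}
    for symbol, price in stream:
        count, total = totals.get(symbol, (0, 0.0))
        count, total = count + 1, total + price
        totals[symbol] = (count, total)
        yield symbol, total / count


def sink(stream: Iterator[Trade]) -> List[Trade]:
    """Sink adapter: collects the analysis results."""
    return list(stream)


def run_flow(records: Iterable[Trade]) -> List[Trade]:
    """Wires the graph: source -> filter -> aggregate -> sink."""
    trades = source(records)
    ibm_only = filter_op(trades, lambda trade: trade[0] == "IBM")
    averages = running_average(ibm_only)
    return sink(averages)
```

Each function consumes one stream and produces another, so the wiring in run_flow plays the role of the edges in the flow graph. In a real streaming application the operators would run continuously and could be distributed across many machines, which this single-process sketch does not attempt to show.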

Large streaming applications can span more than a hundred Linux server machines. When developing applications for InfoSphere Streams, you may find it more convenient to install it onto a virtual machine. Installing onto a virtual machine enables you to design and test streaming applications from your regular laptop or workstation computer.

This tutorial guides you through a step-by-step procedure for creating a self-contained InfoSphere Streams development environment on a virtual machine. To accomplish this, you install and configure these four software products:

  • VMware
  • Red Hat Enterprise Linux
  • IBM InfoSphere Streams
  • Eclipse

This tutorial outlines the specific installation steps you need to take with each product and suggests specific values for many configuration steps. However, you should refer to the official documentation for each product for details, options, and clarification. Refer to the Resources section of this tutorial for links to the products' documentation.

Following are the main tasks covered by the tutorial:

  • Obtain product distribution packages
  • Install VMware
  • Install and configure Red Hat Enterprise Linux
  • Install IBM InfoSphere Streams
  • Install Eclipse and InfoSphere Streams Studio
  • Verify the installation

Many of the steps depend on previous steps, so you should execute all the steps in the order in which they are presented.
