Integrate MATLAB code into InfoSphere Streams

Compile MATLAB functions into C++ libraries and import into SPL for execution in Streams applications

MATLAB is a scientific computing language and platform whose strong support for matrix manipulation and large collection of mathematical modeling libraries make it a popular implementation choice for various analytic assets. This article describes how MATLAB functions can be integrated into SPL in order to execute MATLAB code within IBM® InfoSphere® Streams applications. The integration does not require any changes to the MATLAB code. It relies on the MATLAB support for compiling MATLAB code into C++ shared libraries, and the SPL support for interfacing with native functions.


Bugra Gedik, Dr. (bgedik@us.ibm.com), Research Staff Member, IBM China

Dr. Bugra Gedik is currently a faculty member in the Computer Engineering Department, Bilkent University in Ankara, Turkey, and a consultant for Korvus Bilisim R&D in Ankara, Turkey. Prior to that, he worked as a research staff member at the IBM Thomas J. Watson Research Center. His research interests are in distributed data-intensive systems with a particular focus on stream computing. In the past, he served as chief architect for InfoSphere Streams. He is the co-inventor of the SPL and the SPADE stream processing languages. He was named an IBM Master Inventor and is the recipient of an IBM Corporate Award for his pioneering work in the System S project.



09 June 2011


Overview

This article describes how to execute MATLAB functions from within InfoSphere Streams applications. MATLAB is a scientific computing language and platform. Its strong support for matrix manipulation and its large collection of mathematical modeling libraries make MATLAB a popular implementation choice for various analytic assets. Re-implementing these analytic assets in other languages is often cumbersome, because it requires finding external libraries that provide the functionality already built into MATLAB. This creates a strong motivation to integrate functions written in MATLAB into InfoSphere Streams applications without manually rewriting the function logic in C++ or Java™, which are the first-class languages supported by Streams.

MATLAB supports compiling functions written in the MATLAB programming language (in .m files) into C++ shared libraries. It also provides APIs for writing C++ code that interacts with the generated library, such as utility classes that provide matrix data types. In addition, MATLAB provides a runtime library that applications using the compiled MATLAB routines must link against. The overall approach of integrating MATLAB functions into Streams is as follows:

  1. Develop your MATLAB function(s).
  2. Use the MATLAB compiler to create a C++ shared library.
  3. Create a native function in SPL that wraps the C++ shared library.
  4. Write an SPL application that uses the native functions.

These steps are illustrated with the following sample.

Develop your MATLAB function

You will write a simple MATLAB function that performs matrix inversion.

Create a directory called Matlab_inv and put a file named ml_inv.m in it, with the following content:

  function out = ml_inv(in)
  % ML_INV Matrix inverse
  % Invert a given matrix
  out = inv(in);

Your goal is to be able to easily use this matrix inverse function in SPL code. The following is a sample SPL function that utilizes the matrix inverse function ml::inv that wraps the ml_inv function from MATLAB. In the rest of this article, you will look at the steps needed to create the ml::inv function.

  void foo() 
  {   // Sample SPL code that uses the matrix inverse
      list<list<float64>> inM = [[1.0,3.0],[2.0,4.0]];
      mutable list<list<float64>> outM = [];
      ml::inv(outM, inM); // ml::inv is the wrapper for ml_inv
  }

Use MATLAB compiler to create a C++ shared library

You will use the MATLAB compiler to create a C++ shared library that will contain the matrix inversion function.

  1. In the MATLAB command line, type deploytool.
  2. Follow the graphical interface to create a deployment project called libMatlab_inv under the Matlab_inv directory.
  3. Click the Add File button to add the ml_inv.m file into the project.
  4. Click the Build button to create the shared library.

The following result files will be generated under the Matlab_inv/libMatlab_inv/distrib directory.

  • libMatlab_inv.h: This is the interface file that declares the C++ functions generated from the MATLAB code.
  • libMatlab_inv.so: This is the shared library that contains the C++ function implementations generated from the MATLAB code.
  • libMatlab_inv.exports: This is a text file that lists the name of the functions exported by the library. This file is not required for the operation of the library.
  • libMatlab_inv.ctf and libMatlab_inv_mcr: The former is an archive file, from which the latter directory is created by the MATLAB compiler. These contain additional support libraries. They should be located at the same place as the .so shared library.

Try out the shared library in a standalone C++ application

Before you use the shared library in a Streams application, you will try it out on a standalone C++ application.

Create a directory named sample_c++ at the same level as Matlab_inv. Inside this directory, create a file named sample.cpp and populate it as follows:

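  #include <iostream>
  #include "libMatlab_inv.h"
  int main()
  {
      // Start the MATLAB runtime and load the generated library
      libMatlab_invInitialize();
      // Column-major data: this is the matrix [1, 3; 2, 4]
      double data[] = {1.0, 2.0, 3.0, 4.0};
      mwArray in(2,2,mxDOUBLE_CLASS);
      mwArray out(2,2,mxDOUBLE_CLASS);
      in.SetData(data, 4);
      // The first argument is the number of outputs requested
      ml_inv(1, out, in);
      // mwArray indices are 1-based
      std::cerr << "[" << out(1,1) << ", " << out(1,2) << "; "
                       << out(2,1) << ", " << out(2,2) << "]\n";
      libMatlab_invTerminate();
      return 0;
  }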

In the above code, you call the libMatlab_invInitialize function to initialize the MATLAB runtime. This is one of the functions generated for you by the MATLAB compiler. You then create two matrices, in and out. The mwArray type is a C++ class provided by the MATLAB C++ APIs for working with matrices. Next, you call the ml_inv function to take the inverse of the in matrix and assign it to the out matrix. This is the core function generated for you by the MATLAB compiler, based on the MATLAB function of the same name from the ml_inv.m file. Finally, you call the libMatlab_invTerminate function to finalize the MATLAB runtime. Again, this is a function generated for you by the MATLAB compiler.
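The initialize and terminate calls described above have to stay paired, even when the code between them fails. One way to guarantee this is a small scope guard. The following is only a sketch: stubInitialize and stubTerminate are hypothetical stand-ins that count calls so the example compiles without MATLAB, and the bool return value of the initializer is an assumption that you should check against your generated libMatlab_inv.h.

```cpp
#include <stdexcept>

// Hypothetical stand-ins for libMatlab_invInitialize() and
// libMatlab_invTerminate(). They only count calls, so this sketch
// builds without MATLAB. The bool return type is an assumption.
static int g_initCalls = 0;
static int g_termCalls = 0;
bool stubInitialize() { ++g_initCalls; return true; }
void stubTerminate()  { ++g_termCalls; }

// Scope guard: initializes on construction, terminates on destruction,
// including when an exception unwinds the stack.
class RuntimeGuard {
public:
    RuntimeGuard() {
        if (!stubInitialize())
            throw std::runtime_error("library initialization failed");
    }
    ~RuntimeGuard() { stubTerminate(); }
    RuntimeGuard(const RuntimeGuard&) = delete;
    RuntimeGuard& operator=(const RuntimeGuard&) = delete;
};

// Example use: returns 0 on success and 1 if the guarded work throws.
// In both cases the guard has already called stubTerminate().
int runGuarded(bool fail) {
    try {
        RuntimeGuard guard;
        if (fail) throw std::runtime_error("simulated failure");
        return 0;
    } catch (const std::exception&) {
        return 1;
    }
}
```

In sample.cpp, the body of main would sit where the simulated work is, with the stubs replaced by the generated functions.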

To compile this program, create a Makefile as follows:

.PHONY: all clean
MATLAB_LIBRARY_LOCATION := /nfs/hny/apps01/matlab/bin/glnxa64/
MATLAB_INCLUDE_LOCATION := /nfs/hny/apps01/matlab/extern/include/
all: 
	g++ -o sample sample.cpp                           \
	    -I ../Matlab_inv/libMatlab_inv/distrib         \
	    -I $(MATLAB_INCLUDE_LOCATION)                  \
	    -L ../Matlab_inv/libMatlab_inv/distrib         \
	    -Wl,-rpath,../Matlab_inv/libMatlab_inv/distrib \
	    -L $(MATLAB_LIBRARY_LOCATION)                  \
	    -Wl,-rpath,$(MATLAB_LIBRARY_LOCATION)          \
	    -lMatlab_inv -lmwmclmcrrt 

clean:
	rm -f sample

The following are a few things that you should notice:

  • The include path -I ../Matlab_inv/libMatlab_inv/distrib is used to specify the location of the interface file (libMatlab_inv.h) generated by the MATLAB compiler.
  • The include path -I $(MATLAB_INCLUDE_LOCATION) is used to specify the location of the interface files for the MATLAB C++ APIs. This location will be specific to your environment and depends on the location of the MATLAB installation. The variable $(MATLAB_INCLUDE_LOCATION) should be defined accordingly.
  • The library path -L ../Matlab_inv/libMatlab_inv/distrib is used to specify the location of the library (libMatlab_inv.so) generated by the MATLAB compiler. An RPATH is specified using the same path, for runtime location of the library.
  • The library path -L $(MATLAB_LIBRARY_LOCATION) is used to specify the location of the MATLAB runtime libraries. This location will be specific to your environment and depends on the location of the MATLAB installation and the architecture of your system. The variable $(MATLAB_LIBRARY_LOCATION) should be defined accordingly. An RPATH is specified for the same location, so that the runtime libraries can be found when the program runs.
  • The library -lMatlab_inv is used to specify the name of the library generated by the MATLAB compiler (libMatlab_inv.so).
  • The library -lmwmclmcrrt is used to specify the name of the MATLAB runtime library (libmwmclmcrrt.so).

Type make to build the program, and ./sample to run it. It should produce the following output: [-2, 1.5000; 1, -0.5000].
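To cross-check this output without MATLAB, note that the data array {1.0, 2.0, 3.0, 4.0} is read column by column, consistent with MATLAB's column-major storage, so it represents the matrix [1, 3; 2, 4]. The following sketch computes the same inverse with the closed-form 2x2 formula. The helper names at and inverse2x2 are illustrative and are not part of any MATLAB API.

```cpp
#include <array>
#include <stdexcept>

// Element (r, c), both 0-based, of a 2x2 matrix stored column by column.
double at(const std::array<double, 4>& m, int r, int c) {
    return m[c * 2 + r];
}

// Closed-form inverse of a 2x2 column-major matrix.
std::array<double, 4> inverse2x2(const std::array<double, 4>& m) {
    const double a = at(m, 0, 0), b = at(m, 0, 1);
    const double c = at(m, 1, 0), d = at(m, 1, 1);
    const double det = a * d - b * c;
    if (det == 0.0)
        throw std::domain_error("matrix is singular");
    // The result is column-major as well: {r00, r10, r01, r11}.
    return { d / det, -c / det, -b / det, a / det };
}
```

For the input {1.0, 2.0, 3.0, 4.0}, the determinant is 1*4 - 3*2 = -2, which gives the rows [-2, 1.5] and [1, -0.5], in agreement with the output of the sample program.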

Create a native function in SPL that wraps the C++ shared library

You will create native functions in SPL to make use of the functions generated by the MATLAB compiler. This could be done in one of two ways. You could either create a toolkit that encapsulates all the native functions, or directly include native functions into an application. The former is more appropriate if there will be multiple applications that will make use of the functions. For brevity, you will use the second approach here.

  1. Create a directory named sample_spl at the same level as Matlab_inv and sample_c++. This will be your application directory.
  2. Create a sub-directory named ml. This will be your namespace directory.
  3. Under ml, create a sub-directory named native.function, which will hold the function model.
  4. Under this directory, add the function model file named function.xml, with the following content.
<functionModel
   xmlns="http://www.ibm.com/xmlns/prod/streams/spl/function"
   xmlns:cmn="http://www.ibm.com/xmlns/prod/streams/spl/common"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xsi:schemaLocation="http://www.ibm.com/xmlns/prod/streams/spl/function 
                       functionModel.xsd">
  <functionSet>
    <headerFileName>spl_ml_inv.h</headerFileName>
    <functions>
      <function>
        <description>Take inverse of a matrix</description>
        <prototype><![CDATA[ public void inv(mutable list<list<float64>> r,
                             list<list<float64>> s) ]]></prototype>
      </function>
      <function>
        <description>Initialize Matlab runtime</description>
        <prototype><![CDATA[ public void initialize() ]]></prototype>
      </function>
      <function>
        <description>Finalize Matlab runtime</description>
        <prototype><![CDATA[ public void terminate() ]]></prototype>
      </function>
    </functions>
    <dependencies>
      <library>
        <cmn:description>Matrix inverse</cmn:description>
        <cmn:managedLibrary>
          <cmn:lib>Matlab_inv</cmn:lib>
          <cmn:lib>mwmclmcrrt</cmn:lib>
          <cmn:libPath>../../impl/lib</cmn:libPath>
          <cmn:libPath>/nfs/hny/apps01/matlab/bin/glnxa64/</cmn:libPath>
          <cmn:includePath>../../impl/include</cmn:includePath>
          <cmn:includePath>/nfs/hny/apps01/matlab/extern/include/</cmn:includePath>
        </cmn:managedLibrary>
      </library>
    </dependencies>
  </functionSet>
</functionModel>

In the above function model file, there are a few things to note:

  • The file spl_ml_inv.h is the header file that contains the C++ functions that will wrap the MATLAB generated functions.
  • The three SPL native functions inv, initialize, and terminate correspond to the three functions ml_inv, libMatlab_invInitialize, and libMatlab_invTerminate generated by the MATLAB compiler. Note that the inv function uses SPL nested lists to represent matrices, as there is no matrix type in SPL. As you will see, the inv function will also perform the conversion between SPL C++ types and MATLAB C++ types.
  • The libraries Matlab_inv and mwmclmcrrt are specified as dependencies, including their library paths and include paths. Note that the include and library paths for the Matlab_inv library are specified relative to the model file, whereas the paths for the MATLAB installation are absolute and must be adjusted to your environment.

Now you will copy the relevant files into the impl sub-directory under the application directory sample_spl to make the application self-contained.

  1. Create a directory impl under the application directory sample_spl.
  2. Create two sub-directories under impl: include and lib. Copy the libMatlab_inv.h file from the Matlab_inv/libMatlab_inv/distrib directory into the include directory.
  3. Now copy the libMatlab_inv.so and libMatlab_inv.ctf files as well as the libMatlab_inv_mcr directory into the lib directory.
  4. Finally, create the wrapper spl_ml_inv.h header file under the include directory, with the following content.
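  #include "libMatlab_inv.h"
  namespace ml {
      // The functions are defined in a header, so they are marked inline
      // to avoid duplicate symbols across translation units.
      inline void initialize() {
          libMatlab_invInitialize();
      }
      inline void terminate() {
          libMatlab_invTerminate();
      }
      inline void inv(SPL::list<SPL::list<SPL::float64> > & lhs,
                      SPL::list<SPL::list<SPL::float64> > const & rhs)
      {
          // Guard against an empty input before reading rhs[0]
          if (rhs.size() == 0) {
              lhs.resize(0);
              return;
          }
          size_t nr = rhs.size(), nc = rhs[0].size();
          mwArray in(nr, nc, mxDOUBLE_CLASS);
          mwArray out(nr, nc, mxDOUBLE_CLASS);
          // Copy the SPL nested list into the mwArray (1-based indices)
          for(size_t r=0; r<nr; ++r)
              for(size_t c=0; c<nc; ++c)
                  in(r+1,c+1) = rhs[r][c];
          ml_inv(1, out, in);
          // Copy the result back into the SPL nested list
          lhs.resize(nr);
          for(size_t r=0; r<nr; ++r) {
              lhs[r].resize(nc);
              for(size_t c=0; c<nc; ++c)
                  lhs[r][c] = out(r+1,c+1);
          }
      }
  }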

In the previous code, the inv function performs transformations between the SPL list types and MATLAB mwArray types, in order to wrap the ml_inv function generated by the MATLAB compiler. Note that all the wrapper functions are placed into the namespace ml, since your native.function directory is under the namespace directory ml.
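The wrapper copies elements one at a time and assumes that every row has the same length. If you prefer to validate the input and move the data as one column-major buffer (the layout used with SetData in the standalone sample), the conversion could look like the following sketch. Here std::vector stands in for SPL::list so that the code compiles outside Streams, and the names Rows, toColumnMajor, and fromColumnMajor are hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Stand-in for SPL::list<SPL::list<SPL::float64> > (an assumption made
// so that this sketch is independent of the SPL runtime headers).
using Rows = std::vector<std::vector<double>>;

// Flatten a list of rows into a column-major buffer.
// Empty and ragged inputs are rejected.
std::vector<double> toColumnMajor(const Rows& rows) {
    if (rows.empty() || rows[0].empty())
        throw std::invalid_argument("matrix must have at least one element");
    const std::size_t nr = rows.size(), nc = rows[0].size();
    std::vector<double> flat(nr * nc);
    for (std::size_t r = 0; r < nr; ++r) {
        if (rows[r].size() != nc)
            throw std::invalid_argument("all rows must have the same length");
        for (std::size_t c = 0; c < nc; ++c)
            flat[c * nr + r] = rows[r][c];
    }
    return flat;
}

// Rebuild a list of rows from a column-major buffer of nr x nc elements.
Rows fromColumnMajor(const std::vector<double>& flat,
                     std::size_t nr, std::size_t nc) {
    if (flat.size() != nr * nc)
        throw std::invalid_argument("buffer size does not match dimensions");
    Rows rows(nr, std::vector<double>(nc));
    for (std::size_t r = 0; r < nr; ++r)
        for (std::size_t c = 0; c < nc; ++c)
            rows[r][c] = flat[c * nr + r];
    return rows;
}
```

With these helpers, the SPL value [[1.0,3.0],[2.0,4.0]] flattens to the buffer {1.0, 2.0, 3.0, 4.0}, which is the same data array used in sample.cpp.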

Write an SPL application that uses the native functions

You will write the SPL application that uses the native functions that wrap the functions generated by the MATLAB compiler.

Create a file named Main.spl under the sample_spl directory, with the following content:

  composite Main {
      graph
          stream<int8 dummy> Beat = Beacon() { 
              param iterations : 1u; 
          }
          () as Sink = Custom(Beat) {
              logic
                  onTuple Beat: {
                      ml::initialize();
                      list<list<float64>> inM = [[1.0,3.0],[2.0,4.0]];
                      mutable list<list<float64>> outM = [];
                      ml::inv(outM, inM);
                      println(outM);
                      ml::terminate();
                  }
          }
  }

This is a sample SPL application that initializes the MATLAB runtime, performs a matrix inversion via the ml::inv function, and finalizes the MATLAB runtime.

  1. Type sc -m -M Main in the sample_spl directory to create a Makefile for this application.
  2. Then type make standalone to build it.
  3. You may see some warnings printed out. Ignore the superfluous warnings (future versions of Streams may include support for suppressing these warnings), and type ./output/bin/standalone to run the application. You should see the following result: [[-2,1.5],[1,-0.5]].

Conclusion

In this article, you have seen how to integrate MATLAB functions into Streams applications without rewriting the function logic in a different language. The approach relies on using the MATLAB compiler to convert the MATLAB code into a C++ shared library, writing a C++ native function that wraps this shared library, and importing the wrapper function into SPL using a function model. Sample source code for the example is available for download with this article.


Download

Description                    Name                           Size
Sample code for this article   MatlabStreamsIntegration.zip   322KB
