IBM Spectrum Symphony and R

R is an open source software environment for statistical computing and graphics. You can use IBM® Spectrum Symphony or IBM Spectrum Symphony Developer Edition directly from within R by submitting workload to IBM Spectrum Symphony. IBM Spectrum Symphony provides a set of API function calls for R, so that you can send R functions, along with their input, as workload to IBM Spectrum Symphony. IBM Spectrum Symphony supports running your R scripts on both the local side and the IBM Spectrum Symphony service side.

For more information on R programming, refer to http://www.r-project.org.

To use R scripts with IBM Spectrum Symphony, ensure that R version 2.14.2 or 3.2.2 for 64-bit Linux®, or R version 3.4.3 for 64-bit Linux or Linux on POWER®, is installed and configured on your client host and compute hosts. You can then source the IBM Spectrum Symphony environment, register the application profile (one is provided with IBM Spectrum Symphony Developer Edition in $SOAM_HOME/$SOAM_VERSION/Samples/R/Samples), and finally use Rscript to run the non_blocking_example.R sample.

Limitations

Note the following limitations to using R with IBM Spectrum Symphony:
  • MapReduce workload is not supported with R.
  • The R function set is not thread-safe; ensure that the R APIs in the client script are called from a single thread.
  • On the client side, the C++ RClient agent process blocks signals. This prevents it from exiting abnormally when it receives unexpected signals (such as SIGTERM) from the client script.
  • The R input function, input parameter list, and output object are serialized to raw vectors internally. The maximum vector size is limited to 2^31 - 1 bytes.
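
Because the function, its parameters, and its output each travel as a serialized raw vector, it can help to estimate payload sizes before submitting workload. The following is a minimal sketch that uses base R's serialize() for the estimate. The helper name fits_raw_vector_limit is hypothetical and not part of the IBM Spectrum Symphony API, and it is an assumption that base R serialization approximates what the integration does internally.

```r
# 2^31 - 1, the largest R integer and the documented raw vector size limit.
max_raw_bytes <- .Machine$integer.max

# Hypothetical helper (not part of the Symphony API): returns TRUE if the
# object, serialized to a raw vector, fits within the size limit.
fits_raw_vector_limit <- function(obj) {
  # With connection = NULL, serialize() returns the payload as a raw vector.
  payload <- serialize(obj, connection = NULL)
  length(payload) <= max_raw_bytes
}

# Example input parameter list; the contents are illustrative only.
params <- list(x = rnorm(1000), n = 10L)
fits_raw_vector_limit(params)
```

Note that this check builds a full serialized copy of the object in memory, so for very large inputs it may be more practical to estimate the size from the dimensions of the data instead.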

IBM Spectrum Symphony environment variables for R integration

Configuring IBM Spectrum Symphony to use R requires setting environment variables on both the client side and the service side:
  • To configure R with IBM Spectrum Symphony on the client side, set these IBM Spectrum Symphony client environment variables:
    • SYMPHONY_R_LOG_STDIO
    • SYMPHONY_R_LOG_LEVEL

    For example, if you set the SYMPHONY_R_LOG_LEVEL environment variable to INFO, client-side logging is set to the informational level.

  • To configure R with IBM Spectrum Symphony on the service side, set the SYMPHONY_R_LOG_LEVEL environment variable in the Service > OsTypes > osType > env > name section of the application profile.

    For example, if you specify <env name="SYMPHONY_R_LOG_LEVEL">INFO</env> in the application profile, service-side logging is set to the informational level.
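
For the client side, one possible way to set the log level is from within the R script itself, using base R's Sys.setenv(). This is only a sketch: it assumes that the variable is read when the API is initialized, so the call is placed before sym.initialize(). SYMPHONY_R_LOG_STDIO is left unset here because its accepted values are not described in this section.

```r
# Set the client-side log level for this R process and for any child
# processes that it starts afterwards.
# Assumption: this must run before sym.initialize() is called.
Sys.setenv(SYMPHONY_R_LOG_LEVEL = "INFO")

# Confirm the value that the R process now sees.
log_level <- Sys.getenv("SYMPHONY_R_LOG_LEVEL")
print(log_level)
```

Exporting the variable in the shell before you start Rscript is an alternative that does not depend on the order of calls inside the script.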

IBM Spectrum Symphony R integration APIs

IBM Spectrum Symphony provides the following R API function calls so that you can submit both blocking and non-blocking workload to IBM Spectrum Symphony:
  • sym.initialize(connInfo = NULL);
  • sym.uninitialize();
  • sym.createJob(jobName, jobInfo = NULL, libraries = NULL);
  • sym.applyNonBlocking(jobName, func, params);
  • sym.getResultsNonBlocking(jobName);
  • sym.applyBlocking(jobName, func, params);
  • sym.closeJob(jobName);
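
The calls above suggest a lifecycle of initialize, create job, apply, close job, and uninitialize. The following sketch shows one way a blocking submission might be arranged. Several details are assumptions rather than documented behavior: that params is a list whose elements are the inputs to func, that sym.applyBlocking returns the results, and that the API functions are already loaded into the session (how they are loaded is not covered here). The job name "squares" and the function square are illustrative only. The guard lets the script run even where the API is not available.

```r
# Function to send as workload; it takes one input and returns one value.
square <- function(x) x^2

# Assumption: each list element is the input for one invocation of the function.
params <- list(1, 2, 3)

# Run the function locally first to check it before submitting it.
local_results <- lapply(params, square)

# Call the Symphony API only if it has been loaded into this session.
if (exists("sym.initialize")) {
  sym.initialize()                  # connInfo defaults to NULL
  sym.createJob("squares")          # jobInfo and libraries default to NULL
  # Assumption: the blocking call returns the results when the work completes.
  results <- sym.applyBlocking("squares", square, params)
  sym.closeJob("squares")
  sym.uninitialize()
}
```

For non-blocking workload, the sym.applyBlocking call would presumably be replaced by sym.applyNonBlocking followed later by sym.getResultsNonBlocking with the same job name. Because the R function set is not thread-safe, all of these calls should stay on a single thread.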