Working with R Program Blocks

The keyword R on the BEGIN PROGRAM command identifies a block of R programming statements, which are processed by R.

The basic specification is BEGIN PROGRAM R followed by one or more R statements, followed by END PROGRAM.

Example

DATA LIST FREE /var1.
BEGIN DATA
1
END DATA.
DATASET NAME File1.
BEGIN PROGRAM R.
File1N <- spssdata.GetCaseCount()
END PROGRAM.
DATA LIST FREE /var1.
BEGIN DATA
1
2
END DATA.
DATASET NAME File2.
BEGIN PROGRAM R.
File2N <- spssdata.GetCaseCount()
{if (File2N > File1N)
  message <- "File2 has more cases than File1."
else if (File1N > File2N)
  message <- "File1 has more cases than File2."
else
  message <- "Both files have the same number of cases."
}
cat(message)
END PROGRAM.
  • The first program block defines a programmatic variable, File1N, with a value set to the number of cases in the active dataset.
  • The first program block is followed by command syntax that creates and names a new active dataset. Although you cannot execute IBM® SPSS® Statistics command syntax from within an R program block, you can have multiple R program blocks separated by command syntax that performs any necessary actions. Values of R variables assigned in a given program block are available in subsequent program blocks.
  • The second program block defines a programmatic variable, File2N, with a value set to the number of cases in the IBM SPSS Statistics dataset named File2. The value of File1N persists from the first program block, so the two case counts can be compared in the second program block.
  • The R function cat is used to display the value of the R variable message. Output written to R's standard output--for instance, with the cat or print function--is directed to a log item in the IBM SPSS Statistics Viewer.

Note: To minimize R memory usage, you may want to delete large objects such as IBM SPSS Statistics datasets at the end of your R program block--for example, rm(data).
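
The cleanup described in the note can be sketched in plain R as follows. This is a minimal illustration, not a prescribed pattern: the object name casedata and the small data frame are hypothetical stand-ins for a large object read from the active dataset. The idea is to keep only the small values you need in later program blocks and to remove the large object before the block ends.

```r
# Hypothetical stand-in for a large object; inside IBM SPSS Statistics this
# would typically hold case data read from the active dataset.
casedata <- data.frame(var1 = c(1, 2))

# Keep only the small summary value needed by later program blocks.
n_cases <- nrow(casedata)

# Release the large object before the program block ends.
rm(casedata)

# Confirm that the object is gone while the summary value remains.
still_there <- exists("casedata", inherits = FALSE)
cat("Cases:", n_cases, " large object still present:", still_there, "\n")
```

Because R variables persist across program blocks, n_cases remains available afterward even though casedata has been removed.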

Displaying Output from R

In IBM SPSS Statistics version 18 and higher, console output and graphics from R are redirected to the IBM SPSS Statistics Viewer by default. This includes implicit output that R functions would generate when run from an R console--for example, the model coefficients and various statistics displayed by the glm function, or the mean value displayed by the mean function. You can toggle the display of output from R with the spsspkg.SetOutput function.
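
As a rough sketch of toggling output, the following R statements turn display off around some intermediate work and back on before reporting a result. It assumes that spsspkg.SetOutput accepts the quoted strings "ON" and "OFF"; check the documentation for the R Integration Package before relying on this. The exists() guard is an addition for illustration so that the statements are harmless when run outside IBM SPSS Statistics, where the function is not defined.

```r
# spsspkg.SetOutput is only defined inside IBM SPSS Statistics; the guard
# below is an illustrative addition, not part of the package.
in_spss <- exists("spsspkg.SetOutput")

# Assumed argument values: "OFF" suppresses R output, "ON" restores it.
if (in_spss) spsspkg.SetOutput("OFF")

x <- c(2, 4, 6)
m <- mean(x)   # assigning the result produces no implicit output

if (in_spss) spsspkg.SetOutput("ON")

# Explicit output written with cat goes to R's standard output.
cat("Mean of x:", m, "\n")
```

Within IBM SPSS Statistics, these statements would be placed between BEGIN PROGRAM R. and END PROGRAM.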

Accessing R Help Within IBM SPSS Statistics

You can access help for R functions from within IBM SPSS Statistics by including a call to the R help function in a BEGIN PROGRAM R-END PROGRAM block and running the block. For example, the following block displays help for the R paste function:

BEGIN PROGRAM R.
help(paste)
END PROGRAM.

You can access R's main HTML help page with:

BEGIN PROGRAM R.
help.start()
END PROGRAM.

Debugging

For IBM SPSS Statistics version 18 and higher, you can use the R browser, debug, and undebug functions within BEGIN PROGRAM R-END PROGRAM blocks, as well as from within implementation code for extension commands implemented in R. This allows you to use some of the same debugging tools available in an R console. Briefly, the browser function interrupts execution and displays a console window that allows you to inspect objects in the associated environment, such as variable values and expressions. The debug function is used to flag a specific R function--for instance, an R function that implements an extension command--for debugging. When the function is called, a console window is displayed and you can step through the function one statement at a time, inspecting variable values and expressions.

  • Results displayed in a console window associated with use of the browser or debug function are displayed in the IBM SPSS Statistics Viewer after the completion of the program block or extension command containing the function call.

    Note: When a call to a function that generates explicit output--such as the R print function--precedes a call to browser or debug, the resulting output is displayed in the IBM SPSS Statistics Viewer after the completion of the program block or extension command containing the function call. You can cause such output to be displayed in the R console window associated with browser or debug by ensuring that the call to browser or debug precedes the function that generates the output and then stepping through the call to the output function.

  • Use of the debug and browser functions is not supported in distributed mode.
  • On Windows, you might need to set the system locale to match the SPSS Statistics output language to properly display extended characters in an R console window, even in Unicode mode.

For more information on the use of the debug and browser functions, see the R help for those functions.
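
The flagging workflow described above can be sketched in plain R. The function case_ratio is a hypothetical helper standing in for implementation code. The sketch only sets and clears the debugging flag; it deliberately does not call the function while it is flagged, because doing so would open the debugging console.

```r
# Hypothetical helper standing in for extension-command implementation code.
case_ratio <- function(n1, n2) {
  ratio <- n1 / n2
  round(ratio, 2)
}

# Flag the function for debugging.
debug(case_ratio)
flagged <- isdebugged(case_ratio)

# Calling case_ratio(1, 2) at this point would open the debugging console
# and let you step through the body one statement at a time.

# Remove the flag so the function runs normally again.
undebug(case_ratio)
result <- case_ratio(1, 2)
```

Alternatively, placing a call to browser() inside the function body interrupts execution at that point each time the function runs, without flagging the function as a whole.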

R Functions that Read from stdin

Some R functions take input data from an external file or from the standard input connection stdin. For example, by default, the scan function reads from stdin but can also read from an external file specified by the file argument. Because R is embedded within IBM SPSS Statistics, reading data from stdin is not supported for R functions within BEGIN PROGRAM R-END PROGRAM blocks. For such functions, you will need to read data from an external file. For example:

BEGIN PROGRAM R.
data <- scan(file="/Rdata.txt")
END PROGRAM.
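
For a self-contained variant of the same idea, the following R statements write a few values to a temporary file and read them back with scan. The file contents and the use of tempfile are assumptions made only so that the sketch runs without depending on an existing file such as /Rdata.txt.

```r
# Create a temporary text file to stand in for an external data file.
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3", "4 5"), path)

# Read whitespace-separated numeric values from the file rather than stdin.
values <- scan(file = path, quiet = TRUE)

# Remove the temporary file once the data has been read.
unlink(path)

print(values)
```

By default, scan reads numeric values and treats whitespace, including line breaks, as the separator, so the values from both lines are returned in a single vector.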

Versions

Multiple versions of the IBM SPSS Statistics - Integration Plug-in for R can be used on the same machine, each associated with a major version of IBM SPSS Statistics, such as version 23. BEGIN PROGRAM R-END PROGRAM blocks automatically load the correct version of the R Integration Package for IBM SPSS Statistics, so there is no need to use the R library command to load the package.

Syntax Rules

  • Within a program block, only statements recognized by the specified programming language are allowed.
  • Within a program block, each line should not exceed 251 bytes.
  • With the IBM SPSS Statistics Batch Facility (available only with IBM SPSS Statistics Server), use the -i switch when submitting command files that contain program blocks. All command syntax (not just the program blocks) in the file must adhere to interactive syntax rules.
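
One way to check the line-length rule before running a program block is to measure each line in bytes. The following is a minimal sketch in plain R; find_long_lines is a hypothetical helper and not part of any IBM SPSS Statistics package. The limit is counted in bytes rather than characters, which matters for text containing multibyte characters.

```r
# Hypothetical helper: return the positions of lines longer than the limit,
# measured in bytes rather than characters.
find_long_lines <- function(lines, limit = 251) {
  which(nchar(lines, type = "bytes") > limit)
}

# Example block of R statements in which the second line is 252 bytes long.
block <- c("x <- 1", strrep("a", 252), "cat(x)")

too_long <- find_long_lines(block)
print(too_long)
```

To check a file of R statements, the lines could be read with readLines and passed to the helper in the same way.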

Operations

  • Within a BEGIN PROGRAM R block, the R functions quit() and q() will terminate the IBM SPSS Statistics session.
  • R variables that are specified in a given program block persist to subsequent program blocks.

Limitations

  • Programmatic variables created in a program block cannot be used outside of program blocks.
  • Program blocks cannot be contained within DEFINE-!ENDDEFINE macro definitions.
  • Program blocks can be contained in command syntax files run via the INSERT command, with the default SYNTAX=INTERACTIVE setting.
  • Program blocks cannot be contained within command syntax files run via the INCLUDE command.