Retrieving Output from Syntax Commands (R)
Functionality provided with the IBM® SPSS® Statistics - Integration Plug-in for R allows you to access output from IBM SPSS Statistics syntax commands programmatically. To retrieve command output, you first route it via the Output Management System (OMS) to an area in memory referred to as the XML workspace, where it is stored as an XPath DOM that conforms to the Output XML Schema (xml.spss.com/spss/oms). You then retrieve output from this workspace with functions that employ XPath expressions.
Constructing the correct XPath expression (IBM SPSS Statistics currently supports XPath 1.0) requires an understanding of the Output XML Schema. Documentation for the output schema is available from the Help system.
Example
In this example, we'll use output from the DESCRIPTIVES command to determine the percentage of valid cases for a specified variable.
*Route output to the XML workspace.
OMS SELECT TABLES
/IF COMMANDS=['Descriptives'] SUBTYPES=['Descriptive Statistics']
/DESTINATION FORMAT=OXML XMLWORKSPACE='desc_table'
/TAG='desc_out'.
DESCRIPTIVES VARIABLES=mpg.
OMSEND TAG='desc_out'.
*Get output from the XML workspace using XPath.
BEGIN PROGRAM R.
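handle <- "desc_table"
context <- "/outputTree"
# Build the XPath expression step by step; sep="" joins the steps without spaces.
xpath <- paste("//pivotTable[@subType='Descriptive Statistics']",
               "/dimension[@axis='row']",
               "/category[@varName='mpg']",
               "/dimension[@axis='column']",
               "/category[@text='N']",
               "/cell/@number", sep="")
# Evaluate the expression against the desc_table DOM; the result is a character vector.
res <- spssxmlworkspace.EvaluateXPath(handle,context,xpath)
ncases <- spssdata.GetCaseCount()
cat("Percentage of valid cases for variable mpg: ",
    round(100*as.integer(res)/ncases),"%")
# Remove the output item from the XML workspace once it is no longer needed.
spssxmlworkspace.DeleteXmlWorkspaceObject(handle)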
END PROGRAM.
- The OMS command is used to direct output from a syntax command to the XML workspace. The XMLWORKSPACE keyword on the DESTINATION subcommand, along with FORMAT=OXML, specifies the XML workspace as the output destination. It is good practice to use the TAG subcommand, as done here, so as not to interfere with any other OMS requests that may be operating. The identifiers used for the COMMANDS and SUBTYPES keywords on the IF subcommand can be found in the OMS Identifiers dialog box, available from the Utilities menu in IBM SPSS Statistics.
- The XMLWORKSPACE keyword is used to associate a name with this XPath DOM in the workspace. In the current example, output from the DESCRIPTIVES command will be identified with the name desc_table. You can have many XPath DOMs in the XML workspace, each with its own unique name.
- The OMSEND command terminates active OMS commands (here, the request tagged desc_out), causing the output to be written to the specified destination, in this case the XML workspace.
- You retrieve values from the XML workspace with the spssxmlworkspace.EvaluateXPath function. The function takes an explicit XPath expression, evaluates it against a specified XPath DOM in the XML workspace, and returns the result as a vector of character strings.
- The first argument to the EvaluateXPath function specifies the XPath DOM to which an XPath expression will be applied. This argument is referred to as the handle name for the XPath DOM and is simply the name given on the XMLWORKSPACE keyword on the associated OMS command. In this case the handle name is desc_table.
- The second argument to EvaluateXPath defines the XPath context for the expression and should be set to "/outputTree" for items routed to the XML workspace by the OMS command.
- The third argument to EvaluateXPath specifies the remainder of the XPath expression (the context is the first part) and must be quoted. Since XPath expressions almost always contain quoted strings, you'll need to use a different quote type from that used to enclose the expression. For users familiar with XSLT for OXML and accustomed to including a namespace prefix, note that XPath expressions for the EvaluateXPath function should not contain the oms: namespace prefix.
- The XPath expression in this example is specified by the variable xpath. It is not the minimal expression needed to select the value of interest but is used for illustration purposes and serves to highlight the structure of the XML output.
  - //pivotTable[@subType='Descriptive Statistics'] selects the Descriptive Statistics table.
  - /dimension[@axis='row']/category[@varName='mpg'] selects the row for the variable mpg.
  - /dimension[@axis='column']/category[@text='N'] selects the column labeled N (the number of valid cases), thus specifying a single cell in the pivot table.
  - /cell/@number selects the numeric value of the cell contents. (The text attribute of the same cell holds the textual representation.)
- When you have finished with a particular output item, it is a good idea to delete it from the XML workspace. This is done with the DeleteXmlWorkspaceObject function, whose single argument is the name of the handle associated with the item.
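The example converts the retrieved value directly, which works when the XPath expression matches exactly one cell. The following is a minimal sketch of a more defensive conversion, not part of the plug-in: percent_valid is a hypothetical helper name, and the sketch assumes that an expression matching nothing yields a zero-length character vector.

```r
# Hypothetical helper: turn the character vector returned by
# spssxmlworkspace.EvaluateXPath into a percentage of valid cases.
percent_valid <- function(res, ncases) {
  # Assumption: no match gives a zero-length vector; more than one
  # match means the XPath expression was not specific enough.
  if (length(res) != 1) {
    stop("Expected exactly one matching cell, got ", length(res))
  }
  # Non-numeric text becomes NA (with a warning) and is rejected below.
  nvalid <- as.numeric(res)
  if (is.na(nvalid) || ncases <= 0) {
    stop("Cannot compute a percentage from the retrieved values")
  }
  round(100 * nvalid / ncases)
}

# Inside BEGIN PROGRAM R ... END PROGRAM, usage might look like:
# res    <- spssxmlworkspace.EvaluateXPath(handle, context, xpath)
# ncases <- spssdata.GetCaseCount()
# cat("Percentage of valid cases for variable mpg:",
#     percent_valid(res, ncases), "%\n")
```

Stopping with an error rather than printing NA makes a mistyped variable name or subtype visible immediately.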
If you're familiar with XPath, you might want to convince yourself that the number of valid cases for mpg can also be selected with the following simpler XPath expression:
//category[@varName='mpg']//category[@text='N']/cell/@text
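If you need this shorter expression for several variables or statistics, it can be assembled from its parts. The sketch below is illustrative only: stat_xpath is a hypothetical helper, not a plug-in function, and it assumes the variable name and statistic label contain no single quotes.

```r
# Hypothetical helper: build the short XPath expression for one
# statistic of one variable. Assumes neither argument contains a
# single quote, which would break the quoted XPath string.
stat_xpath <- function(var_name, stat_label, attr = "text") {
  paste("//category[@varName='", var_name, "']",
        "//category[@text='", stat_label, "']",
        "/cell/@", attr, sep = "")
}

xpath <- stat_xpath("mpg", "N")
# xpath now holds the expression shown above:
# //category[@varName='mpg']//category[@text='N']/cell/@text
```

The resulting string would be passed as the third argument to spssxmlworkspace.EvaluateXPath, with "/outputTree" as the context.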
Note: To the extent possible, construct your XPath expressions using language-independent attributes, such as the variable name rather than the variable label. That will help reduce the translation effort if you need to deploy your code in multiple languages. Also consider factoring out language-dependent identifiers, such as the name of a statistic, into constants. You can obtain the current language used for pivot table output with the spsspkg.GetOutputLanguage function.
You may also consider using text_eng attributes in place of text attributes in XPath expressions. text_eng attributes are English versions of text attributes and have the same value regardless of the output language. The OATTRS subcommand of the SET command specifies whether text_eng attributes are included in OXML output.
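One way to apply both suggestions is to keep the statistic label in a single constant and to choose the attribute it is matched against in one place. This is a sketch under stated assumptions, not a prescribed pattern: STAT_N and label_predicate are hypothetical names, and matching on text_eng only works if those attributes are actually present in the OXML output.

```r
# Language-dependent label kept in one place (hypothetical constant).
STAT_N <- "N"

# Hypothetical helper: build the predicate that matches a category label.
# use_eng = TRUE assumes text_eng attributes are included in the OXML,
# which depends on the OATTRS setting.
label_predicate <- function(label, use_eng = FALSE) {
  attr_name <- if (use_eng) "text_eng" else "text"
  paste("[@", attr_name, "='", label, "']", sep = "")
}

xpath <- paste("//category[@varName='mpg']",
               "//category", label_predicate(STAT_N, use_eng = TRUE),
               "/cell/@text", sep = "")
# xpath: //category[@varName='mpg']//category[@text_eng='N']/cell/@text
```

With this arrangement, switching between text and text_eng, or changing a label, touches one line instead of every XPath expression in the program.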