spss.StartProcedure Function (Python)

spss.StartProcedure(procedureName,omsIdentifier). Signals the beginning of pivot table or text block output. Pivot table and text block output is typically associated with procedures. Procedures are user-defined Python functions or custom Python classes that can read the data, perform computations, add new variables and/or new cases to the active dataset, create new datasets, and produce pivot table output and text blocks in the IBM® SPSS® Statistics Viewer. Procedures have almost the same capabilities as built-in IBM SPSS Statistics procedures, such as DESCRIPTIVES and REGRESSION, but they are written in Python by users. You read the data and create new variables and/or new cases in the active dataset using the Cursor class, or create new datasets with the Dataset class. Pivot tables are created using the BasePivotTable class. Text blocks are created using the TextBlock class.

  • The argument procedureName is a string and is the name that appears in the outline pane of the Viewer associated with the output. If the optional argument omsIdentifier is omitted, then procedureName is also the command name associated with this output when routing it with OMS (Output Management System), as used in the COMMANDS keyword of the OMS command.
  • The optional argument omsIdentifier is a string and is the command name associated with this output when routing it with OMS (Output Management System), as used in the COMMANDS keyword of the OMS command. If omsIdentifier is omitted, then the value of the procedureName argument is used as the OMS identifier. omsIdentifier is only necessary when creating procedures with localized output so that the procedure name can be localized but not the OMS identifier. See the topic Localizing Output from Python Programs for more information.
  • To keep the names associated with output from conflicting with the names of existing IBM SPSS Statistics commands (when working with OMS), it is recommended that they have the form yourcompanyname.com.procedurename.
  • Within a StartProcedure-EndProcedure block you cannot use the spss.Submit function. You cannot nest StartProcedure-EndProcedure blocks.
  • Within a StartProcedure-EndProcedure block, you can create a single cursor instance.
  • Instances of the Dataset class created within StartProcedure-EndProcedure blocks cannot be set as the active dataset.
  • Output from StartProcedure-EndProcedure blocks does not support operations involving data in different split groups. When working with splits, each split should be treated as a separate set of data. To cause results from different split groups to display properly in custom pivot tables, use the SplitChange function. Use the IsEndSplit method from the Cursor class to determine a split change.
  • spss.StartProcedure must be followed by spss.EndProcedure.

    Note: You can use the spss.Procedure class to implicitly start and end a procedure without the need to call StartProcedure and EndProcedure. See the topic spss.Procedure Class (Python) for more information.
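
Because spss.StartProcedure must be followed by spss.EndProcedure, it can help to guarantee the pairing even when the code that builds the output raises an exception. The following is a minimal sketch of one way to do that with try/finally. The helper name procedure_block is hypothetical and is not part of the spss module, and the spss module is passed in as an argument only so that the sketch can be exercised outside IBM SPSS Statistics with a stand-in object.

```python
from contextlib import contextmanager


@contextmanager
def procedure_block(spss_module, procedure_name, oms_identifier=None):
    """Hypothetical helper that pairs StartProcedure with EndProcedure.

    spss_module is expected to be the imported spss module (assumption:
    it is passed in rather than imported here, so the sketch stays testable).
    """
    if oms_identifier is None:
        # procedureName then doubles as the OMS identifier
        spss_module.StartProcedure(procedure_name)
    else:
        spss_module.StartProcedure(procedure_name, oms_identifier)
    try:
        # Pivot tables and text blocks are created inside the with-block
        yield
    finally:
        # Runs whether or not the with-block raised an exception
        spss_module.EndProcedure()
```

Inside a Statistics session, the assumed usage would be "with procedure_block(spss, "mycompany.com.groupMeans"):" followed by the code that creates the pivot table. The spss.Procedure class mentioned in the note above serves a similar purpose.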

Example

As an example, we will create a procedure that calculates group means for a selected variable using a specified categorical variable to define the groups. The output of the procedure is a pivot table displaying the group means. For an alternative approach to creating the same procedure, but with a custom class, see the example for the spss.BaseProcedure class.

import spss

def groupMeans(groupVar,sumVar):

    #Determine variable indexes from variable names
    varCount = spss.GetVariableCount()
    groupIndex = None
    sumIndex = None
    for i in range(varCount):
        varName = spss.GetVariableName(i)
        if varName == groupVar:
            groupIndex = i
        elif varName == sumVar:
            sumIndex = i

    #Stop if either variable is not in the active dataset
    if groupIndex is None or sumIndex is None:
        raise ValueError("Variable not found in the active dataset")

    varIndex = [groupIndex,sumIndex]
    cur = spss.Cursor(varIndex)
    Counts = {}
    Statistic = {}

    #Accumulate the case count and the sum of sumVar for each group
    for i in range(cur.GetCaseCount()):
        row = cur.fetchone()
        cat=int(row[0])
        Counts[cat]=Counts.get(cat,0) + 1
        Statistic[cat]=Statistic.get(cat,0) + row[1]

    cur.close()
    
    #Call StartProcedure 
    spss.StartProcedure("mycompany.com.groupMeans")

    #Create a pivot table
    table = spss.BasePivotTable("Group Means","OMS table subtype")
    table.Append(spss.Dimension.Place.row,
                 spss.GetVariableLabel(groupIndex))
    table.Append(spss.Dimension.Place.column,
                 spss.GetVariableLabel(sumIndex))

    category2 = spss.CellText.String("Mean")
    for cat in sorted(Counts):
        category1 = spss.CellText.Number(cat)
        table[(category1,category2)] = \
        spss.CellText.Number(Statistic[cat]/Counts[cat])

    #Call EndProcedure
    spss.EndProcedure()
  • groupMeans is a Python user-defined function containing the procedure that calculates the group means.
  • The arguments required by the procedure are the names of the grouping variable (groupVar) and the variable for which group means are desired (sumVar).
  • The name associated with output from this procedure is mycompany.com.groupMeans. The output consists of a pivot table populated with the group means.
  • spss.EndProcedure marks the end of output creation.
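
The accumulation step in groupMeans assumes that every case has a valid value for both variables. The following is a sketch of the same count-and-sum logic as a plain Python function that skips incomplete cases, on the assumption that the cursor returns missing values as None. The function name group_means and its input format (an iterable of (group, value) pairs, like the rows fetched from the cursor) are hypothetical and chosen for illustration.

```python
def group_means(rows):
    """Return {group: mean} for an iterable of (group, value) pairs."""
    counts = {}
    totals = {}
    for group, value in rows:
        # Assumption: missing values arrive as None; skip such cases
        if group is None or value is None:
            continue
        cat = int(group)
        counts[cat] = counts.get(cat, 0) + 1
        totals[cat] = totals.get(cat, 0.0) + value
    # Build the result in sorted group order, as the pivot table loop does
    return {cat: totals[cat] / counts[cat] for cat in sorted(counts)}


sample = [(12.0, 30000.0), (12.0, 34000.0), (16.0, 50000.0), (16.0, None)]
print(group_means(sample))  # {12: 32000.0, 16: 50000.0}
```

Keeping the computation separate from the cursor and pivot table calls also makes it possible to check the arithmetic without a running Statistics session.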

Saving and Running Procedures

To use a procedure you have written, you save it in a Python module on the Python search path so that you can call it. A Python module is simply a text file containing Python definitions and statements. You can create a module with a Python IDE, or with any text editor, by saving a file with an extension of .py. The name of the file, without the .py extension, is then the name of the module. You can have many functions in a single module. To be sure that Python can find your new module, you may want to save it to your Python "site-packages" directory, typically /Python310/Lib/site-packages.
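
If you are unsure where the site-packages directory is, or which directories are on the search path, you can ask the interpreter. The following is a minimal sketch that uses only the Python standard library. It should be run with the same Python that IBM SPSS Statistics uses (for instance, from within a Statistics session), because a separately installed Python may report a different location.

```python
import sys
import sysconfig

# Directory where pure-Python third-party modules are installed
site_packages = sysconfig.get_paths()["purelib"]
print("site-packages:", site_packages)

# Every directory Python searches when importing a module
for entry in sys.path:
    print(entry)
```

A module saved in any of the listed directories can be imported by name.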

For the example procedure described above, you might choose to save the definition of the groupMeans function to a Python module named myprocs.py. Be sure to include an import spss statement in the module. Sample command syntax to run the function is:

import spss, myprocs
spss.Submit("get file='/examples/data/Employee data.sav'.")
myprocs.groupMeans("educ","salary")
  • The import statement containing myprocs makes the contents of the Python module myprocs.py available to the current session (assuming that the module is on the Python search path).
  • myprocs.groupMeans("educ","salary") runs the groupMeans function for the variables educ and salary in /examples/data/Employee data.sav.

Result

Figure 1. Output from the groupMeans procedure