Extension nodes

SPSS Modeler supports scripts written in R and in Python for Apache Spark.

Several Extension nodes are available to enable expert users to input their own R scripts or Python for Spark scripts to carry out data processing, model building, and model scoring. These Extension nodes complement SPSS Modeler and its data mining abilities.

Before you begin

You can load R and Python libraries to use with the Extension nodes. Before you run R or Python scripts, you must install any packages that the scripts require. To install packages, include the following scripts in an Extension Output node, connect the node to a User Input node, and then run the Extension Output node to start the installation.
Tip: You can also insert these scripts in front of other scripts if you want the installation and your task-related scripts to run together.
To install R packages:
  1. Run the following command:
    install.packages("$PACKAGE_NAME", quiet=TRUE, repos="$REPO_URL")
    For example:
    install.packages("Sequential", quiet=TRUE, repos="https://cloud.r-project.org")
  2. To verify that the package was installed successfully, run the following command:
    packageVersion("$PACKAGE_NAME")
    For example:
    packageVersion("Sequential")
Note: If the R package isn't available in your repository, the installation may fail. In that case, try the same installation command from the R command-line environment (not from RStudio).
To install Python packages:
  1. Run the following command:
    import sys
    import subprocess
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', '$PACKAGE_NAME', '--quiet', '--no-input'])
    For example, the following command installs numpy:
    import sys
    import subprocess
    subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'numpy', '--quiet', '--no-input'])
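  2. To verify that the package was installed successfully, run the following commands. Use the module's import name, which can differ from the name that you passed to pip:
    import importlib.util
    print(importlib.util.find_spec('$PACKAGE_NAME') is not None)
    For example, the following commands print True if numpy is installed:
    import importlib.util
    print(importlib.util.find_spec('numpy') is not None)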
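
The tip about placing the installation script in front of a task-related script can be sketched in Python as follows. This is a minimal illustration, not part of SPSS Modeler: the helper names is_installed and ensure_package are hypothetical. It assumes that the module's import name matches its pip name unless a separate pip_name is given, and it skips the pip call when the module is already present.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module with this import name can be found."""
    return importlib.util.find_spec(module_name) is not None


def ensure_package(module_name, pip_name=None):
    """Install a package with pip only when its module is missing.

    Hypothetical helper. Returns True if pip was invoked and False if the
    module was already available. Raises subprocess.CalledProcessError
    if pip exits with an error.
    """
    if is_installed(module_name):
        return False
    # Same pip invocation as in the installation step above.
    subprocess.check_call(
        [sys.executable, '-m', 'pip', 'install', pip_name or module_name,
         '--quiet', '--no-input']
    )
    # Let the import system notice the newly installed files.
    importlib.invalidate_caches()
    return True
```

Calling ensure_package('numpy') at the top of a script, followed by the task-related code, would run the installation only when it is needed.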