Configuring the local machine

The R environment must be configured on the local machine before you can use the R functionality. Configuration includes preparing the ODBC connection between the local machine and Netezza Performance Server. It also includes installing a number of additional R packages that are not included in the base R installation.

The following sections describe how to configure the ODBC drivers and how to configure the local machine to work with R on Netezza Performance Server through the R GUI for Windows.

Configuring the ODBC driver for Windows

This section describes how to install and configure the ODBC driver for the 64-bit version of Windows and the 32-bit version of Windows.

  1. Download the Windows ODBC drivers from Fix Central by doing the following steps:
    1. Click Select product.
    2. From the Product Group list, select Information Management.
    3. From the Select from Information Management list, select IBM Netezza NPS Software and Clients.
    4. From the Installed Version list, select the version of Netezza Performance Server that you have installed.
    5. From the Platform list, select Windows, and then click Continue.
    6. Select Browse for fixes, and then click Continue.
    7. Select the corresponding fix pack for your Netezza Performance Server version.

      The fix pack contains the nz-winclient-vxxx.zip file, where xxx is the corresponding version number.

      Extract the nz-winclient-vxxx.zip file and use one of the following files:

      • For 64-bit Windows, use the nzodbc32bit4win64.exe file.
      • For 32-bit Windows, use the nzodbcsetup.exe file.
  2. After the download is completed, double-click the file name to start the installer.
  3. In the window that opens, select the language to use and click OK.
  4. Follow the steps of the installer package by clicking Next > after each selection.

    The application installs all the necessary files on your computer. A reboot might be necessary after installation.

  5. Click Done to finish the installation. Then, close the installer application.
  6. To check whether the installation is completed correctly, open the Control Panel and select Administrative Tools.
  7. From the list, select Data Sources (ODBC).
  8. In the dialog box that opens, click the Drivers tab.

    NetezzaSQL appears in the list.

  9. Click the System DSN tab.

    A data source named NZSQL, which uses the NetezzaSQL driver, appears.

If both entries appear on your machine as described, the installation is complete. If they do not appear, reinstall the driver.
Tip: You can define custom DSNs in the System DSN tab, if necessary.

Configuring the R package

To use R with Netezza Performance Server, extra packages must be installed through the R GUI.

Required standard packages

For R to run properly, the following standard packages must be installed on the client. The packages are listed in alphabetical order.
arules
Provides support for association rules.
arulesViz
Necessary for the visualization of association rules as provided in the nza package.
bitops
Provides functions for bitwise operations.
ca
Provides simple correspondence analysis, multiple correspondence analysis, and joint correspondence analysis.
caTools
Provides tools for moving window statistics, GIF, Base64, ROC AUC, and others.
e1071
Provides miscellaneous functions of the Department of Statistics (e1071).
MASS
Provides support functions and datasets for Venables and Ripley's MASS (Modern Applied Statistics with S).
rgl
Provides a 3D visualization device system.
RODBC
Provides ODBC database access.
rpart
Provides decision and regression trees.
tree
Provides classification and regression trees.
XML
Provides tools for parsing and generating XML within R.
Note: When these packages are installed, dependent packages are also installed if required. Therefore, depending on the order in which the packages are installed, it might not be necessary to manually install each package. For example, when installing the ca package, the rgl package is automatically installed. Notifications regarding automatically installed dependencies appear in the R GUI console.
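As an alternative to selecting each package in the R GUI, the same check can be scripted from the R console. The following is a minimal sketch, assuming that the client can reach a CRAN mirror. The helper names missing_packages and install_missing are hypothetical and are not part of any Netezza package.

```r
# Standard packages listed in this section.
required <- c("arules", "arulesViz", "bitops", "ca", "caTools", "e1071",
              "MASS", "rgl", "RODBC", "rpart", "tree", "XML")

# Return the required packages that are not yet installed.
# 'installed' defaults to the packages found in the current library paths.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Install only what is missing. install.packages() also pulls in
# dependent packages, as described in the note above.
install_missing <- function(required) {
  todo <- missing_packages(required)
  if (length(todo) > 0) install.packages(todo)
  invisible(todo)
}

# Usage (contacts a CRAN mirror, so it is left commented out here):
# install_missing(required)
```

Running missing_packages(required) on its own lists what is still missing without installing anything.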

Installing the packages

To install the nzr package, the nza package, and the nzmatrix package, do the following steps.
Note: You must install the nzr package first because it is needed to use the nza package and the nzmatrix package. You also must download the NPS_R client packages from the following GitHub location: netezza-utils/R/.
  1. From the R GUI, click Packages > Install package(s) from local zip files...

    A dialog box with a list of the available packages opens.

  2. Select the nzr package, and then click OK.
  3. Repeat step 1 and step 2 to install the nza package and the nzmatrix package.
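The GUI steps above can also be approximated from the R console. The following is a minimal sketch, assuming that the downloaded files follow the usual R binary naming pattern <package>_<version>.zip and that they are in a single directory. The helper names order_nz_zips and install_nz_packages, and the example path, are hypothetical.

```r
# Pick the zip file for each package and return the paths in
# installation order, so that nzr is always installed first.
order_nz_zips <- function(zip_files, pkgs = c("nzr", "nza", "nzmatrix")) {
  found <- vapply(pkgs, function(p) {
    hit <- grep(paste0("^", p, "_.*\\.zip$"), basename(zip_files))
    if (length(hit) == 0) NA_character_ else zip_files[hit[1]]
  }, character(1))
  unname(found[!is.na(found)])
}

# Install the packages from local zip files (Windows binary packages).
install_nz_packages <- function(dir) {
  zips <- order_nz_zips(list.files(dir, pattern = "\\.zip$", full.names = TRUE))
  for (z in zips) install.packages(z, repos = NULL, type = "win.binary")
  invisible(zips)
}

# Usage (hypothetical download directory):
# install_nz_packages("C:/Downloads/NPS_R")
```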

Acquiring R

Netezza Performance Server plugins are supported for R GUI version 3.0.x, in both the 32-bit and 64-bit builds. Appropriate versions of R can be downloaded from the official R website. Follow the installation instructions that are provided there.
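To confirm which version and build of R is running, you can query it from the R console. The following is a minimal sketch. The helper name is_supported_r is hypothetical, and the 3.0.x range is taken from the statement above.

```r
# Returns TRUE when the given R version is in the 3.0.x series.
is_supported_r <- function(v = getRversion()) {
  v >= "3.0.0" && v < "3.1.0"
}

# Report the running version and architecture
# (for example "x86_64" for a 64-bit build or "i386" for a 32-bit build).
cat("R version:", as.character(getRversion()), "\n")
cat("Architecture:", R.version$arch, "\n")
cat("In the 3.0.x series:", is_supported_r(), "\n")
```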

Configuration instructions for Windows

The following description shows how to install the required packages, and the nzr, nza, and nzmatrix packages by using R GUI on Windows. Steps should be similar for a different platform or client.

To install the packages, do the following steps:
  1. Update the R GUI with any appropriate CRAN package by selecting Packages > Install package(s)...
    Note: Using the Install Package(s)... option causes the R GUI to make a connection to a CRAN server. Therefore, it might be necessary to select the server before this process can be completed. Using this option avoids the need to manually download the packages to the local machine.
  2. From the list of available packages, select the appropriate package, and then click OK.
  3. Repeat step 1 and step 2 for each package.
  4. Download any packages that must be installed from local zip files, such as the nzr, nza, and nzmatrix packages.
  5. After the download is completed, from the R GUI, select Packages > Install package(s) from local zip files...
  6. Navigate to the zip file location on the local machine or network.
  7. After the file is located, double-click the file name in the window, or select it and click Open.
  8. Repeat step 5, step 6, and step 7 for each package.

Verifying installation and checking ODBC connectivity

After you install all Netezza Performance Server R Library components, configure the ODBC driver, and complete the database setup for the Netezza Performance Server Analytics Library for R, Netezza Performance Server R Library, and Netezza Performance Server Matrix Library components, verify the connectivity of the R GUI with the Netezza Performance Server appliance. The following description assumes that the DSN NZSQL is defined and refers to a database. It also assumes that the user on Netezza Performance Server has the necessary rights to access the NZA database and to create tables in the current database.

To verify the installation and configuration, you can use the following commands:
  • To verify the installation of the Netezza Performance Server R Library package and the configuration of the Netezza Performance Server software, run:
    library(nzr)
    This command loads the Netezza Performance Server R Library into the R GUI. After the library is loaded, run:
    demo(nzr)
    This command runs a script that demonstrates and checks the basic functionality of the Netezza Performance Server R Library.
  • To verify the installation of the Netezza Performance Server Analytics Library for R package and the configuration of the Netezza Performance Server software, run:
    library(nza)
    This command loads the Netezza Performance Server Analytics Library for R and the Netezza Performance Server R Library into the R GUI. After the load is completed, run:
    demo(nza)
    This command runs the demo script that demonstrates and checks the basic functionality of the Netezza Performance Server Analytics Library for R.
  • To verify the installation of the Netezza Performance Server Matrix Library package and the configuration of the Netezza Performance Server software, run:
    library(nzmatrix)
    This command loads the Netezza Performance Server Matrix Library and the Netezza Performance Server R Library into the R GUI. After the load is completed, run:
    demo(nzmatrix)
    This command runs the demo script that demonstrates and checks the basic functionality of the Netezza Performance Server Matrix Library.
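If one of the demo scripts fails, it can help to test the ODBC connection on its own before looking at the Netezza packages. The following is a minimal sketch that uses only the RODBC package from the required package list. It assumes that the NZSQL DSN stores valid credentials. The helper name check_dsn is hypothetical, and its connect and close arguments exist only so that the function can be exercised without a database.

```r
# Try to open and close a connection to the given DSN.
# RODBC::odbcConnect() returns an object of class "RODBC" on success
# and -1 (with warnings) on failure, so the class is what gets checked.
check_dsn <- function(dsn = "NZSQL",
                      connect = RODBC::odbcConnect,
                      close = RODBC::odbcClose) {
  ch <- suppressWarnings(connect(dsn))
  ok <- inherits(ch, "RODBC")
  if (ok) close(ch)
  ok
}

# Usage (requires the RODBC package and a configured DSN):
# check_dsn("NZSQL")
```

A result of FALSE points to the DSN definition, the stored credentials, or the network path to the appliance rather than to the nzr, nza, or nzmatrix packages.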

Creating working databases

Before you start to do analytics by using the Netezza Performance Server client packages for R, you must create a working database to store the result tables of the analysis.
Important: Do not use system databases, such as SYSTEM, NZM, NZA, NZR, NZMSG, and NZRC to store the result tables.

The following example shows how to create the ANALYSIS_DB database. The database owner is DEVUSER.

To create the ANALYSIS_DB database, do the following steps:

  1. Log in to your Netezza Performance Server and launch nzsql.
  2. Run the following commands:
    1. CREATE USER DEVUSER WITH PASSWORD '<password>';
      where <password> specifies a password of your choice.
    2. ALTER USER DEVUSER WITH IN GROUP inza_admins;
    3. CREATE DATABASE ANALYSIS_DB;
    4. ALTER DATABASE ANALYSIS_DB OWNER TO DEVUSER;
    5. \c ANALYSIS_DB
    6. GRANT ALL ADMIN TO DEVUSER;
  3. Quit nzsql:
    \q
  4. Change to the /nz/export/ae/utilities/bin directory:
    cd /nz/export/ae/utilities/bin
  5. Grant developer rights on the ANALYSIS_DB database to DEVUSER:
    ./create_inza_db_developer.sh ANALYSIS_DB DEVUSER
    Note: The INZA_DEVELOPERS group is for users who need to register new AEs, UDXs, and stored procedures.