Data set configuration
Learn how to configure the data sets to be analyzed.
Sample data sets
Standard data-mining data sets are used in the Netezza Performance Server Analytics document set to provide examples of how various functions and stored procedures perform in normal operation. The data sets also illustrate how the various components of the product might be used in real-world scenarios.
Data set name | URL and files to download |
---|---|
Retail | URL: fimi.ua.ac.be/data/ File: |
CensusIncome | URL: archive.ics.uci.edu/ml/databases/census-income/ File: |
WineQuality | URL: archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/ File: |
Adult | URL: archive.ics.uci.edu/ml/machine-learning-databases/adult File: |
Soybean | URL: archive.ics.uci.edu/ml/machine-learning-databases/soybean Files: |
Iris | URL: archive.ics.uci.edu/ml/machine-learning-databases/iris/ File: |
Installing sample data sets
- Download each data set file to a local machine. If a file is compressed (for example, it has the extension .gz), do not decompress it.
- Log in to the host as user nz.
- Create a directory in which to store the downloaded data sets, for example:
/nz/export/ae/utilities/bin/testData
- Transfer the data set files to the newly created directory. Do not change the file names.
- Navigate to the following directory:
/nz/export/ae/utilities/bin
- Run the installation script by entering one of the following commands:
  - If the sample data set files are in the directory /nz/export/ae/utilities/bin/testData:
    ./loadTestTables.sh
  - If the sample data set files are in a different directory:
    ./loadTestTables.sh path_to_directory
  Because of the large amounts of data that the files contain, the script might run for several minutes. This is normal.
- After the script finishes, temporary files created by the script are deleted automatically. However, the downloaded data files and the log files are not deleted and remain on the host. If you do not want to retain them, delete them manually.
If the script is re-run, all sample data is deleted from the database and the corresponding tables are dropped. Then, the tables are re-created and the original sample data is reinserted.
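The choice between the two forms of the command can be sketched as a small shell helper. This is only an illustration: the default directory and the script name are taken from the steps above, while the function name load_command is hypothetical and not part of the product.

```shell
#!/bin/sh
# Sketch only. DEFAULT_DIR and loadTestTables.sh come from the steps above;
# load_command is a hypothetical helper name.
DEFAULT_DIR="/nz/export/ae/utilities/bin/testData"

# Print the loadTestTables.sh invocation that matches the data directory.
# With no argument, or with the default directory, no path is passed.
load_command() {
  if [ "${1:-$DEFAULT_DIR}" = "$DEFAULT_DIR" ]; then
    echo "./loadTestTables.sh"
  else
    echo "./loadTestTables.sh $1"
  fi
}
```

For example, as user nz, change to /nz/export/ae/utilities/bin, call load_command with the directory that holds the downloaded files, and run the command that it prints.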
Netezza Performance Server Cartridge Manager (nzcm)
Cartridge management for Netezza Performance Server Analytics is performed using the Netezza Performance Server Cartridge Manager (nzcm) utility. Use nzcm to install, uninstall, register, unregister, and otherwise administer cartridges.
Installing nzcm
Netezza Performance Server Analytics is distributed as a collection of cartridges in the form of .nzc files. You must extract these files from the full Netezza Performance Server Analytics package. You can extract and access the cartridges and the Netezza Performance Server Cartridge Manager (nzcm) through the Netezza Performance Server Analytics installation utility.
- Log in to the host as user nz.
- Go to the directory that contains the following file:
nz-analytics-v<version>.zip
- Run the following command:
unzip nz-analytics-v<version>.zip
The unzip utility must be used to extract the file; gunzip cannot be used. This command creates a directory with the name nzcmrepo under the directory where the files were extracted.
- Go to the nzcmrepo subdirectory, typically /nz/var/inza/nzcmrepo.
- Locate the nzcm file to determine the release number. The file is named in the form nzcm-<version>.
- Decompress the file.
tar -xf nzcm-<version>
- When decompressed, go to the nzcm directory:
cd /nz/var/inza/nzcmrepo/nzcm-<version>
- Install nzcm:
./install.sh
The script installs nzcm to the /nz/var/nzcm directory and the repository is configured automatically.
- As instructed by the output of the install.sh script, run:
source ~/.bashrc
- Issue the following command to change to the target
directory:
cd /nz/var/inza/nzcmrepo
- Confirm that the target directory is empty.
- Copy the cartridge and group files to the nzcm repository:
cp -f *.nzc /nz/var/nzcm/nzcmrepo/
cp -f *.grp /nz/var/nzcm/nzcmrepo/
This installs nzcm.
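The release-number step can also be scripted. The following is a minimal sketch, assuming that the file name is exactly nzcm-<version> as described in the steps; nzcm_version is a hypothetical helper name and not part of nzcm. If the file carries an archive extension, that extension appears in the output as well.

```shell
#!/bin/sh
# Sketch only. nzcm_version is a hypothetical helper, not an nzcm command.
# Print the <version> part of the first nzcm-<version> entry in a directory.
# The default directory is the typical nzcmrepo location named in the steps.
nzcm_version() {
  repo_dir="${1:-/nz/var/inza/nzcmrepo}"
  for f in "$repo_dir"/nzcm-*; do
    [ -e "$f" ] || continue      # the pattern matched nothing
    name="${f##*/}"              # strip the directory part
    echo "${name#nzcm-}"         # strip the nzcm- prefix
    return 0
  done
  return 1                       # no nzcm-<version> entry found
}
```

Called without an argument, the function looks in /nz/var/inza/nzcmrepo; pass a different directory if the package was extracted elsewhere.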