SPSS® Modeler offers nodes for running native Python algorithms. The Python tab on the Nodes Palette contains the following nodes, which you can use to run Python algorithms. These nodes are supported on Windows 64, Linux 64, and Mac.
The Synthetic Minority Over-sampling Technique (SMOTE) node provides an over-sampling algorithm for dealing with imbalanced data sets, offering an advanced method of balancing data. The SMOTE node in SPSS Modeler is implemented in Python and requires the imbalanced-learn© Python library.
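The node itself relies on the imbalanced-learn library; as a minimal pure-NumPy sketch of the core SMOTE idea (interpolating between a minority sample and one of its nearest minority-class neighbors), under the assumption that the minority rows have already been separated out:

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, rng=None):
    """Illustrative sketch of SMOTE's core idea: create synthetic
    minority samples by interpolating between a minority sample and
    one of its k nearest minority-class neighbors."""
    rng = np.random.default_rng(rng)
    X_min = np.asarray(X_min, dtype=float)
    # Pairwise distances within the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    # Indices of the k nearest neighbors of each minority sample.
    nn = np.argsort(d, axis=1)[:, :k]
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))   # pick a minority sample
        j = rng.choice(nn[i])          # pick one of its neighbors
        gap = rng.random()             # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.stack(synthetic)

minority = np.array([[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.2]])
new_points = smote_sketch(minority, n_new=6, k=2, rng=0)
print(new_points.shape)  # (6, 2)
```

Because each synthetic point lies on a segment between two real minority samples, the over-sampled class stays inside the region the original minority data occupies.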
XGBoost Linear© is an advanced implementation of a gradient boosting
algorithm with a linear model as the base model. Boosting algorithms iteratively learn weak
classifiers and then add them to a final strong classifier. The XGBoost Linear node in SPSS Modeler is implemented in
Python.
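The node wraps the XGBoost library itself; as an illustrative NumPy sketch of the underlying idea (boosting with a linear base model for squared-error regression, each round fitting least squares to the current residuals and adding a shrunken step):

```python
import numpy as np

def boost_linear(X, y, n_rounds=20, eta=0.5):
    """Sketch of gradient boosting with a linear base learner: each
    round fits ordinary least squares to the residuals (the negative
    gradient of squared error) and adds a shrunken copy to the
    ensemble. Illustrative only, not the XGBoost implementation."""
    Xb = np.column_stack([np.ones(len(X)), X])  # add intercept column
    coef = np.zeros(Xb.shape[1])                # accumulated ensemble coefficients
    pred = np.zeros(len(y))
    for _ in range(n_rounds):
        resid = y - pred                        # current residuals
        step, *_ = np.linalg.lstsq(Xb, resid, rcond=None)
        coef += eta * step                      # eta = learning rate (shrinkage)
        pred = Xb @ coef
    return coef, pred

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0        # exactly linear target
coef, pred = boost_linear(X, y)
print(np.round(coef, 2))                       # ≈ [1., 3., -2.]
```

With a learning rate below 1, each round shrinks the remaining residual by a factor of (1 − eta), which is the "weak learners added to a strong classifier" behavior the paragraph describes.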
XGBoost Tree© is an advanced implementation of a gradient boosting algorithm
with a tree model as the base model. Boosting algorithms iteratively learn weak classifiers and then
add them to a final strong classifier. XGBoost Tree is very flexible and provides many parameters that can overwhelm most users, so the XGBoost Tree node in SPSS Modeler exposes the core features and commonly used parameters. The node is implemented in Python.
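The node wraps the XGBoost library; since that library may not be installed everywhere, the sketch below uses scikit-learn's GradientBoostingClassifier as a stand-in for the same technique, showing the commonly exposed parameters (number of boosting rounds, learning rate, maximum tree depth):

```python
# Tree-based gradient boosting sketch. scikit-learn's
# GradientBoostingClassifier stands in for XGBoost here: shallow
# trees are learned iteratively on the errors of the ensemble so far.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=100,    # boosting rounds
    learning_rate=0.1,   # shrinkage applied to each tree
    max_depth=3,         # depth of each weak tree
    random_state=0,
)
model.fit(X_tr, y_tr)
acc = model.score(X_te, y_te)
print(f"hold-out accuracy: {acc:.2f}")
```

Exposing just these three parameters mirrors the node's approach of surfacing only the commonly used knobs.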
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a tool for visualizing
high-dimensional data. It converts affinities of data points to probabilities. The t-SNE node in SPSS Modeler is implemented in Python and requires the scikit-learn© Python library.
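A minimal sketch with scikit-learn, the library the node requires: embed 10-dimensional points into two dimensions for visualization. The data here is synthetic, chosen only to make the embedding well defined.

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Two well-separated 10-dimensional blobs of 30 points each.
X = np.vstack([rng.normal(0, 1, (30, 10)), rng.normal(8, 1, (30, 10))])

# Perplexity roughly controls the effective number of neighbors
# considered per point; it must be smaller than the sample count.
emb = TSNE(n_components=2, perplexity=10, init="random",
           random_state=0).fit_transform(X)
print(emb.shape)  # (60, 2)
```

The resulting 2-D coordinates are typically scatter-plotted; nearby points in the original space should land near each other in the embedding.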
A Gaussian Mixture© model is a probabilistic model that assumes all the data
points are generated from a mixture of a finite number of Gaussian distributions with unknown
parameters. One can think of mixture models as generalizing k-means clustering to incorporate
information about the covariance structure of the data as well as the centers of the latent
Gaussians. The Gaussian Mixture node in SPSS Modeler exposes the core features and
commonly used parameters of the Gaussian Mixture library. The node is implemented in Python.
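A minimal sketch with scikit-learn's GaussianMixture: fit two Gaussian components with full covariance matrices, then read back the soft assignments that distinguish mixture models from plain k-means.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two well-separated 2-D Gaussian clusters.
X = np.vstack([rng.normal(-3, 0.5, (100, 2)),
               rng.normal(3, 0.5, (100, 2))])

gmm = GaussianMixture(n_components=2, covariance_type="full",
                      random_state=0).fit(X)
labels = gmm.predict(X)           # hard cluster assignments
probs = gmm.predict_proba(X)      # soft assignments, unlike k-means
print(gmm.means_.round(1))        # recovered component centers
```

Setting `covariance_type` to `"full"` is what lets each component learn its own covariance structure, the generalization over k-means that the paragraph describes.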
Kernel Density Estimation (KDE)© uses the Ball Tree or KD Tree algorithms for
efficient queries, and combines concepts from unsupervised learning, feature engineering, and data
modeling. Neighbor-based approaches such as KDE are some of the most popular and useful density
estimation techniques. The KDE Modeling and KDE Simulation nodes in SPSS Modeler expose the core features and
commonly used parameters of the KDE library. The nodes are implemented in Python.
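A minimal sketch with scikit-learn's KernelDensity, which supports the KD Tree and Ball Tree backends mentioned above for efficient queries:

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (500, 1))    # samples from a standard normal

kde = KernelDensity(kernel="gaussian", bandwidth=0.3,
                    algorithm="kd_tree").fit(X)

# score_samples returns log density; the estimate should peak near 0
# and fall off in the tail.
log_dens = kde.score_samples(np.array([[0.0], [3.0]]))
dens = np.exp(log_dens)
print(dens.round(2))
```

The bandwidth is the main tuning parameter: smaller values track the sample more closely, larger values smooth the estimate.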
The Random Forest node uses an advanced implementation of a bagging algorithm
with a tree model as the base model. The Random Forest node in SPSS Modeler is implemented in Python and requires the scikit-learn© Python library.
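A minimal sketch with scikit-learn, the library the node requires: a bagging ensemble of decision trees, each trained on a bootstrap sample with a random subset of features considered at each split.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=400, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# n_estimators controls the number of bagged trees; predictions are
# made by majority vote across the ensemble.
forest = RandomForestClassifier(n_estimators=100,
                                random_state=1).fit(X_tr, y_tr)
acc = forest.score(X_te, y_te)
print(f"hold-out accuracy: {acc:.2f}")
```

Averaging over many decorrelated trees is what makes the bagging approach robust to the overfitting a single deep tree would show.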
Hierarchical Density-Based Spatial Clustering (HDBSCAN)© uses unsupervised
learning to find clusters, or dense regions, of a data set. The HDBSCAN node in SPSS Modeler exposes the core features and
commonly used parameters of the HDBSCAN library. The node is implemented in Python, and you can use it to cluster your data set into distinct groups when you don't know what those groups are at first.
The One-Class SVM node uses an unsupervised learning algorithm that can be used for novelty detection. It detects the soft boundary of a given set of samples so that it can classify new points as belonging to that set or not. The One-Class SVM node in SPSS Modeler is implemented in Python and requires the scikit-learn© Python library.
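A minimal sketch with scikit-learn's OneClassSVM, the library the node requires: learn the boundary of a "normal" training set, then flag new points as inside or outside it.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(0, 1, (200, 2))   # the given set of "normal" samples

# nu bounds the fraction of training points allowed outside the
# learned soft boundary.
svm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_train)

# predict returns +1 for points inside the boundary, -1 outside.
X_new = np.array([[0.0, 0.0], [6.0, 6.0]])
print(svm.predict(X_new))
```

A point near the center of the training data is classified as belonging to the set, while a far-away point is flagged as a novelty.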