t-SNE node
t-Distributed Stochastic Neighbor Embedding (t-SNE)© is a tool for visualizing
high-dimensional data. It converts affinities of data points to probabilities. The affinities in the
original space are represented by Gaussian joint probabilities and the affinities in the embedded
space are represented by Student's t-distributions. This makes t-SNE particularly sensitive
to local structure. It also has a few other advantages over existing techniques [1]:
- Revealing the structure at many scales on a single map
- Revealing data that lie in multiple different manifolds or clusters
- Reducing the tendency to crowd points together at the center
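The two kinds of affinities described above can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the code the node runs. The function names are hypothetical, and a single fixed Gaussian bandwidth `sigma` is assumed for simplicity, whereas t-SNE as published tunes a separate bandwidth for each point from the perplexity setting.

```python
import numpy as np

def pairwise_sq_dists(A):
    """Squared Euclidean distances between all rows of A."""
    sq = np.sum(A ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (A @ A.T)
    return np.maximum(d2, 0.0)  # clip tiny negatives caused by rounding

def gaussian_joint_probabilities(X, sigma=1.0):
    """Affinities in the original space (assumption: one shared sigma)."""
    logits = -pairwise_sq_dists(X) / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)            # a point is not its own neighbor
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p_cond = np.exp(logits)
    p_cond /= p_cond.sum(axis=1, keepdims=True)  # conditional p(j|i), row by row
    n = X.shape[0]
    return (p_cond + p_cond.T) / (2.0 * n)       # symmetrize into joint p_ij

def student_t_joint_probabilities(Y):
    """Affinities in the embedded space (Student's t, one degree of freedom)."""
    w = 1.0 / (1.0 + pairwise_sq_dists(Y))
    np.fill_diagonal(w, 0.0)
    return w / w.sum()

def kl_divergence(P, Q, eps=1e-12):
    """Mismatch between the two affinity matrices, which t-SNE minimizes."""
    mask = P > 0
    return float(np.sum(P[mask] * np.log(P[mask] / np.maximum(Q[mask], eps))))
```

The heavy tails of the Student's t-distribution let moderately distant points sit farther apart in the embedding, which is what reduces crowding at the center of the map.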
The t-SNE node in SPSS® Modeler is implemented in Python and
requires the scikit-learn© Python library. For details about t-SNE and the
scikit-learn library, see:
The Python tab on the Nodes Palette contains this node and other Python nodes. The t-SNE node is also available on the Graphs tab.
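For orientation, the following minimal sketch calls the scikit-learn t-SNE implementation directly, outside SPSS Modeler. The sample dataset and the parameter values are illustrative assumptions. They are scikit-learn names and are not necessarily the options or defaults that the t-SNE node exposes.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Assumption: a small sample dataset bundled with scikit-learn (64 features per row).
X, y = load_digits(return_X_y=True)
X, y = X[:300], y[:300]  # keep the example fast

# Illustrative settings; perplexity must be smaller than the number of rows.
tsne = TSNE(n_components=2, perplexity=30.0, init="pca", random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)      # (300, 2): one 2-D point per input row
print(tsne.kl_divergence_)  # final value of the objective after optimization
```

The two columns of `embedding` can then be drawn as a scatter plot, with `y` used to color the points by class.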
1 References:
van der Maaten, L.J.P.; Hinton, G. "Visualizing High-Dimensional Data using t-SNE." Journal of Machine Learning Research. 9:2579-2605, 2008.
van der Maaten, L.J.P. "t-Distributed Stochastic Neighbor Embedding."
van der Maaten, L.J.P. "Accelerating t-SNE using Tree-Based Algorithms." Journal of Machine Learning Research. 15(Oct):3221-3245, 2014.