Isomap Dimensionality Reduction
Posted on Jun 14, 2018 in Notes • 11 min read
An unsupervised learning technique that reduces the dimensionality of your datasets. Use it for datasets whose features are non-linearly related.

Goal: uncover the intrinsic, geometric nature of your dataset.
Under the hood: Isomap calculates the distance from each sample to every other sample based on their features, then keeps only the K nearest samples in each sample's neighborhood list. Each sample knows only its K nearest neighbors and how to reach them. A neighborhood graph is then constructed by linking each sample to its K nearest neighbors. The result is similar to a map of roads: to travel from one point to a faraway destination, you move through a chain of K-nearest-neighbor hops.
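The steps above can be sketched directly with scikit-learn and SciPy: build the K-nearest-neighbor graph, then measure distances along the graph's "roads" with a shortest-path search. The toy data values here are purely illustrative, not from the original post.

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import shortest_path

# Toy dataset: 6 samples with 2 features (illustrative values only)
X = np.array([[0.0, 0.0],
              [1.0, 0.1],
              [2.0, 0.0],
              [3.0, 0.2],
              [4.0, 0.0],
              [5.0, 0.1]])

# Step 1: each sample keeps only its K nearest neighbors (K=2 here)
knn = kneighbors_graph(X, n_neighbors=2, mode='distance')

# Step 2: "geodesic" distance = shortest path along the neighborhood graph,
# i.e. the length of the road trip through K-nearest-neighbor hops
geodesic = shortest_path(knn, directed=False)

print(geodesic.shape)  # one graph distance per pair of samples
```

Isomap then applies classical multidimensional scaling to this geodesic distance matrix to produce the low-dimensional embedding.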
- Be sure to use feature scaling (SciKit-Learn's StandardScaler is a good fit for scaling your data before performing dimensionality reduction).
- Irreversible, unidirectional transformation: there is no `.inverse_transform()` to project reduced data back into the original feature space.
- Slower than PCA.
- A bit more sensitive to noise than PCA.
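The scaling advice above can be sketched as a pipeline, so the scaler and the reducer are always applied in the right order. The random data and its per-feature scales are made up for illustration:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.manifold import Isomap
from sklearn.pipeline import make_pipeline

# Hypothetical data: 100 samples, 5 features on wildly different scales
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * [1, 10, 100, 1000, 10000]

# Scale first, then reduce; the pipeline keeps that order explicit
pipe = make_pipeline(StandardScaler(), Isomap(n_neighbors=4, n_components=2))
T = pipe.fit_transform(X)

print(T.shape)  # 100 samples projected onto 2 dimensions
```

Without scaling, the 10000-scale feature would dominate every distance calculation, so the neighborhood graph would effectively ignore the other four features.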
```python
from sklearn import manifold

# Create Isomap instance
iso = manifold.Isomap(n_neighbors=4, n_components=2)

# One-liner for the following three lines
T = iso.fit_transform(df)

# iso.fit(df)
# Isomap(eigen_solver='auto', max_iter=None, n_components=2, n_neighbors=4,
#        neighbors_algorithm='auto', path_method='auto', tol=0)
# T = iso.transform(df)
```
- `n_components`: the number of dimensions you want your dataset projected onto.
- `n_neighbors`: the neighborhood size (K) used to construct the neighborhood graph.
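To see these two parameters in action, a quick sketch using scikit-learn's built-in S-curve dataset (a 2-D sheet rolled into 3-D, which Isomap is well suited to unroll) might look like this; the sample count and the `n_neighbors` values tried are arbitrary choices, not recommendations:

```python
from sklearn.datasets import make_s_curve
from sklearn.manifold import Isomap

# 500 points sampled from a 3-D S-curve manifold
X, color = make_s_curve(n_samples=500, random_state=0)

# Project down to 2 dimensions with a few different neighborhood sizes
for k in (4, 10, 30):
    T = Isomap(n_neighbors=k, n_components=2).fit_transform(X)
    print(k, T.shape)
```

Small `n_neighbors` values follow the manifold tightly but can fragment the graph; larger values smooth the embedding but risk "short-circuiting" across the fold, so the parameter usually deserves a quick sweep like the one above.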