Isomap Dimensionality Reduction

Posted on Jun 14, 2018 in Notes • 11 min read

An unsupervised learning technique that reduces the dimensionality of your dataset. It is intended for datasets whose features are non-linearly related.
Goal: uncover the intrinsic, geometric nature of your dataset.

Under the hood: Isomap calculates the distance from each sample to every other sample based on the features, then keeps only the K nearest samples in each sample's neighborhood list. Each sample knows only its K nearest neighbors and how to reach them. A neighborhood graph is then constructed by linking each sample to its K nearest neighbors. The result is similar to a road map: to travel from one point to a distant one, you move through a chain of K-nearest points until you eventually arrive at the destination.
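To make the graph construction concrete, here is a minimal sketch using scikit-learn's kneighbors_graph (the toy array X is invented purely for illustration); Isomap then measures the distance between far-apart samples by following shortest paths along this graph:

import numpy as np
from sklearn.neighbors import kneighbors_graph

# Toy data: 6 samples with 3 features each (hypothetical values)
X = np.random.RandomState(0).rand(6, 3)

# Link each sample to its 2 nearest neighbors; the result is a
# sparse adjacency matrix whose non-zero entries are the distances
# along the "roads" of the neighborhood graph
graph = kneighbors_graph(X, n_neighbors=2, mode='distance')
print(graph.toarray())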

  • be sure to use feature scaling (scikit-learn's StandardScaler is a good fit for scaling your data before performing dimensionality reduction; see the sketch after this list)
  • irreversible, unidirectional transformation (you cannot .inverse_transform() projected data back into the original feature space)
  • slower than PCA
  • a bit more sensitive to noise than PCA
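As noted in the first point above, scale before you project. A minimal sketch, assuming some raw data whose features live on very different scales (the random X below is made up for illustration):

import numpy as np
from sklearn import manifold
from sklearn.preprocessing import StandardScaler

# Hypothetical raw data with features on wildly different scales
X = np.random.RandomState(0).rand(100, 3) * [1.0, 100.0, 10000.0]

# Standardize each feature to zero mean and unit variance first...
X_scaled = StandardScaler().fit_transform(X)

# ...then project down to 2 dimensions with Isomap
T = manifold.Isomap(n_neighbors=4, n_components=2).fit_transform(X_scaled)
print(T.shape)  # (100, 2)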
from sklearn import manifold

# Create an Isomap instance that projects onto 2 components,
# building each sample's neighborhood from its 4 nearest neighbors
iso = manifold.Isomap(n_neighbors=4, n_components=2)

# One-liner equivalent of calling iso.fit(df) followed by
# T = iso.transform(df), where df holds your (already scaled) features
T = iso.fit_transform(df)
  • n_components: the number of dimensions (features) you want your dataset projected onto
  • n_neighbors: the neighborhood size K used to link each sample to its nearest neighbors when constructing the neighborhood graph
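To get a feel for how n_neighbors shapes the embedding, you can fit Isomap with several neighborhood sizes on the same data. A quick sketch, using scikit-learn's S-curve dataset as a convenient stand-in:

from sklearn import datasets, manifold

# S-curve: a standard non-linear manifold benchmark (3 features)
X, color = datasets.make_s_curve(n_samples=500, random_state=0)

# Small neighborhoods preserve local manifold structure; very large
# ones start to "short-circuit" across the fold of the S
for k in (4, 10, 50):
    T = manifold.Isomap(n_neighbors=k, n_components=2).fit_transform(X)
    print(k, T.shape)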