Visualization and diagnostics for cluster analysis in Python
Clustergram
The clustergram was later implemented in R by Tal Galili, who also gives a thorough explanation of the concept.
This is a Python translation of Tal’s script, written for the scikit-learn and RAPIDS cuML implementations of K-Means, Mini Batch K-Means and Gaussian Mixture Model (scikit-learn only) clustering, plus hierarchical/agglomerative clustering using SciPy. Alternatively, you can create a clustergram with the from_* constructors, based on other clustering algorithms.
Getting started
You can install clustergram from conda or pip:
conda install clustergram -c conda-forge
pip install clustergram
In any case, you still need to install your selected backend (scikit-learn and scipy, or cuML).
An example of a clustergram on the Palmer penguins dataset:
import seaborn
df = seaborn.load_dataset('penguins')
First we have to select numerical data and scale them.
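The selection and scaling step could look roughly like the sketch below. It is only an illustration: the helper name select_and_scale is hypothetical, and the small sample frame is a hand-made stand-in with penguins-style columns so that the snippet runs without downloading the dataset. Scaling here means standardizing each column to zero mean and unit variance, which scikit-learn's sklearn.preprocessing.scale also does.

```python
import numpy as np
import pandas as pd


def select_and_scale(df: pd.DataFrame) -> np.ndarray:
    """Keep numeric columns, drop incomplete rows, standardize each column.

    Hypothetical helper for illustration; not part of clustergram.
    """
    # Keep only numeric columns (drops e.g. species, island, sex).
    numeric = df.select_dtypes(include="number").dropna()
    values = numeric.to_numpy(dtype=float)

    # Standardize: zero mean and unit (population) standard deviation.
    mean = values.mean(axis=0)
    std = values.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (values - mean) / std


# Stand-in frame mimicking the penguins column layout (made-up rows).
sample = pd.DataFrame(
    {
        "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap", "Gentoo"],
        "bill_length_mm": [39.1, 39.5, np.nan, 46.5, 50.0],
        "bill_depth_mm": [18.7, 17.4, 14.1, 17.9, 15.3],
        "flipper_length_mm": [181.0, 186.0, 211.0, 192.0, 230.0],
        "body_mass_g": [3750.0, 3800.0, 4500.0, 3500.0, 5700.0],
    }
)

data = select_and_scale(sample)
```

With the real dataset, the same helper would be called as select_and_scale(df). Rows containing missing measurements are dropped before scaling, so the result may have fewer rows than the input.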