Agglomerative Clustering

Recursively merges the pair of clusters that minimally increases a given linkage distance.

Documentation

Attributes

children_

The children of each non-leaf node. Values less than n_samples correspond to leaves of the tree, which are the original samples. A node i greater than or equal to n_samples is a non-leaf node and has children children_[i - n_samples]. Alternatively, at the i-th iteration, children_[i][0] and children_[i][1] are merged to form node n_samples + i.
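The node numbering described above can be illustrated with a minimal sketch that calls scikit-learn directly rather than through this node. The toy data and variable names are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical toy data: five one-dimensional samples.
X = np.array([[0.0], [0.1], [5.0], [5.2], [10.0]])
n_samples = X.shape[0]

model = AgglomerativeClustering(n_clusters=2, linkage="ward").fit(X)

# Row i of children_ holds the two nodes merged at iteration i.
# Indices below n_samples are original samples (leaves); an index
# n_samples + j refers to the node created at iteration j.
for i, (left, right) in enumerate(model.children_):
    print(f"iteration {i}: merge {left} and {right} -> node {n_samples + i}")
```

With n samples there are n - 1 merges, so children_ has n - 1 rows of two entries each.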

labels_

Cluster labels for each point.

n_components

n_leaves_

Number of leaves in the hierarchical tree.

Definition

Output ports

model model

Model

Configuration

Affinity (affinity)

(no description)

linkage (linkage)

Which linkage criterion to use. The linkage criterion determines which distance to use between sets of observations. The algorithm merges the pair of clusters that minimizes this criterion.

  • ‘ward’ minimizes the variance of the clusters being merged.

  • ‘average’ uses the average of the distances of each observation of the two sets.

  • ‘complete’ or ‘maximum’ linkage uses the maximum distance between all observations of the two sets.

  • ‘single’ uses the minimum of the distances between all observations of the two sets.

Added in version 0.20: Added the ‘single’ option

For examples comparing different linkage criteria, see the scikit-learn gallery example plot_linkage_comparison.py.
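As a rough illustration of the linkage options listed above, the following sketch fits the same data once per criterion using scikit-learn directly. The synthetic two-blob data set and the dictionary name labels_by_linkage are assumptions made for this example, not part of the node.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical data: two well-separated 2-D blobs of 20 points each.
rng = np.random.RandomState(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(5.0, 0.3, size=(20, 2)),
])

# Fit one model per linkage criterion and keep the resulting labels.
labels_by_linkage = {}
for linkage in ("ward", "average", "complete", "single"):
    model = AgglomerativeClustering(n_clusters=2, linkage=linkage)
    labels_by_linkage[linkage] = model.fit_predict(X)

for linkage, labels in labels_by_linkage.items():
    print(linkage, np.bincount(labels))
```

On data this cleanly separated, all four criteria should recover the same two groups. The criteria tend to differ on elongated, noisy, or unevenly sized clusters.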

Number of clusters (n_clusters)

The number of clusters to find. It must be None if distance_threshold is not None.
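The relationship between n_clusters and distance_threshold can be sketched as follows, again using scikit-learn directly. Whether this node exposes distance_threshold is not stated here, so treat the parameter usage as an assumption about the underlying estimator. The toy data is hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical toy data: two tight pairs and one isolated sample.
X = np.array([[0.0], [0.1], [5.0], [5.2], [10.0]])

# Option 1: fix the number of clusters and leave distance_threshold unset.
fixed = AgglomerativeClustering(n_clusters=2, linkage="single").fit(X)

# Option 2: set n_clusters to None and stop merging once the linkage
# distance reaches the threshold. The number of clusters is then
# determined by the data and reported in n_clusters_.
by_threshold = AgglomerativeClustering(
    n_clusters=None, distance_threshold=1.0, linkage="single"
).fit(X)

print(fixed.n_clusters_, by_threshold.n_clusters_)
```

Here the threshold of 1.0 allows the two close pairs to merge but keeps the three groups apart, since the remaining gaps are several units wide.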

Examples

Implementation

Some of the docstrings for this module have been automatically extracted from the scikit-learn library and are covered by their respective licenses.

class node_clustering2.AgglomerativeClustering