Class HierarchicalClustering


  • public class HierarchicalClustering
    extends java.lang.Object
    Agglomerative hierarchical clustering. This method builds a hierarchy of clusters bottom-up: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy. The results of hierarchical clustering are usually presented in a dendrogram.

    In general, the merges are determined in a greedy manner. To decide which clusters should be combined, a measure of dissimilarity between sets of observations is required. In most methods of hierarchical clustering, this is achieved by using an appropriate metric together with a linkage criterion, which specifies the dissimilarity of sets as a function of the pairwise distances of observations in the sets.
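To make "a function of the pairwise distances" concrete, the following minimal sketch computes three common linkage criteria from a precomputed distance matrix. The class `LinkageSketch` and its method names are hypothetical and exist only for illustration; they are not the API of the `Linkage` class referenced by this page.

```java
/** Illustrative helper only; not part of the documented API. */
final class LinkageSketch {
    /** Single linkage: the smallest pairwise distance between the two sets. */
    static double single(double[][] d, int[] a, int[] b) {
        double best = Double.POSITIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                best = Math.min(best, d[i][j]);
            }
        }
        return best;
    }

    /** Complete linkage: the largest pairwise distance between the two sets. */
    static double complete(double[][] d, int[] a, int[] b) {
        double worst = Double.NEGATIVE_INFINITY;
        for (int i : a) {
            for (int j : b) {
                worst = Math.max(worst, d[i][j]);
            }
        }
        return worst;
    }

    /** Average linkage: the mean of all pairwise distances between the two sets. */
    static double average(double[][] d, int[] a, int[] b) {
        double sum = 0.0;
        for (int i : a) {
            for (int j : b) {
                sum += d[i][j];
            }
        }
        return sum / (a.length * b.length);
    }
}
```

Each method takes the full distance matrix `d` and the observation indices of the two clusters being compared, so the same matrix can serve every candidate pair of clusters.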

    Hierarchical clustering has the distinct advantage that any valid measure of distance can be used. In fact, the observations themselves are not required: all that is used is a matrix of distances.
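Since only a matrix of distances is needed, the observations can be reduced to that matrix up front. The sketch below does this for Euclidean distance; the class `ProximitySketch` is a hypothetical name used for illustration, and any other valid distance could be substituted in the inner loop.

```java
/** Illustrative helper only; not part of the documented API. */
final class ProximitySketch {
    /** Builds a symmetric n-by-n matrix of Euclidean distances between the rows of x. */
    static double[][] euclidean(double[][] x) {
        int n = x.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < i; j++) {
                double sum = 0.0;
                for (int k = 0; k < x[i].length; k++) {
                    double diff = x[i][k] - x[j][k];
                    sum += diff * diff;
                }
                // Fill both halves; the diagonal stays at zero.
                d[i][j] = Math.sqrt(sum);
                d[j][i] = d[i][j];
            }
        }
        return d;
    }
}
```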

    References

    1. David Eppstein. Fast hierarchical clustering and other applications of dynamic closest pairs. SODA 1998.
    Author:
    Haifeng Li
    See Also:
    Linkage
    • Method Summary

      Modifier and Type Method Description
      double[] getHeight()
      Returns n-1 non-decreasing real values, the clustering heights, i.e., the value of the criterion associated with the clustering method at each agglomeration.
      int[][] getTree()
      Returns an n-1 by 2 matrix of which row i describes the merging of clusters at step i of the clustering.
      int[] partition(double h)
      Cuts a tree into several groups by specifying the cut height.
      int[] partition(int k)
      Cuts a tree into several groups by specifying the desired number of clusters.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • HierarchicalClustering

        public HierarchicalClustering(Linkage linkage)
        Constructor. Learns the agglomerative hierarchical clustering with the given linkage method, which includes the proximity matrix.
        Parameters:
        linkage - a linkage method to merge clusters. The linkage object includes the proximity matrix of data.
    • Method Detail

      • getTree

        public int[][] getTree()
        Returns an n-1 by 2 matrix of which row i describes the merging of clusters at step i of the clustering. If an element j in the row is less than n, then observation j was merged at this stage. If j ≥ n then the merge was with the cluster formed at the (earlier) stage j-n of the algorithm.
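The encoding described above can be unpacked into explicit member lists. The following sketch assumes only the stated convention (entries below n are observations, entries j ≥ n refer to the cluster formed at step j-n); the class `MergeTreeSketch` is a hypothetical helper, not part of this class.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative helper only; not part of the documented API. */
final class MergeTreeSketch {
    /** For each merge step, lists the observations in the cluster formed at that step. */
    static List<List<Integer>> members(int[][] tree) {
        int n = tree.length + 1; // n observations yield n-1 merges
        List<List<Integer>> formed = new ArrayList<>();
        for (int[] row : tree) {
            List<Integer> cluster = new ArrayList<>();
            for (int j : row) {
                if (j < n) {
                    cluster.add(j);                     // a single observation
                } else {
                    cluster.addAll(formed.get(j - n));  // a cluster from an earlier step
                }
            }
            formed.add(cluster);
        }
        return formed;
    }
}
```

For example, with n = 4 the rows {0, 1}, {2, 3}, {4, 5} describe merging observations 0 and 1, then observations 2 and 3, and finally the two clusters formed at steps 0 and 1.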
      • getHeight

        public double[] getHeight()
        Returns n-1 non-decreasing real values, the clustering heights, i.e., the value of the criterion associated with the clustering method at each agglomeration.
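To show how a merge matrix and its heights could arise together, here is a naive O(n^3) sketch of greedy agglomeration with single linkage. It is not this class's implementation (the reference above points to a faster closest-pair approach); the class `NaiveSingleLinkage` is hypothetical, and it assumes at least two observations and a symmetric matrix of finite distances.

```java
/** Illustrative naive algorithm only; not the documented implementation. */
final class NaiveSingleLinkage {
    final int[][] tree;    // same encoding as described for getTree()
    final double[] height; // merge distance at each step

    NaiveSingleLinkage(double[][] proximity) {
        int n = proximity.length;
        tree = new int[n - 1][2];
        height = new double[n - 1];

        // Work on a copy so the caller's matrix is left untouched.
        double[][] d = new double[n][];
        for (int i = 0; i < n; i++) {
            d[i] = proximity[i].clone();
        }
        int[] id = new int[n];             // cluster id currently held in slot i
        boolean[] active = new boolean[n]; // whether slot i still holds a cluster
        for (int i = 0; i < n; i++) {
            id[i] = i;
            active[i] = true;
        }

        for (int step = 0; step < n - 1; step++) {
            // Greedy choice: the closest pair of active clusters.
            int bi = -1, bj = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                if (!active[i]) continue;
                for (int j = i + 1; j < n; j++) {
                    if (active[j] && d[i][j] < best) {
                        best = d[i][j];
                        bi = i;
                        bj = j;
                    }
                }
            }
            tree[step][0] = id[bi];
            tree[step][1] = id[bj];
            height[step] = best;

            // Single-linkage update: the merged cluster reuses slot bi.
            for (int k = 0; k < n; k++) {
                if (active[k] && k != bi && k != bj) {
                    d[bi][k] = Math.min(d[bi][k], d[bj][k]);
                    d[k][bi] = d[bi][k];
                }
            }
            active[bj] = false;
            id[bi] = n + step; // the new cluster is referred to as n + step
        }
    }
}
```

With single linkage the recorded heights never decrease, because every updated distance is the minimum of two distances that were each at least as large as the one just merged.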
      • partition

        public int[] partition(int k)
        Cuts a tree into several groups by specifying the desired number of clusters.
        Parameters:
        k - the number of clusters.
        Returns:
        the cluster label of each sample.
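One way to picture cutting a tree into k groups: replaying only the first n-k merges leaves exactly k clusters. The sketch below works from a merge matrix in the getTree() encoding; the class `CutByCountSketch` is hypothetical, and its label numbering (by order of first appearance) is an assumption that may differ from the labels this method returns.

```java
import java.util.Arrays;

/** Illustrative helper only; not the documented implementation. */
final class CutByCountSketch {
    static int[] partition(int[][] tree, int k) {
        int n = tree.length + 1;
        if (k < 1 || k > n) {
            throw new IllegalArgumentException("k must be between 1 and " + n);
        }

        // Nodes 0..n-1 are observations; merge step i creates node n + i.
        int[] parent = new int[2 * n - 1];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        // Replay only the first n-k merges, which leaves k clusters.
        for (int step = 0; step < n - k; step++) {
            parent[tree[step][0]] = n + step;
            parent[tree[step][1]] = n + step;
        }

        // Label each observation by the root of its cluster.
        int[] rootLabel = new int[parent.length];
        Arrays.fill(rootLabel, -1);
        int[] label = new int[n];
        int next = 0;
        for (int i = 0; i < n; i++) {
            int r = i;
            while (parent[r] != r) {
                r = parent[r];
            }
            if (rootLabel[r] < 0) {
                rootLabel[r] = next++;
            }
            label[i] = rootLabel[r];
        }
        return label;
    }
}
```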
      • partition

        public int[] partition(double h)
        Cuts a tree into several groups by specifying the cut height.
        Parameters:
        h - the cut height.
        Returns:
        the cluster label of each sample.
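Cutting at a height can be sketched the same way: replay every merge whose height does not exceed h and label what remains. This sketch assumes non-decreasing heights and treats a merge at exactly h as applied; how this method handles that boundary, or values of h outside the range of heights, is not stated here, so both are assumptions. The class `CutByHeightSketch` is hypothetical.

```java
import java.util.Arrays;

/** Illustrative helper only; not the documented implementation. */
final class CutByHeightSketch {
    static int[] partition(int[][] tree, double[] height, double h) {
        int n = tree.length + 1;

        // Count the leading merges at or below the cut height
        // (assumes the heights are non-decreasing).
        int merges = 0;
        while (merges < height.length && height[merges] <= h) {
            merges++;
        }

        // Nodes 0..n-1 are observations; merge step i creates node n + i.
        int[] parent = new int[2 * n - 1];
        for (int i = 0; i < parent.length; i++) {
            parent[i] = i;
        }
        for (int step = 0; step < merges; step++) {
            parent[tree[step][0]] = n + step;
            parent[tree[step][1]] = n + step;
        }

        // Label each observation by the root of its cluster,
        // numbering clusters in order of first appearance.
        int[] rootLabel = new int[parent.length];
        Arrays.fill(rootLabel, -1);
        int[] label = new int[n];
        int next = 0;
        for (int i = 0; i < n; i++) {
            int r = i;
            while (parent[r] != r) {
                r = parent[r];
            }
            if (rootLabel[r] < 0) {
                rootLabel[r] = next++;
            }
            label[i] = rootLabel[r];
        }
        return label;
    }
}
```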