Package smile.clustering
Class CLARANS<T>
- java.lang.Object
-
- smile.clustering.PartitionClustering<T>
-
- smile.clustering.CLARANS<T>
-
- Type Parameters:
T
- the type of input object.
- All Implemented Interfaces:
java.io.Serializable, Clustering<T>
public class CLARANS<T> extends PartitionClustering<T>
Clustering Large Applications based upon RANdomized Search. CLARANS is an efficient medoid-based clustering algorithm. The k-medoids algorithm is an adaptation of the k-means algorithm: rather than calculating the mean of the items in each cluster, it chooses a representative item, or medoid, for each cluster at each iteration. In CLARANS, the process of finding k medoids from n objects is viewed abstractly as a search through a graph. Each node of the graph is a set of k objects selected as medoids, and two nodes are neighbors if their sets differ by exactly one object. In each iteration, CLARANS considers a set of randomly chosen neighbor nodes as candidates for the new medoids. It moves to a neighbor node if that neighbor is a better choice of medoids; otherwise, a local optimum has been found. The entire process is repeated multiple times to find a better solution.

CLARANS has two parameters: the maximum number of neighbors examined (maxNeighbor) and the number of local minima obtained (numLocal). The higher the value of maxNeighbor, the closer CLARANS is to PAM and the longer each search for a local minimum takes. However, the quality of each local minimum is higher, so fewer local minima need to be obtained.
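The graph search described above can be illustrated with a minimal, self-contained sketch. It is not Smile's implementation: the class name ClaransSketch, the use of Euclidean distance on double[] points, and the explicit seed parameter are all assumptions made for illustration. A node is an array of k medoid indices, a neighbor is produced by swapping one medoid for a non-medoid, and a search stops once maxNeighbor consecutive neighbors fail to improve the cost.

```java
import java.util.Random;

/** Illustrative CLARANS search; a sketch only, not Smile's implementation. */
final class ClaransSketch {
    /** Euclidean distance between two points of equal dimension. */
    static double dist(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            s += d * d;
        }
        return Math.sqrt(s);
    }

    /** Sum of distances from every point to its nearest medoid. */
    static double cost(double[][] data, int[] medoids) {
        double total = 0.0;
        for (double[] x : data) {
            double best = Double.POSITIVE_INFINITY;
            for (int m : medoids) best = Math.min(best, dist(x, data[m]));
            total += best;
        }
        return total;
    }

    /** True if v occurs among the first len entries of a. */
    static boolean contains(int[] a, int len, int v) {
        for (int i = 0; i < len; i++) if (a[i] == v) return true;
        return false;
    }

    /** Draws a random node: k distinct indices in [0, n). */
    static int[] randomNode(int n, int k, Random rng) {
        int[] node = new int[k];
        for (int i = 0; i < k; i++) {
            int c;
            do { c = rng.nextInt(n); } while (contains(node, i, c));
            node[i] = c;
        }
        return node;
    }

    /** Returns the medoid indices of the best local optimum found. Requires 0 < k < n. */
    static int[] search(double[][] data, int k, int maxNeighbor, int numLocal, long seed) {
        Random rng = new Random(seed);
        int n = data.length;
        int[] best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int run = 0; run < numLocal; run++) {
            int[] current = randomNode(n, k, rng);
            double currentCost = cost(data, current);
            int examined = 0;
            while (examined < maxNeighbor) {
                // A neighbor differs from the current node in exactly one medoid.
                int candidate;
                do { candidate = rng.nextInt(n); } while (contains(current, k, candidate));
                int[] neighbor = current.clone();
                neighbor[rng.nextInt(k)] = candidate;
                double neighborCost = cost(data, neighbor);
                if (neighborCost < currentCost) {
                    // Move to the better neighbor and restart the neighbor count.
                    current = neighbor;
                    currentCost = neighborCost;
                    examined = 0;
                } else {
                    examined++;
                }
            }
            // current is now treated as a local optimum; keep the best one seen.
            if (currentCost < bestCost) {
                best = current;
                bestCost = currentCost;
            }
        }
        return best;
    }
}
```

The sketch recomputes the full cost for every neighbor, which keeps the code short but is slower than updating the cost incrementally.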
References
- R. Ng and J. Han. CLARANS: A Method for Clustering Objects for Spatial Data Mining. IEEE Transactions on Knowledge and Data Engineering, 2002.
- Author:
- Haifeng Li
- See Also:
- Serialized Form
-
-
Field Summary
-
Fields inherited from class smile.clustering.PartitionClustering
k, size, y
-
Fields inherited from interface smile.clustering.Clustering
OUTLIER
-
-
Method Summary
Modifier and Type    Method                 Description
double               distortion()           Returns the distortion.
int                  getMaxNeighbor()       Returns the maximum number of neighbors examined during a search of local minima.
int                  getNumLocalMinima()    Returns the number of local minima to search for.
T[]                  medoids()              Returns the medoids.
int                  predict(T x)           Cluster a new instance.
java.lang.String     toString()
-
Methods inherited from class smile.clustering.PartitionClustering
getClusterLabel, getClusterSize, getNumClusters, seed, seed
-
-
-
-
Constructor Detail
-
CLARANS
public CLARANS(T[] data, Distance<T> distance, int k)
Constructor. Clusters the data into k clusters. The maximum number of neighbors examined in each random search is set to 0.02 * k * (n - k), where n is the number of data points and k is the number of clusters. The number of local searches is max(8, numProcessors).
- Parameters:
data - the dataset for clustering.
distance - the distance/dissimilarity measure.
k - the number of clusters.
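As a worked example of the defaults stated for this constructor, the sketch below evaluates both formulas. The class and method names are hypothetical, rounding the product to the nearest integer is an assumption (the text does not say how the value is converted to int), and numProcessors is assumed to mean the JVM's available processor count.

```java
/** Hypothetical helper that evaluates the documented default parameters. */
final class ClaransDefaults {
    /** 0.02 * k * (n - k); rounding to the nearest int is an assumption. */
    static int defaultMaxNeighbor(int n, int k) {
        return (int) Math.round(0.02 * k * (n - k));
    }

    /** max(8, numProcessors), reading numProcessors as the available processors. */
    static int defaultNumLocal() {
        return Math.max(8, Runtime.getRuntime().availableProcessors());
    }
}
```

For n = 1000 and k = 10, the first formula gives 0.02 * 10 * 990 = 198 neighbors per search.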
-
CLARANS
public CLARANS(T[] data, Distance<T> distance, int k, int maxNeighbor)
Constructor. Clusters the data into k clusters.
- Parameters:
data - the dataset for clustering.
distance - the distance/dissimilarity measure.
k - the number of clusters.
maxNeighbor - the maximum number of neighbors examined during a random search of local minima.
-
CLARANS
public CLARANS(T[] data, Distance<T> distance, int k, int maxNeighbor, int numLocal)
Constructor. Clusters the data into k clusters.
- Parameters:
data - the dataset for clustering.
distance - the distance/dissimilarity measure.
k - the number of clusters.
maxNeighbor - the maximum number of neighbors examined during a random search of local minima.
numLocal - the number of local minima to search for.
-
-
Method Detail
-
getNumLocalMinima
public int getNumLocalMinima()
Returns the number of local minima to search for.
-
getMaxNeighbor
public int getMaxNeighbor()
Returns the maximum number of neighbors examined during a search of local minima.
-
distortion
public double distortion()
Returns the distortion.
-
medoids
public T[] medoids()
Returns the medoids.
-
predict
public int predict(T x)
Cluster a new instance.
- Parameters:
x - a new instance.
- Returns:
the cluster label, which is the index of the nearest medoid.
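The nearest-medoid rule behind the returned label can be sketched as follows. This is an illustration under assumptions, not the library's code: the class NearestMedoid is hypothetical, and a standard java.util.function.ToDoubleBiFunction stands in for the library's Distance<T> type so that the example needs no dependency.

```java
import java.util.function.ToDoubleBiFunction;

/** Hypothetical illustration of labeling a new instance by its nearest medoid. */
final class NearestMedoid {
    /** Returns the index of the medoid closest to x, or -1 if medoids is empty. */
    static <T> int predict(T x, T[] medoids, ToDoubleBiFunction<T, T> distance) {
        int label = -1;
        double best = Double.POSITIVE_INFINITY;
        for (int i = 0; i < medoids.length; i++) {
            double d = distance.applyAsDouble(x, medoids[i]);
            if (d < best) {
                // Strict comparison keeps the lowest index on ties.
                best = d;
                label = i;
            }
        }
        return label;
    }
}
```

With medoids {0.0, 10.0} and absolute difference as the distance, the instance 7.0 is assigned label 1 and the instance 2.0 is assigned label 0.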
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object
-
-