K-medoids clustering

K-medoids clustering (Kaufman & Rousseeuw 1990) is similar to k-means clustering, and likewise requires the user to select the number of clusters. Unlike k-means, each cluster is centred on an actual point in the data set (the medoid), rather than on the cluster mean. Importantly, k-medoids also allows any distance measure to be used, making it useful for e.g. ecological and genetic data.

The algorithm in Past follows the original PAM method described by Kaufman & Rousseeuw (1990).
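
For readers who want to see the idea in code, the following is a minimal Python sketch of the two PAM phases (BUILD and SWAP) on a precomputed distance matrix. It is an illustration under stated assumptions, not the code used in Past: the function names `total_cost` and `pam` and the toy data are hypothetical, and the sketch recomputes the full cost for every candidate instead of using the more efficient bookkeeping of the original method.

```python
def total_cost(dist, medoids):
    """Sum over all objects of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in dist)


def pam(dist, k):
    """Simplified PAM on an n x n distance matrix given as a list of lists."""
    n = len(dist)

    # BUILD: greedily add the object that lowers the total cost the most.
    medoids = []
    for _ in range(k):
        candidates = [i for i in range(n) if i not in medoids]
        best = min(candidates, key=lambda i: total_cost(dist, medoids + [i]))
        medoids.append(best)

    # SWAP: repeatedly apply the single best medoid/non-medoid exchange,
    # stopping when no exchange lowers the total cost.
    while True:
        best_cost, best_trial = total_cost(dist, medoids), None
        for pos in range(k):
            for h in range(n):
                if h in medoids:
                    continue
                trial = medoids[:pos] + [h] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    best_cost, best_trial = cost, trial
        if best_trial is None:
            break
        medoids = best_trial

    # Assign every object to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
    return medoids, labels


# Toy example (made-up data): six points on a line, absolute difference as distance.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]
medoids, labels = pam(dist, 2)
print(medoids, labels)
```

Because the function only ever reads the distance matrix, any distance measure can be substituted without changing the algorithm, which is the property noted above.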

Silhouette plot and table

The silhouette plot (Rousseeuw 1987) indicates how well each object has been classified, on a scale from -1 to 1. A value near 1 means the object is well placed in its cluster, a value near 0 means the object lies on the boundary between two clusters, and a value near -1 means the object would have been better placed in another cluster.
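
As a rough illustration of how such values arise, the sketch below computes the silhouette value s = (b - a) / max(a, b) for each object, where a is the mean distance to the other members of the object's own cluster and b is the smallest mean distance to the members of any other cluster. The function name `silhouette_values` and the toy data are hypothetical, objects in single-member clusters are given the value 0 by convention, and this is not necessarily how Past computes its table.

```python
def silhouette_values(dist, labels):
    """Silhouette value of each object, given a distance matrix and cluster labels."""
    clusters = {}
    for i, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(i)

    values = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own or len(clusters) < 2:
            # Single-member cluster (or only one cluster): use 0 by convention.
            values.append(0.0)
            continue
        # a: mean distance to the other members of the object's own cluster.
        a = sum(dist[i][j] for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist[i][j] for j in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Toy example (made-up data): six points on a line, absolute difference as distance.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(p - q) for q in points] for p in points]

good = silhouette_values(dist, [0, 0, 0, 1, 1, 1])  # natural grouping
bad = silhouette_values(dist, [0, 0, 1, 1, 1, 1])   # point 2 put in the wrong cluster
print(good)
print(bad)
```

In the second labelling, the object at position 2 receives a negative value, which corresponds to the "better placed in another cluster" case described above.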

References

Kaufman, L. & Rousseeuw, P.J. 1990. Partitioning around medoids (program PAM). Ch. 2 in Finding groups in data: An introduction to cluster analysis. John Wiley & Sons.

Rousseeuw, P.J. 1987. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Journal of Computational and Applied Mathematics 20:53–65.

Published Apr. 18, 2022 9:10 AM - Last modified Mar. 9, 2024 12:23 AM