template<typename Index_, typename Data_, typename Cluster_, typename Float_, class Matrix_ = Matrix<Index_, Data_>>
class kmeans::InitializeKmeanspp< Index_, Data_, Cluster_, Float_, Matrix_ >
k-means++ initialization of Arthur and Vassilvitskii (2007).
Starting points are selected by iterative weighted sampling: in each iteration, the probability of sampling a point is proportional to its squared distance to the closest starting point chosen in any previous iteration. The aim is to obtain well-separated starting points that encourage the formation of suitable clusters.
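The weighted sampling described above can be illustrated with a minimal standalone sketch. It is not this class's implementation and does not use the library's Matrix interface: the function name `kmeanspp_sketch`, its parameters, and the contiguous observation-by-observation data layout are all assumptions made for illustration. The first starting point is drawn uniformly, as in Arthur and Vassilvitskii (2007), and the sketch stops early if every observation coincides with an already chosen starting point.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper, not part of the kmeans library.
// Returns the indices of up to `num_centers` observations chosen as starting points.
// Assumed layout: observation i occupies data[i * num_dim] .. data[(i + 1) * num_dim - 1].
std::vector<std::size_t> kmeanspp_sketch(
    const std::vector<double>& data,
    std::size_t num_dim,
    std::size_t num_obs,
    std::size_t num_centers,
    unsigned long long seed)
{
    assert(data.size() == num_dim * num_obs);
    std::vector<std::size_t> chosen;
    if (num_obs == 0 || num_centers == 0) {
        return chosen;
    }

    std::mt19937_64 rng(seed);

    // Squared distance from each observation to its closest chosen starting point.
    std::vector<double> min_dist2(num_obs, std::numeric_limits<double>::infinity());

    // The first starting point is drawn uniformly from all observations.
    std::uniform_int_distribution<std::size_t> uniform(0, num_obs - 1);
    std::size_t latest = uniform(rng);
    chosen.push_back(latest);

    while (chosen.size() < num_centers) {
        // Only the most recently added starting point can shrink a minimum distance.
        double total = 0;
        for (std::size_t i = 0; i < num_obs; ++i) {
            double d2 = 0;
            for (std::size_t d = 0; d < num_dim; ++d) {
                const double diff = data[i * num_dim + d] - data[latest * num_dim + d];
                d2 += diff * diff;
            }
            if (d2 < min_dist2[i]) {
                min_dist2[i] = d2;
            }
            total += min_dist2[i];
        }

        // Every observation coincides with a chosen starting point; nothing left to sample.
        if (total == 0) {
            break;
        }

        // Sample the next starting point with probability proportional to min_dist2.
        // Already chosen observations have weight zero.
        std::discrete_distribution<std::size_t> weighted(min_dist2.begin(), min_dist2.end());
        latest = weighted(rng);
        chosen.push_back(latest);
    }

    return chosen;
}
```

For example, calling `kmeanspp_sketch(data, 1, 4, 2, 42)` on the one-dimensional observations `{0, 1, 10, 11}` will most often pick one index from `{0, 1}` and one from `{2, 3}`, because observations far from the first starting point carry most of the sampling weight.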
- Template Parameters
Index_ | Integer type of the observation indices. This should be the same as the index type of Matrix_. |
Data_ | Numeric type of the input dataset. This should be the same as the data type of Matrix_. |
Cluster_ | Integer type of the cluster assignments. |
Float_ | Floating-point type of the centroids. |
Matrix_ | Class satisfying the Matrix interface. |
- See also
- Arthur, D. and Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027-1035.