template<typename Index_, typename Data_, typename Cluster_, typename Float_, class Matrix_ = Matrix<Index_, Data_>>
class kmeans::InitializeKmeanspp< Index_, Data_, Cluster_, Float_, Matrix_ >
k-means++ initialization of Arthur and Vassilvitskii (2007).
This approach selects starting points by iterated weighted sampling. In each iteration, the probability of sampling a point is proportional to its squared distance to the closest starting point chosen in any previous iteration, so points that lie far from all existing starting points are more likely to be picked. The aim is to obtain well-separated starting points that encourage the formation of suitable clusters.
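The sampling scheme described above can be sketched as follows. This is a minimal illustration and not the library's implementation: the function name `kmeanspp_seed_sketch`, the flat layout with one observation per consecutive block of `ndim` values, and the uniform choice of the first point (as in the original paper) are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper for illustration only; not part of the kmeans library.
// 'data' stores 'nobs' observations, each as 'ndim' consecutive values.
// Returns the indices of the observations chosen as starting points.
// Fewer than 'ncenters' indices are returned if the data contain too few distinct points.
std::vector<std::size_t> kmeanspp_seed_sketch(
    const std::vector<double>& data,
    std::size_t ndim,
    std::size_t nobs,
    std::size_t ncenters,
    unsigned long long seed)
{
    assert(data.size() == ndim * nobs);
    assert(ncenters <= nobs);

    std::vector<std::size_t> chosen;
    if (ncenters == 0) {
        return chosen;
    }
    std::mt19937_64 rng(seed);

    // Squared distance from each point to its closest chosen starting point.
    std::vector<double> mindist(nobs, std::numeric_limits<double>::infinity());

    // Assumption: the first starting point is drawn uniformly at random.
    std::uniform_int_distribution<std::size_t> first(0, nobs - 1);
    std::size_t latest = first(rng);
    chosen.push_back(latest);

    while (true) {
        // Update the closest-distance table using the most recent starting point.
        for (std::size_t o = 0; o < nobs; ++o) {
            double d2 = 0;
            for (std::size_t d = 0; d < ndim; ++d) {
                double diff = data[o * ndim + d] - data[latest * ndim + d];
                d2 += diff * diff;
            }
            if (d2 < mindist[o]) {
                mindist[o] = d2;
            }
        }
        if (chosen.size() == ncenters) {
            break;
        }

        double total = 0;
        for (double w : mindist) {
            total += w;
        }
        if (total <= 0) {
            break; // every remaining point duplicates a chosen one
        }

        // Weighted sampling: walk the cumulative sum until it exceeds a uniform draw.
        std::uniform_real_distribution<double> unif(0.0, total);
        double target = unif(rng);
        double cumulative = 0;
        std::size_t pick = latest;
        for (std::size_t o = 0; o < nobs; ++o) {
            if (mindist[o] <= 0) {
                continue; // zero weight: already chosen, or a duplicate of a chosen point
            }
            cumulative += mindist[o];
            pick = o; // the last positive-weight point doubles as a rounding fallback
            if (cumulative > target) {
                break;
            }
        }
        latest = pick;
        chosen.push_back(latest);
    }
    return chosen;
}
```

Because a chosen point has zero distance to itself, it carries zero weight in later iterations and cannot be selected twice. The sketch recomputes the total weight on every iteration for clarity; a production implementation would likely fold this into the distance update.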
- Template Parameters
| Index_ | Integer type for the observation indices in the input dataset. |
| Data_ | Numeric type for the input dataset. |
| Cluster_ | Integer type for the cluster assignments. |
| Float_ | Floating-point type for the centroids. |
| Matrix_ | Class of the input data matrix. This should satisfy the Matrix interface. |
- See also
- Arthur, D. and Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, 1027-1035.