kmeans
A C++ library for k-means
kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ > Class Template Reference

k-means++ initialization of Arthur and Vassilvitskii (2007).

#include <InitializeKmeanspp.hpp>


Public Member Functions

 InitializeKmeanspp (InitializeKmeansppOptions options)
 
 InitializeKmeanspp ()=default
 
InitializeKmeansppOptions & get_options ()
 
Cluster_ run (const Matrix_ &data, Cluster_ num_centers, Float_ *centers) const
 

Detailed Description

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
class kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ >

k-means++ initialization of Arthur and Vassilvitskii (2007).

This approach selects starting points by iterative weighted sampling: at each iteration, the probability of sampling a point is proportional to its squared distance to the closest starting point chosen in any previous iteration. The aim is to obtain well-separated starting points that encourage the formation of suitable clusters.
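The sampling scheme described above can be sketched in a few lines of standalone C++. This is only an illustration under stated assumptions, not the library's implementation: `kmeanspp_sketch` and all of its parameters are hypothetical names, the input is assumed to be a dense column-major array (rows are dimensions, columns are observations), and the random number generator is supplied by the caller. The real class may differ in how it samples, handles ties, or parallelizes the distance updates.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper, not part of the kmeans library.
// 'data' is column-major: ndim rows (dimensions) by nobs columns (observations).
// 'centers' is resized to ndim * ncenters and filled column by column.
// Returns the number of centers actually filled.
int kmeanspp_sketch(const std::vector<double>& data, int ndim, int nobs,
                    int ncenters, std::vector<double>& centers, std::mt19937_64& rng) {
    centers.assign(static_cast<std::size_t>(ndim) * std::max(ncenters, 0), 0.0);

    // Squared distance from each observation to its closest chosen center so far.
    std::vector<double> mindist(nobs, std::numeric_limits<double>::infinity());

    int filled = 0;
    while (filled < ncenters && filled < nobs) {
        int chosen = -1;
        if (filled == 0) {
            // No centers yet, so the first one is drawn uniformly.
            chosen = std::uniform_int_distribution<int>(0, nobs - 1)(rng);
        } else {
            double total = 0;
            for (double d : mindist) total += d;
            if (total <= 0) break; // every remaining point coincides with a chosen center

            // Draw an observation with probability proportional to mindist.
            const double target = std::uniform_real_distribution<double>(0.0, total)(rng);
            double cumulative = 0;
            for (int o = 0; o < nobs; ++o) {
                if (mindist[o] <= 0) continue; // zero weight: already a center or a duplicate
                chosen = o;                    // also serves as a fallback under rounding error
                cumulative += mindist[o];
                if (cumulative > target) break;
            }
        }

        // Copy the chosen observation into the next column of 'centers'.
        const double* picked = data.data() + static_cast<std::size_t>(chosen) * ndim;
        std::copy(picked, picked + ndim, centers.data() + static_cast<std::size_t>(filled) * ndim);

        // Update each observation's squared distance to its closest center.
        for (int o = 0; o < nobs; ++o) {
            const double* obs = data.data() + static_cast<std::size_t>(o) * ndim;
            double d2 = 0;
            for (int d = 0; d < ndim; ++d) {
                const double delta = obs[d] - picked[d];
                d2 += delta * delta;
            }
            mindist[o] = std::min(mindist[o], d2);
        }
        ++filled;
    }
    return filled;
}
```

Because a chosen observation has a squared distance of zero to itself, it receives zero sampling weight afterwards and cannot be selected twice. The same reasoning explains why fewer centers than requested may be filled when the data contains too few distinct points.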

Template Parameters
Matrix_: Matrix type for the input data. This should satisfy the MockMatrix contract.
Cluster_: Integer type for the cluster assignments.
Float_: Floating-point type for the centroids.
See also
Arthur, D. and Vassilvitskii, S. (2007). k-means++: the advantages of careful seeding. Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, 1027-1035.

Constructor & Destructor Documentation

◆ InitializeKmeanspp() [1/2]

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ >::InitializeKmeanspp ( InitializeKmeansppOptions  options)
inline
Parameters
options: Options for k-means++ initialization.

◆ InitializeKmeanspp() [2/2]

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ >::InitializeKmeanspp ( )
default

Default constructor.

Member Function Documentation

◆ get_options()

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
InitializeKmeansppOptions & kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ >::get_options ( )
inline
Returns
Options for k-means++ initialization, to be modified prior to calling run().

◆ run()

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
Cluster_ kmeans::InitializeKmeanspp< Matrix_, Cluster_, Float_ >::run ( const Matrix_ & data,
Cluster_  num_centers,
Float_ * centers 
) const
inline virtual
Parameters
data: A matrix-like object (see MockMatrix) containing per-observation data.
num_centers: Number of cluster centers.
[out] centers: Pointer to an array of length equal to the product of num_centers and data.num_dimensions(). The array is treated as a column-major matrix in which rows correspond to dimensions and columns correspond to cluster centers. On output, each filled column contains the coordinates of the center chosen for one cluster.
Returns
The number of centers written to centers. This is usually equal to num_centers, but may be smaller if, e.g., num_centers is greater than the number of observations. If the returned value is less than num_centers, only the leading columns of centers, as many as the returned value, are filled.

Implements kmeans::Initialize< Matrix_, Cluster_, Float_ >.
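
The layout of centers and the meaning of the return value of run() can be illustrated with a small standalone sketch. The helper `collect_filled_centers` below is hypothetical and not part of the library. It assumes the default Float_ = double and Cluster_ = int, and it takes the number of dimensions and the returned count as plain arguments rather than calling run() itself.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the kmeans library.
// 'centers' is the column-major buffer described for run(): column c starts at
// offset c * ndim and holds the ndim coordinates of center c.
// 'filled' is the value returned by run(); columns at or beyond it are ignored.
std::vector<std::vector<double> > collect_filled_centers(const double* centers, int ndim, int filled) {
    std::vector<std::vector<double> > output;
    output.reserve(filled > 0 ? filled : 0);
    for (int c = 0; c < filled; ++c) {
        const double* start = centers + static_cast<std::size_t>(c) * ndim;
        output.emplace_back(start, start + ndim); // copy one column
    }
    return output;
}
```

A caller would allocate the buffer with num_centers * data.num_dimensions() elements before calling run(), then pass the returned count as `filled` so that unfilled trailing columns are never read.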


The documentation for this class was generated from the following file: