kmeans
A C++ library for k-means
kmeans::RefineLloyd< Matrix_, Cluster_, Float_ > Class Template Reference

Implements the Lloyd algorithm for k-means clustering.

#include <RefineLloyd.hpp>


Public Member Functions

 RefineLloyd (RefineLloydOptions options)
 
 RefineLloyd ()=default
 
RefineLloydOptions & get_options ()
 
Details< Index_ > run (const Matrix_ &data, Cluster_ num_centers, Float_ *centers, Cluster_ *clusters) const
 

Detailed Description

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
class kmeans::RefineLloyd< Matrix_, Cluster_, Float_ >

Implements the Lloyd algorithm for k-means clustering.

The Lloyd algorithm is the simplest k-means clustering algorithm, involving several iterations of batch assignments and center calculations. Specifically, we assign each observation to its closest cluster, and once all points are assigned, we recompute the cluster centroids. This is repeated until there are no reassignments or the maximum number of iterations is reached.

In the Details::status returned by run(), the status code is either 0 (success) or 2 (maximum iterations reached without convergence). Previous versions of the library would report a status code of 1 upon encountering an empty cluster, but empty clusters are now simply ignored.
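
The iteration and status codes described above can be illustrated with a minimal standalone sketch. This is not the library's implementation: `lloyd_sketch` and `LloydSketchResult` are hypothetical names, the function works on raw column-major `double` arrays rather than a `Matrix_` object, and it assumes at least one center. Skipping empty clusters during the update (leaving their centers unchanged) is one plausible reading of "ignored", not a confirmed detail of the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type for this sketch (not the library's Details class).
struct LloydSketchResult {
    int iterations; // number of assignment rounds performed
    int status;     // 0 = converged, 2 = max_iterations reached without convergence
};

// Standalone sketch of Lloyd's algorithm; assumes ncenters >= 1.
// `data` and `centers` are column-major: each column holds `ndim` values
// for one observation or one center, respectively.
inline LloydSketchResult lloyd_sketch(
    std::size_t ndim, std::size_t nobs, const double* data,
    int ncenters, double* centers, int* clusters, int max_iterations)
{
    // Invalid starting labels, so the first pass always counts as a reassignment.
    std::fill(clusters, clusters + nobs, -1);
    std::vector<double> sums(ndim * ncenters);
    std::vector<std::size_t> sizes(ncenters);

    for (int iter = 1; iter <= max_iterations; ++iter) {
        // Batch assignment: each observation goes to its closest center.
        bool changed = false;
        for (std::size_t o = 0; o < nobs; ++o) {
            int best = 0;
            double best_dist = std::numeric_limits<double>::infinity();
            for (int c = 0; c < ncenters; ++c) {
                double dist = 0;
                for (std::size_t d = 0; d < ndim; ++d) {
                    double diff = data[o * ndim + d] - centers[c * ndim + d];
                    dist += diff * diff;
                }
                if (dist < best_dist) {
                    best_dist = dist;
                    best = c;
                }
            }
            if (clusters[o] != best) {
                clusters[o] = best;
                changed = true;
            }
        }
        if (!changed) {
            return { iter, 0 }; // no reassignments: converged
        }

        // Center update: each centroid becomes the mean of its members.
        std::fill(sums.begin(), sums.end(), 0.0);
        std::fill(sizes.begin(), sizes.end(), std::size_t(0));
        for (std::size_t o = 0; o < nobs; ++o) {
            int c = clusters[o];
            ++sizes[c];
            for (std::size_t d = 0; d < ndim; ++d) {
                sums[c * ndim + d] += data[o * ndim + d];
            }
        }
        for (int c = 0; c < ncenters; ++c) {
            if (sizes[c] == 0) {
                continue; // empty cluster: skipped, center left as-is (assumption)
            }
            for (std::size_t d = 0; d < ndim; ++d) {
                centers[c * ndim + d] = sums[c * ndim + d] / sizes[c];
            }
        }
    }
    return { max_iterations, 2 };
}
```

A caller of this sketch would check `status` after the call and, on a value of 2, either accept the current centers or rerun with a larger iteration limit.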

Template Parameters
Matrix_  Matrix type for the input data. This should satisfy the MockMatrix contract.
Cluster_  Integer type for the cluster assignments.
Float_  Floating-point type for the centroids.
See also
Lloyd, S. P. (1982).
Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 128-137.

Constructor & Destructor Documentation

◆ RefineLloyd() [1/2]

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
kmeans::RefineLloyd< Matrix_, Cluster_, Float_ >::RefineLloyd ( RefineLloydOptions  options)
inline
Parameters
options  Further options to the Lloyd algorithm.

◆ RefineLloyd() [2/2]

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
kmeans::RefineLloyd< Matrix_, Cluster_, Float_ >::RefineLloyd ( )
default

Default constructor.

Member Function Documentation

◆ get_options()

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
RefineLloydOptions & kmeans::RefineLloyd< Matrix_, Cluster_, Float_ >::get_options ( )
inline
Returns
Options for Lloyd clustering, to be modified prior to calling run().

◆ run()

template<typename Matrix_ = SimpleMatrix<double, int>, typename Cluster_ = int, typename Float_ = double>
Details< Index_ > kmeans::RefineLloyd< Matrix_, Cluster_, Float_ >::run ( const Matrix_ & data,
Cluster_  num_centers,
Float_ * centers,
Cluster_ * clusters 
) const
inline virtual
Parameters
data  A matrix-like object (see MockMatrix) containing per-observation data.
num_centers  Number of cluster centers.
[in,out] centers  Pointer to an array of length equal to the product of num_centers and data.num_dimensions(). This contains a column-major matrix where rows correspond to dimensions and columns correspond to cluster centers. On input, each column should contain the initial centroid location for its cluster. On output, each column will contain the final centroid location for its cluster.
[out] clusters  Pointer to an array of length equal to the number of observations (from data.num_observations()). On output, this will contain the cluster assignment for each observation.
Returns
centers and clusters are filled, and a Details object is returned containing clustering statistics. If num_centers is greater than data.num_observations(), only the first data.num_observations() columns of the centers array will be filled.
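
The column-major layout of centers can be illustrated with two small helpers. Both are hypothetical and not part of the library: `extract_center` copies one column out of the buffer, and `filled_centers` gives the number of columns that will be written, following the rule stated above for the case where num_centers exceeds the number of observations. The sketch assumes `double` centroids.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: copy the coordinates of center `c` out of a
// column-major buffer in which each column holds `ndim` values.
inline std::vector<double> extract_center(const double* centers, std::size_t ndim, std::size_t c) {
    const double* start = centers + c * ndim; // column c begins at offset c * ndim
    return std::vector<double>(start, start + ndim);
}

// Hypothetical helper: number of leading columns of `centers` that are
// filled, given the documented rule for num_centers > num_observations.
inline std::size_t filled_centers(std::size_t num_centers, std::size_t num_observations) {
    return std::min(num_centers, num_observations);
}
```

With this layout, dimension d of center c sits at index `c * ndim + d`, so a buffer for the call needs `ndim * num_centers` elements even when fewer columns end up being filled.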

Implements kmeans::Refine< Matrix_, Cluster_, Float_ >.


The documentation for this class was generated from the following file: