kmeans
k-means clustering in C++
Interface for k-means refinement algorithms.
#include <Refine.hpp>

Public Member Functions
| virtual Details< Index_ > | run (const Matrix_ &data, Cluster_ num_centers, Float_ *centers, Cluster_ *clusters) const =0 |
Interface for k-means refinement algorithms.
| Index_ | Integer type of the observation indices. This should be the same as the index type of Matrix_. |
| Data_ | Numeric type of the input dataset. This should be the same as the data type of Matrix_. |
| Cluster_ | Integer type of the cluster assignments. |
| Float_ | Floating-point type of the centroids. |
| Matrix_ | Class satisfying the Matrix interface. |
pure virtual
| | data | A matrix containing data for each observation. |
| | num_centers | Number of cluster centers. |
| [in,out] | centers | Pointer to an array of length equal to the product of num_centers and data.num_dimensions(). This contains a column-major matrix where rows correspond to dimensions and columns correspond to cluster centers. On input, column j should contain the initial centroid location for cluster j. On output, each column will contain the final centroid location for its corresponding cluster. |
| [out] | clusters | Pointer to an array of length equal to the number of observations (from data.num_observations()). On output, this will contain the 0-based cluster assignment for each observation. Specifically, each entry is an index that refers to a column of centers and is less than num_centers. |
centers and clusters are filled, and an object containing clustering statistics is returned.

Some columns of centers may not be represented in the output clusters, i.e., some clusters may be unused. The remove_unused_centers() function rearranges the cluster assignments so that these empty clusters are easier to skip. In practice, empty clusters should be rare if the initial centroids are chosen appropriately.
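The array layout expected by run() can be illustrated with a minimal, self-contained sketch. It does not use the kmeans library or any Refine subclass: assign_nearest() and count_cluster_sizes() are hypothetical helpers written for this example, and the data is assumed to be a plain column-major array rather than a Matrix_ object. The sketch only shows how a column-major centers array is indexed, how 0-based assignments are written to clusters, and how an unused cluster shows up as a zero count.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not part of the kmeans library): assigns each
// observation to its nearest centroid by squared Euclidean distance.
// - data: column-major, num_dims rows by num_obs columns (assumed layout).
// - centers: column-major, num_dims rows by num_centers columns.
// - clusters: output array of length num_obs, filled with 0-based indices.
template<typename Index_, typename Data_, typename Cluster_, typename Float_>
void assign_nearest(std::size_t num_dims, Index_ num_obs, const Data_* data,
                    Cluster_ num_centers, const Float_* centers, Cluster_* clusters) {
    assert(num_centers > 0);
    for (Index_ o = 0; o < num_obs; ++o) {
        // Column o of the data holds all dimensions of observation o.
        const Data_* obs = data + static_cast<std::size_t>(o) * num_dims;
        Float_ best_dist = std::numeric_limits<Float_>::max();
        Cluster_ best = 0;
        for (Cluster_ c = 0; c < num_centers; ++c) {
            // Column c of the centers holds the centroid of cluster c.
            const Float_* center = centers + static_cast<std::size_t>(c) * num_dims;
            Float_ dist = 0;
            for (std::size_t d = 0; d < num_dims; ++d) {
                Float_ delta = static_cast<Float_>(obs[d]) - center[d];
                dist += delta * delta;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best = c;
            }
        }
        clusters[o] = best; // always less than num_centers
    }
}

// Hypothetical helper: counts the observations assigned to each cluster.
// A count of zero marks a column of centers that no observation uses.
template<typename Index_, typename Cluster_>
std::vector<Index_> count_cluster_sizes(Index_ num_obs, const Cluster_* clusters,
                                        Cluster_ num_centers) {
    std::vector<Index_> sizes(num_centers, 0);
    for (Index_ o = 0; o < num_obs; ++o) {
        assert(clusters[o] >= 0 && clusters[o] < num_centers);
        ++sizes[clusters[o]];
    }
    return sizes;
}
```

For instance, with one dimension, observations at 0.0, 0.1, 5.0 and 5.2, and initial centers at 0.0, 5.0 and 100.0, the third center attracts no observations, so its count is zero. This is the kind of unused cluster that the text above refers to.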