irlba
A C++ library for IRLBA
Sparse matrix with customizable parallelization.
#include <parallel.hpp>
Public Types
typedef std::remove_const< typename std::remove_reference< decltype(std::declval< PointerArray_ >()[0])>::type >::type | PointerType |
Public Member Functions
| ParallelSparseMatrix () |
| ParallelSparseMatrix (Eigen::Index nrow, Eigen::Index ncol, ValueArray_ x, IndexArray_ i, PointerArray_ p, bool column_major, int nthreads) |
Eigen::Index | rows () const |
Eigen::Index | cols () const |
const ValueArray_ & | get_values () const |
const IndexArray_ & | get_indices () const |
const PointerArray_ & | get_pointers () const |
const std::vector< size_t > & | get_primary_starts () const |
const std::vector< size_t > & | get_primary_ends () const |
const std::vector< std::vector< PointerType > > & | get_secondary_nonzero_starts () const |
This provides an alternative to Eigen::SparseMatrix for parallelized multiplication of compressed sparse matrices. Unlike Eigen, this implementation can parallelize even when the multiplication does not align well with the storage layout, e.g., multiplication of a compressed sparse column matrix by a dense vector on the right-hand side. On construction, it also pre-allocates the rows and/or columns to each thread, aiming to balance the number of non-zero elements that each thread needs to process. All subsequent multiplications then reuse these allocations, which is useful for compute(), where the cost of pre-allocation is amortized over repeated multiplication calls.
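The balancing idea can be illustrated with a minimal sketch. This is not the library's actual allocation code: `balance_primary` and `PrimaryRanges` are hypothetical names, and treating each end as one-past-the-last is an assumption made for the example. It splits the primary dimension (columns for a compressed sparse column matrix) into contiguous blocks so that each thread receives roughly the same number of non-zero elements, using only the pointer array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the per-thread ranges (not part of irlba).
// Each range is [starts[t], ends[t]); the half-open convention is an assumption.
struct PrimaryRanges {
    std::vector<std::size_t> starts, ends;
};

// Hypothetical helper: split the primary dimension into `nthreads` contiguous
// blocks with roughly equal numbers of non-zero elements, given the pointer
// array `p` of a compressed sparse matrix (length = primary dimension + 1).
inline PrimaryRanges balance_primary(const std::vector<std::size_t>& p, int nthreads) {
    assert(nthreads > 0 && !p.empty());
    const std::size_t nprimary = p.size() - 1;
    const std::size_t total = p.back(); // total number of non-zero elements
    PrimaryRanges out;
    std::size_t current = 0;
    for (int t = 0; t < nthreads; ++t) {
        out.starts.push_back(current);
        // Cumulative number of non-zeros that should be covered once
        // thread t's block is complete.
        const std::size_t target = total * static_cast<std::size_t>(t + 1) / static_cast<std::size_t>(nthreads);
        while (current < nprimary && p[current + 1] <= target) {
            ++current;
        }
        out.ends.push_back(current);
    }
    return out;
}
```

Because the ranges are computed once from `p`, they can be reused by every later multiplication, which is the source of the amortization described above.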
Some cursory testing indicates that the performance of this implementation is comparable to Eigen for OpenMP-based parallelization. However, the real purpose of this class is to support custom parallelization schemes in cases where OpenMP is not available. This is achieved by defining the IRLBA_CUSTOM_PARALLEL macro as the name of a function implementing a custom scheme. Such a function should accept two arguments: an integer specifying the number of threads, and a lambda that accepts a thread number. It should then loop over the number of threads and launch one job for each thread via the lambda. Once all threads are complete, the function should return.
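A function meeting this contract might look like the following sketch, which uses std::thread as the custom scheme. The name `my_parallel` is hypothetical, and the commented-out macro definition assumes that the macro has to be defined before the irlba headers are included.

```cpp
#include <thread>
#include <vector>

// Hypothetical custom scheme based on std::thread (not part of irlba).
// `fun` is any callable that accepts a thread number.
template<class Function_>
void my_parallel(int nthreads, Function_ fun) {
    std::vector<std::thread> workers;
    workers.reserve(static_cast<std::size_t>(nthreads));
    for (int t = 0; t < nthreads; ++t) {
        workers.emplace_back(fun, t); // launch one job per thread number
    }
    for (auto& w : workers) {
        w.join(); // return only once every job is complete
    }
}

// Assumed usage: define the macro before including the irlba headers.
// #define IRLBA_CUSTOM_PARALLEL my_parallel
```

Any scheduler with the same shape (a thread pool, a platform-specific API) could be substituted, as long as the function does not return until every job has finished.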
This class satisfies the MockMatrix interface and implements all of its methods/typedefs.
Template Parameters
ValueArray_ | Array class containing the numeric values of the non-zero elements. Should support a read-only [] operator. |
IndexArray_ | Array class containing the integer indices of the non-zero elements. Should support a read-only [] operator. |
PointerArray_ | Array class containing the integer pointers to the row/column boundaries. Should support a read-only [] operator. |
EigenVector_ | A floating-point Eigen::Vector class. |
typedef std::remove_const< typename std::remove_reference< decltype(std::declval< PointerArray_ >()[0])>::type >::type irlba::ParallelSparseMatrix< ValueArray_, IndexArray_, PointerArray_, EigenVector_ >::PointerType
Type of the elements inside a PointerArray_.
ParallelSparseMatrix () | inline |
Default constructor. The resulting object cannot be used for any operations.
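ParallelSparseMatrix (Eigen::Index nrow, Eigen::Index ncol, ValueArray_ x, IndexArray_ i, PointerArray_ p, bool column_major, int nthreads) | inline |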
Parameters
nrow | Number of rows. |
ncol | Number of columns. |
x | Values of non-zero elements. |
i | Indices of non-zero elements. Each entry corresponds to a value in x, so i should have the same length as x. If column_major = true, i should contain row indices; otherwise it should contain column indices. |
p | Pointers to the start of each column (if column_major = true) or row (otherwise). This should be a non-decreasing array of length equal to the number of columns or rows plus 1. |
column_major | Whether the matrix is in compressed sparse column format. If false, it is assumed to be in compressed sparse row format. |
nthreads | Number of threads to be used for multiplication. |
x, i and p represent the typical components of a compressed sparse column/row matrix. Thus, entries in i should be sorted within each column/row, where the boundaries between columns/rows are defined by p.
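To make the expected layout of x, i and p concrete, the sketch below derives them from a small dense matrix. `CscArrays` and `dense_to_csc` are hypothetical names introduced only for this illustration, and the element types (double, int, std::size_t) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the three arrays (not part of irlba).
struct CscArrays {
    std::vector<double> x;      // values of non-zero elements
    std::vector<int> i;         // row index of each value
    std::vector<std::size_t> p; // start of each column, plus one final entry
};

// Hypothetical helper: convert a dense row-major matrix into
// compressed sparse column arrays.
inline CscArrays dense_to_csc(std::size_t nrow, std::size_t ncol, const std::vector<double>& dense) {
    assert(dense.size() == nrow * ncol);
    CscArrays out;
    out.p.push_back(0);
    for (std::size_t c = 0; c < ncol; ++c) {
        // Visiting rows in increasing order keeps `i` sorted within each column.
        for (std::size_t r = 0; r < nrow; ++r) {
            const double v = dense[r * ncol + c];
            if (v != 0) {
                out.x.push_back(v);
                out.i.push_back(static_cast<int>(r));
            }
        }
        out.p.push_back(out.x.size()); // boundary after column c
    }
    return out;
}
```

For the 3-by-3 matrix with rows (1, 0, 2), (0, 0, 3) and (4, 5, 0), this gives x = {1, 4, 5, 2, 3}, i = {0, 2, 2, 0, 1} and p = {0, 2, 3, 5}. These would then be passed to the constructor as nrow = 3, ncol = 3, x, i, p and column_major = true, together with the desired nthreads.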
Eigen::Index cols () const | inline |
Returns the number of columns.
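const IndexArray_ & get_indices () const | inline |
Returns the indices of the non-zero elements, equivalent to i in the constructor. These are row or column indices for compressed sparse column or row format, respectively, depending on column_major.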
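const PointerArray_ & get_pointers () const | inline |
Returns the pointers to the start of each column or row, equivalent to p in the constructor.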
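const std::vector< size_t > & get_primary_ends () const | inline |
This should only be called if nthreads > 1 in the constructor, otherwise it will not be initialized.
Returns the end of the range along the primary dimension (columns if column_major = true) that each thread works on.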
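const std::vector< size_t > & get_primary_starts () const | inline |
This should only be called if nthreads > 1 in the constructor, otherwise it will not be initialized.
Returns the start of the range along the primary dimension (columns if column_major = true) that each thread works on.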
const std::vector< std::vector< PointerType > > & get_secondary_nonzero_starts () const | inline |
This should only be called if nthreads > 1 in the constructor, otherwise it will not be initialized.
Returns a vector of vectors with one entry per primary dimension (each column if column_major = true). For thread i, the vectors i and i + 1 define the ranges of non-zero elements assigned to that thread within each primary dimension. Each range is guaranteed to contain all and only the non-zero elements with indices in a contiguous range of the secondary dimension.
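One way to picture these ranges is the sketch below, which computes them for a compressed sparse column matrix from a given split of the rows. It is an illustration rather than the library's implementation: `secondary_nonzero_starts` and its `boundaries` argument are hypothetical, and it relies on the indices being sorted within each column.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of irlba). `boundaries` has one entry per
// thread plus one, running from 0 to the number of rows; thread t is assumed
// to own rows [boundaries[t], boundaries[t + 1]).
inline std::vector<std::vector<std::size_t>> secondary_nonzero_starts(
    const std::vector<int>& i,
    const std::vector<std::size_t>& p,
    const std::vector<int>& boundaries)
{
    const std::size_t nprimary = p.size() - 1;
    std::vector<std::vector<std::size_t>> out(boundaries.size(), std::vector<std::size_t>(nprimary));
    for (std::size_t b = 0; b < boundaries.size(); ++b) {
        for (std::size_t c = 0; c < nprimary; ++c) {
            auto first = i.begin() + static_cast<std::ptrdiff_t>(p[c]);
            auto last = i.begin() + static_cast<std::ptrdiff_t>(p[c + 1]);
            // Position of the first non-zero in column c whose row index
            // is at or beyond this boundary.
            auto it = std::lower_bound(first, last, boundaries[b]);
            out[b][c] = static_cast<std::size_t>(it - i.begin());
        }
    }
    return out;
}
```

With i = {0, 2, 2, 0, 1}, p = {0, 2, 3, 5} and row boundaries {0, 2, 3}, the result is {0, 2, 3}, {1, 2, 5} and {2, 3, 5}. The first thread therefore handles elements [0, 1), [2, 2) and [3, 5) in the three columns, and the second thread handles [1, 2), [2, 3) and [5, 5).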
const ValueArray_ & get_values () const | inline |
Returns the values of the non-zero elements, equivalent to x in the constructor.
Eigen::Index rows () const | inline |
Returns the number of rows.