scran
C++ library for basic single-cell RNA-seq analyses
Perform a simple PCA on a gene-cell matrix.
#include <SimplePca.hpp>
Classes
struct | Defaults | Default parameter settings.
struct | Results | Container for the PCA results.
Public Member Functions
SimplePca & | set_rank (int r=Defaults::rank)
SimplePca & | set_scale (bool s=Defaults::scale)
SimplePca & | set_transpose (bool t=Defaults::transpose)
SimplePca & | set_return_rotation (bool r=Defaults::return_rotation)
SimplePca & | set_return_center (bool r=Defaults::return_center)
SimplePca & | set_return_scale (bool r=Defaults::return_scale)
SimplePca & | set_num_threads (int n=Defaults::num_threads)
template<typename T, typename IDX>
Results | run (const tatami::Matrix<T, IDX> *mat) const
template<typename T, typename IDX, typename X>
Results | run (const tatami::Matrix<T, IDX> *mat, const X *features) const
Perform a simple PCA on a gene-cell matrix.
Principal components analysis (PCA) is a helpful technique for data compression and denoising. The idea is that the earlier PCs capture most of the systematic biological variation while the later PCs capture random technical noise. Thus, we can reduce the size of the data and eliminate noise by only using the earlier PCs for further analyses. Most practitioners will keep the first 10-50 PCs, though the exact choice is fairly arbitrary. For speed, we use the CppIrlba package to perform an approximate PCA.
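To make the idea concrete, the following is a minimal, self-contained sketch of extracting the first PC from a small dense gene-by-cell matrix. It uses plain power iteration as a stand-in and is not the IRLBA algorithm from CppIrlba; the `Matrix` alias and the functions `center_genes`, `project` and `first_rotation` are hypothetical names introduced only for this illustration and are not part of the library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense gene-by-cell matrix: one inner vector per gene (row),
// one entry per cell (column). All rows are assumed to have the same length.
using Matrix = std::vector<std::vector<double>>;

// Subtract each gene's mean across cells so that every row is centered at zero.
void center_genes(Matrix& mat) {
    for (auto& gene : mat) {
        assert(!gene.empty());
        double mean = 0;
        for (double v : gene) mean += v;
        mean /= static_cast<double>(gene.size());
        for (double& v : gene) v -= mean;
    }
}

// Project every cell onto a vector of per-gene loadings, giving one score per cell.
std::vector<double> project(const Matrix& centered, const std::vector<double>& w) {
    assert(!centered.empty() && centered.size() == w.size());
    std::vector<double> scores(centered.front().size(), 0.0);
    for (std::size_t g = 0; g < centered.size(); ++g) {
        for (std::size_t c = 0; c < scores.size(); ++c) {
            scores[c] += centered[g][c] * w[g];
        }
    }
    return scores;
}

// Estimate the loadings of the first PC by power iteration on X * X^T,
// applied implicitly as X * (X^T * w) and renormalized at every step.
std::vector<double> first_rotation(const Matrix& centered, int iterations = 100) {
    assert(!centered.empty());
    std::vector<double> w(centered.size(), 1.0);
    for (int it = 0; it < iterations; ++it) {
        const std::vector<double> scores = project(centered, w);
        std::vector<double> next(w.size(), 0.0);
        double norm = 0;
        for (std::size_t g = 0; g < centered.size(); ++g) {
            for (std::size_t c = 0; c < scores.size(); ++c) {
                next[g] += centered[g][c] * scores[c];
            }
            norm += next[g] * next[g];
        }
        norm = std::sqrt(norm);
        if (norm == 0) break; // no variation left to capture; keep the current vector
        for (std::size_t g = 0; g < w.size(); ++g) w[g] = next[g] / norm;
    }
    return w;
}
```

Calling `center_genes`, then `first_rotation`, then `project` yields one score per cell on the first PC. For two perfectly correlated genes where one is exactly twice the other, the loadings come out proportional to (1, 2), so a single PC captures all of the variation.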
inline SimplePca& set_rank(int r = Defaults::rank)
inline SimplePca& set_scale(bool s = Defaults::scale)
s | Should genes be scaled to unit variance?
Returns: A reference to this SimplePca instance.
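As an illustration of what scaling to unit variance means for a single gene, here is a minimal sketch. The function name `standardize_gene` is hypothetical, and two details are assumptions rather than documented library behavior: the use of the sample variance (denominator n - 1) and leaving zero-variance genes centered but unscaled.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Center one gene's values across cells and divide by the sample standard
// deviation. Returns the standard deviation that was computed; if it is zero
// (a constant gene), the values are centered but left unscaled.
double standardize_gene(std::vector<double>& values) {
    assert(values.size() > 1);
    const double n = static_cast<double>(values.size());

    double mean = 0;
    for (double v : values) mean += v;
    mean /= n;

    double sum_squares = 0;
    for (double& v : values) {
        v -= mean;
        sum_squares += v * v;
    }

    const double sd = std::sqrt(sum_squares / (n - 1));
    if (sd > 0) {
        for (double& v : values) v /= sd;
    }
    return sd;
}
```

After scaling, every gene with non-zero variance contributes equally to the total variance, so highly expressed genes no longer dominate the leading PCs purely because of their magnitude.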
inline SimplePca& set_transpose(bool t = Defaults::transpose)
t | Should the PC matrix be transposed on output? If true, the output matrix is column-major with cells in the columns, which is compatible with downstream libscran steps.
Returns: A reference to this SimplePca instance.
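The layout described above can be sketched with a flat buffer. This is only an illustration of column-major indexing with cells in the columns; the use of a `std::vector<double>` buffer and the names `pc_coordinate` and `cells_to_columns` are assumptions, and the actual storage type of the Results object is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Look up the coordinate of `cell` on principal component `pc` in a flat
// column-major buffer with `rank` rows (PCs) and one column per cell.
// In this layout, all coordinates of a given cell are contiguous in memory.
double pc_coordinate(const std::vector<double>& buffer, std::size_t rank,
                     std::size_t pc, std::size_t cell) {
    assert(pc < rank);
    assert((cell + 1) * rank <= buffer.size());
    return buffer[cell * rank + pc];
}

// Convert a column-major buffer with cells in the rows (ncells x rank) into
// a column-major buffer with cells in the columns (rank x ncells).
std::vector<double> cells_to_columns(const std::vector<double>& input,
                                     std::size_t ncells, std::size_t rank) {
    assert(input.size() == ncells * rank);
    std::vector<double> output(input.size());
    for (std::size_t cell = 0; cell < ncells; ++cell) {
        for (std::size_t pc = 0; pc < rank; ++pc) {
            output[cell * rank + pc] = input[pc * ncells + cell];
        }
    }
    return output;
}
```

Keeping each cell's coordinates contiguous is convenient for downstream steps that iterate over cells, such as distance calculations between pairs of cells.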
inline SimplePca& set_return_rotation(bool r = Defaults::return_rotation)
r | Should the rotation matrix be returned in the output?
Returns: A reference to this SimplePca instance.
inline SimplePca& set_return_center(bool r = Defaults::return_center)
r | Should the center vector be returned in the output?
Returns: A reference to this SimplePca instance.
inline SimplePca& set_return_scale(bool r = Defaults::return_scale)
r | Should the scale vector be returned in the output?
Returns: A reference to this SimplePca instance.
inline SimplePca& set_num_threads(int n = Defaults::num_threads)
n | Number of threads to use.
Returns: A reference to this SimplePca instance.
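Because every setter returns a reference to the instance, calls can be chained. The sketch below demonstrates the pattern with a hypothetical stand-in class, `PcaOptions`, that copies three of the setter names listed above. It is not the real SimplePca class, and its default values are placeholders rather than the values in Defaults.

```cpp
#include <cassert>

// Hypothetical stand-in that mimics the chained-setter style.
// NOT the real SimplePca class; the defaults below are placeholders.
class PcaOptions {
public:
    PcaOptions& set_rank(int r) {
        assert(r > 0); // a PCA needs at least one component
        rank = r;
        return *this;  // returning the instance enables chaining
    }

    PcaOptions& set_scale(bool s) {
        scale = s;
        return *this;
    }

    PcaOptions& set_num_threads(int n) {
        assert(n > 0);
        num_threads = n;
        return *this;
    }

    int rank = 10;
    bool scale = false;
    int num_threads = 1;
};

// Configure all three options in a single chained expression.
PcaOptions configure_example() {
    PcaOptions opts;
    opts.set_rank(20).set_scale(true).set_num_threads(4);
    return opts;
}
```

The same style should apply to a SimplePca instance, where the chain of setters would typically be followed by a call to run().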
template<typename T, typename IDX>
inline Results run(const tatami::Matrix<T, IDX>* mat) const
Run PCA on an input gene-by-cell matrix.
T | Floating point type for the data.
IDX | Integer type for the indices.
[in] | mat | Pointer to the input matrix. Columns should contain cells while rows should contain genes.
Returns: A Results object containing the PCs and the variance explained.
template<typename T, typename IDX, typename X>
inline Results run(const tatami::Matrix<T, IDX>* mat, const X* features) const
Run PCA on an input gene-by-cell matrix after filtering for genes of interest. We typically use the set of highly variable genes from ChooseHVGs, with the aim of improving computational efficiency and reducing random noise by removing lowly variable genes.
T | Floating point type for the data.
IDX | Integer type for the indices.
X | Integer type for the feature filter.
[in] | mat | Pointer to the input matrix. Columns should contain cells while rows should contain genes.
[in] | features | Pointer to an array of length equal to the number of genes. Each entry is treated as a boolean specifying whether the corresponding gene should be used in the PCA.
Returns: A Results object containing the PCs and the variance explained.
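The feature filter is simply an array with one entry per gene. A minimal sketch of building such an array follows, keeping a fixed number of genes with the largest variance. This is a simple stand-in for illustration and does not reproduce ChooseHVGs; the function name `top_variance_filter` and the precomputed `variances` input are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Build a filter with one entry per gene: 1 for the `top` genes with the
// largest variance, 0 for all others.
std::vector<int> top_variance_filter(const std::vector<double>& variances,
                                     std::size_t top) {
    const std::size_t ngenes = variances.size();
    assert(top <= ngenes);

    // Order gene indices so that the `top` most variable genes come first.
    std::vector<std::size_t> order(ngenes);
    std::iota(order.begin(), order.end(), 0);
    std::partial_sort(order.begin(), order.begin() + top, order.end(),
                      [&](std::size_t a, std::size_t b) {
                          return variances[a] > variances[b];
                      });

    std::vector<int> filter(ngenes, 0);
    for (std::size_t i = 0; i < top; ++i) {
        filter[order[i]] = 1;
    }
    return filter;
}
```

The pointer from `filter.data()` has the documented shape for the features argument: its length equals the number of genes, and each entry is interpreted as a boolean.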