scran
C++ library for basic single-cell RNA-seq analyses
scran::SimplePca Class Reference

Perform a simple PCA on a gene-cell matrix.

#include <SimplePca.hpp>

Classes

struct  Defaults
 Default parameter settings.
 
struct  Results
 Container for the PCA results.
 

Public Member Functions

SimplePca &  set_rank (int r=Defaults::rank)
 
SimplePca &  set_scale (bool s=Defaults::scale)
 
SimplePca &  set_transpose (bool t=Defaults::transpose)
 
SimplePca &  set_return_rotation (bool r=Defaults::return_rotation)
 
SimplePca &  set_return_center (bool r=Defaults::return_center)
 
SimplePca &  set_return_scale (bool r=Defaults::return_scale)
 
SimplePca &  set_num_threads (int n=Defaults::num_threads)
 
template<typename T , typename IDX >
Results run (const tatami::Matrix< T, IDX > *mat) const
 
template<typename T , typename IDX , typename X >
Results run (const tatami::Matrix< T, IDX > *mat, const X *features) const
 

Detailed Description

Perform a simple PCA on a gene-cell matrix.

Principal components analysis (PCA) is a helpful technique for data compression and denoising. The idea is that the earlier PCs capture most of the systematic biological variation while the later PCs capture random technical noise. Thus, we can reduce the size of the data and eliminate noise by only using the earlier PCs for further analyses. Most practitioners will keep the first 10-50 PCs, though the exact choice is fairly arbitrary. For speed, we use the CppIrlba package to perform an approximate PCA.
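
To make the idea concrete, the following is a minimal sketch of extracting the first PC from a small dense matrix. It uses exact power iteration on the gene-by-gene covariance matrix, which is a swapped-in technique chosen for brevity and is not the approximate IRLBA method that the library uses. The names `FirstPc` and `first_pc` are hypothetical and are not part of scran.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not scran's Results).
struct FirstPc {
    std::vector<double> rotation; // unit-length loading, one entry per gene
    std::vector<double> scores;   // coordinate of each cell along the PC
    double variance;              // variance explained by the PC
    double total_variance;        // sum of the per-gene variances
};

// x is dense and row-major, with genes in rows and cells in columns.
// It is taken by value because it is centered in place.
FirstPc first_pc(std::vector<double> x, std::size_t ngenes, std::size_t ncells, int iterations = 100) {
    assert(x.size() == ngenes * ncells && ncells > 1);

    // Center each gene so that the PCA describes variation around the mean.
    for (std::size_t g = 0; g < ngenes; ++g) {
        double mean = 0;
        for (std::size_t c = 0; c < ncells; ++c) mean += x[g * ncells + c];
        mean /= ncells;
        for (std::size_t c = 0; c < ncells; ++c) x[g * ncells + c] -= mean;
    }

    // Gene-by-gene covariance matrix; its trace is the total variance.
    std::vector<double> cov(ngenes * ngenes, 0.0);
    double total = 0;
    for (std::size_t i = 0; i < ngenes; ++i) {
        for (std::size_t j = 0; j < ngenes; ++j) {
            double s = 0;
            for (std::size_t c = 0; c < ncells; ++c) s += x[i * ncells + c] * x[j * ncells + c];
            cov[i * ngenes + j] = s / (ncells - 1);
        }
        total += cov[i * ngenes + i];
    }

    // Arbitrary nonzero starting vector, normalized to unit length.
    std::vector<double> v(ngenes), w(ngenes);
    double start_norm = 0;
    for (std::size_t g = 0; g < ngenes; ++g) {
        v[g] = 1.0 + g;
        start_norm += v[g] * v[g];
    }
    start_norm = std::sqrt(start_norm);
    for (std::size_t g = 0; g < ngenes; ++g) v[g] /= start_norm;

    // Power iteration: repeatedly apply the covariance matrix and renormalize.
    for (int it = 0; it < iterations; ++it) {
        double norm = 0;
        for (std::size_t i = 0; i < ngenes; ++i) {
            w[i] = 0;
            for (std::size_t j = 0; j < ngenes; ++j) w[i] += cov[i * ngenes + j] * v[j];
            norm += w[i] * w[i];
        }
        norm = std::sqrt(norm);
        if (norm == 0) break; // the data has no variance in this direction
        for (std::size_t i = 0; i < ngenes; ++i) v[i] = w[i] / norm;
    }

    FirstPc out{v, std::vector<double>(ncells, 0.0), 0.0, total};
    // The Rayleigh quotient gives the variance explained by the PC.
    for (std::size_t i = 0; i < ngenes; ++i) {
        for (std::size_t j = 0; j < ngenes; ++j) out.variance += v[i] * cov[i * ngenes + j] * v[j];
    }
    // Project each centered cell onto the PC to obtain its score.
    for (std::size_t g = 0; g < ngenes; ++g) {
        for (std::size_t c = 0; c < ncells; ++c) out.scores[c] += v[g] * x[g * ncells + c];
    }
    return out;
}
```

Dividing `variance` by `total_variance` gives the proportion of variance captured by the PC, which is the quantity usually inspected when deciding how many PCs to keep.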

Member Function Documentation

◆ set_rank()

SimplePca & scran::SimplePca::set_rank ( int  r = Defaults::rank)
inline
Parameters
r  Number of PCs to compute. This should be no greater than the maximum number of PCs, i.e., the smaller dimension of the input matrix; otherwise, only the maximum number of PCs will be reported in the Results.
Returns
A reference to this SimplePca instance.
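
The cap on the number of reported PCs can be sketched as below. This only restates the rule in the parameter description; `effective_rank` is a hypothetical helper and not a scran function.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: the number of PCs that can actually be reported,
// given the requested rank and the dimensions of the input matrix.
constexpr std::size_t effective_rank(std::size_t requested, std::size_t ngenes, std::size_t ncells) {
    return std::min(requested, std::min(ngenes, ncells));
}
```

For example, requesting 50 PCs from a matrix with 2000 genes and 30 cells would yield at most 30 PCs under this rule.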

◆ set_scale()

SimplePca & scran::SimplePca::set_scale ( bool  s = Defaults::scale)
inline
Parameters
s  Should genes be scaled to unit variance?
Returns
A reference to this SimplePca instance.
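
As a rough illustration of what scaling to unit variance means, the sketch below centers each gene and divides it by its sample standard deviation. The helper `standardize_genes` is hypothetical, and leaving zero-variance genes at zero is an assumption of this sketch rather than documented library behavior.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: centers and scales each gene (row) of a dense
// row-major matrix in place, returning the standard deviation of each gene.
std::vector<double> standardize_genes(std::vector<double>& x, std::size_t ngenes, std::size_t ncells) {
    assert(x.size() == ngenes * ncells && ncells > 1);
    std::vector<double> sds(ngenes);
    for (std::size_t g = 0; g < ngenes; ++g) {
        double* row = x.data() + g * ncells;

        double mean = 0;
        for (std::size_t c = 0; c < ncells; ++c) mean += row[c];
        mean /= ncells;

        double ss = 0;
        for (std::size_t c = 0; c < ncells; ++c) ss += (row[c] - mean) * (row[c] - mean);
        double sd = std::sqrt(ss / (ncells - 1));
        sds[g] = sd;

        for (std::size_t c = 0; c < ncells; ++c) {
            row[c] -= mean;
            // Assumption: genes with zero variance are left at zero
            // instead of being divided by zero.
            if (sd > 0) row[c] /= sd;
        }
    }
    return sds;
}
```

After this transformation every gene with nonzero variance contributes equally to the total variance, so highly expressed genes no longer dominate the leading PCs.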

◆ set_transpose()

SimplePca & scran::SimplePca::set_transpose ( bool  t = Defaults::transpose)
inline
Parameters
t  Should the PC matrix be transposed on output? If true, the output matrix is column-major with cells in the columns, which is compatible with downstream libscran steps.
Returns
A reference to this SimplePca instance.
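
The transposed layout can be pictured with the indexing sketch below. It assumes the PC values are available as a flat column-major buffer with one column per cell; that flat representation and the helper names `pc_offset` and `cell_coordinates` are assumptions made for illustration, not the actual storage type of the Results.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of (pc, cell) in a column-major buffer with npcs rows and one
// column per cell.
inline std::size_t pc_offset(std::size_t pc, std::size_t cell, std::size_t npcs) {
    assert(pc < npcs);
    return pc + cell * npcs;
}

// All coordinates of one cell are contiguous in this layout, so they can be
// copied out as a single slice.
std::vector<double> cell_coordinates(const std::vector<double>& pcs, std::size_t npcs, std::size_t cell) {
    assert((cell + 1) * npcs <= pcs.size());
    auto start = pcs.begin() + static_cast<std::ptrdiff_t>(cell * npcs);
    return std::vector<double>(start, start + static_cast<std::ptrdiff_t>(npcs));
}
```

Keeping each cell's coordinates contiguous suits per-cell operations such as distance calculations, which is one plausible reason for downstream steps to prefer this layout.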

◆ set_return_rotation()

SimplePca & scran::SimplePca::set_return_rotation ( bool  r = Defaults::return_rotation)
inline
Parameters
r  Should the rotation matrix be returned in the output?
Returns
A reference to this SimplePca instance.

◆ set_return_center()

SimplePca & scran::SimplePca::set_return_center ( bool  r = Defaults::return_center)
inline
Parameters
r  Should the center vector be returned in the output?
Returns
A reference to this SimplePca instance.

◆ set_return_scale()

SimplePca & scran::SimplePca::set_return_scale ( bool  r = Defaults::return_scale)
inline
Parameters
r  Should the scale vector be returned in the output?
Returns
A reference to this SimplePca instance.

◆ set_num_threads()

SimplePca & scran::SimplePca::set_num_threads ( int  n = Defaults::num_threads)
inline
Parameters
n  Number of threads to use.
Returns
A reference to this SimplePca instance.
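
Because every setter returns a reference to the instance, calls can be chained. The sketch below shows the pattern with a hypothetical stand-in class, `PcaSettings`; it is not scran::SimplePca, and its default values are placeholders rather than the values in Defaults.

```cpp
// Hypothetical stand-in used only to illustrate the chaining pattern.
class PcaSettings {
public:
    // Each setter stores its argument and returns *this so that further
    // setters can be called on the same object in one expression.
    PcaSettings& set_rank(int r = 10) {
        rank_ = r;
        return *this;
    }

    PcaSettings& set_scale(bool s = false) {
        scale_ = s;
        return *this;
    }

    int rank() const { return rank_; }
    bool scale() const { return scale_; }

private:
    int rank_ = 10;      // placeholder default
    bool scale_ = false; // placeholder default
};
```

With this pattern, `settings.set_rank(20).set_scale(true)` configures both options in one statement, and calling a setter with no argument restores that option's default.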

◆ run() [1/2]

template<typename T , typename IDX >
Results scran::SimplePca::run ( const tatami::Matrix< T, IDX > *  mat) const
inline

Run PCA on an input gene-by-cell matrix.

Template Parameters
T  Floating point type for the data.
IDX  Integer type for the indices.
Parameters
[in]  mat  Pointer to the input matrix. Columns should contain cells while rows should contain genes.
Returns
A Results object containing the PCs and the variance explained.

◆ run() [2/2]

template<typename T , typename IDX , typename X >
Results scran::SimplePca::run ( const tatami::Matrix< T, IDX > *  mat,
const X *  features 
) const
inline

Run PCA on an input gene-by-cell matrix after filtering for genes of interest. We typically use the set of highly variable genes from ChooseHVGs, with the aim being to improve computational efficiency and avoid random noise by removing lowly variable genes.

Template Parameters
T  Floating point type for the data.
IDX  Integer type for the indices.
X  Integer type for the feature filter.
Parameters
[in]  mat  Pointer to the input matrix. Columns should contain cells while rows should contain genes.
[in]  features  Pointer to an array of length equal to the number of genes. Each entry is treated as a boolean specifying whether the corresponding gene should be used in the PCA.
Returns
A Results object containing the PCs and the variance explained.
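
The feature filter can be understood through the sketch below, which treats any nonzero entry as "use this gene" and extracts the matching rows from a dense row-major matrix. The helpers `selected_genes` and `subset_genes` are hypothetical and only illustrate the filtering semantics; they do not describe how the library subsets a tatami::Matrix internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: indices of the genes whose filter entry is truthy.
template<typename X>
std::vector<std::size_t> selected_genes(const X* features, std::size_t ngenes) {
    std::vector<std::size_t> keep;
    for (std::size_t g = 0; g < ngenes; ++g) {
        if (features[g]) keep.push_back(g); // any nonzero value counts as true
    }
    return keep;
}

// Hypothetical helper: copies the selected rows of a dense row-major matrix
// (genes in rows, cells in columns) into a new, smaller matrix.
template<typename X>
std::vector<double> subset_genes(const std::vector<double>& x, std::size_t ngenes, std::size_t ncells, const X* features) {
    assert(x.size() == ngenes * ncells);
    std::vector<double> out;
    for (std::size_t g : selected_genes(features, ngenes)) {
        auto first = x.begin() + static_cast<std::ptrdiff_t>(g * ncells);
        out.insert(out.end(), first, first + static_cast<std::ptrdiff_t>(ncells));
    }
    return out;
}
```

The filter array always has one entry per gene in the original matrix, so the same array can be reused for any step that needs to know which genes were retained.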

The documentation for this class was generated from the following file: