scran
C++ library for basic single-cell RNA-seq analyses
scran::LogNormCounts Class Reference

Compute log-normalized expression values.

#include <LogNormCounts.hpp>

Classes

struct  Defaults
 Default parameter settings.
 

Public Member Functions

LogNormCounts & set_pseudo_count (double p=Defaults::pseudo_count)
 
LogNormCounts & set_sparse_addition (bool a=Defaults::sparse_addition)
 
LogNormCounts & set_center (bool c=Defaults::center)
 
LogNormCounts & set_block_mode (CenterSizeFactors::BlockMode b=CenterSizeFactors::Defaults::block_mode)
 
LogNormCounts & set_handle_zeros (bool z=Defaults::handle_zeros)
 
LogNormCounts & set_handle_non_finite (bool n=Defaults::handle_non_finite)
 
LogNormCounts & set_num_threads (int n=Defaults::num_threads)
 
LogNormCounts & set_choose_pseudo_count (bool c=Defaults::choose_pseudo_count)
 
LogNormCounts & set_max_bias (double m=ChoosePseudoCount::Defaults::max_bias)
 
LogNormCounts & set_quantile (double q=ChoosePseudoCount::Defaults::quantile)
 
LogNormCounts & set_min_value (double m=ChoosePseudoCount::Defaults::min_value)
 
template<class MAT , class V >
std::shared_ptr< MAT > run (std::shared_ptr< MAT > mat, V size_factors) const
 
template<class MAT , class V , typename B >
std::shared_ptr< MAT > run_blocked (std::shared_ptr< MAT > mat, V size_factors, const B *block) const
 
template<class MAT >
std::shared_ptr< MAT > run (std::shared_ptr< MAT > mat) const
 
template<class MAT , typename B >
std::shared_ptr< MAT > run_blocked (std::shared_ptr< MAT > mat, const B *block) const
 

Detailed Description

Compute log-normalized expression values.

Given a count matrix and a set of size factors, compute log-transformed normalized expression values. Each cell's counts are divided by the cell's size factor, to account for differences in capture efficiency and sequencing depth across cells. The normalized values are then log-transformed so that downstream analyses focus on the relative rather than absolute differences in expression; this process also provides some measure of variance stabilization. These operations are done in a delayed manner using the DelayedIsometricOp class from the tatami library.
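The arithmetic described above can be sketched without tatami as an eager loop over a dense matrix. This is only an illustration of the calculation, not the delayed implementation used by this class: log_normalize() is a hypothetical helper, the column-major layout is chosen for convenience, and the base-2 logarithm is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran: eagerly log-normalizes a dense
// column-major count matrix (features in rows, cells in columns).
std::vector<double> log_normalize(const std::vector<double>& counts,
                                  std::size_t num_features,
                                  const std::vector<double>& size_factors,
                                  double pseudo_count = 1.0) {
    assert(pseudo_count > 0);
    assert(counts.size() == num_features * size_factors.size());

    std::vector<double> output(counts.size());
    for (std::size_t c = 0; c < size_factors.size(); ++c) {
        assert(size_factors[c] > 0);
        for (std::size_t r = 0; r < num_features; ++r) {
            std::size_t i = c * num_features + r;
            // Divide by the cell's size factor, add the pseudo-count, take the log.
            // Base 2 is an assumption made for this sketch.
            output[i] = std::log2(counts[i] / size_factors[c] + pseudo_count);
        }
    }
    return output;
}
```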

Member Function Documentation

◆ set_pseudo_count()

LogNormCounts & scran::LogNormCounts::set_pseudo_count ( double  p = Defaults::pseudo_count)
inline

Set the pseudo-count to add to the normalized expression values prior to the log-transformation. Larger pseudo-counts will shrink the log-expression values towards zero such that the dataset variance is driven more by high-abundance genes; this is occasionally useful to mitigate biases introduced by log-expression at low counts. See also set_choose_pseudo_count().

Parameters
p  Pseudo-count; should be a positive number.
Returns
A reference to this LogNormCounts object.
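The shrinkage effect of a larger pseudo-count can be seen by comparing the log-difference between two normalized values under different pseudo-counts. The helper below is hypothetical and assumes a base-2 logarithm for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: difference in log2-expression between two normalized
// values after the same pseudo-count has been added to both.
double log_difference(double a, double b, double pseudo_count) {
    assert(pseudo_count > 0);
    return std::log2(a + pseudo_count) - std::log2(b + pseudo_count);
}
```

Under these assumptions, log_difference(1, 0, 1) is 1 whereas log_difference(1, 0, 4) is about 0.32, so the low-abundance difference shrinks substantially. For 200 versus 100, the same change in pseudo-count only moves the difference from roughly 0.99 to roughly 0.97.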

◆ set_sparse_addition()

LogNormCounts & scran::LogNormCounts::set_sparse_addition ( bool  a = Defaults::sparse_addition)
inline

Naive addition of a non-unity pseudo-count will break sparsity, as zeros no longer map to zero after the log-transformation. This can be avoided by instead dividing the normalized expression values by the pseudo-count and then applying the usual log1p transformation. However, the resulting values are shifted by the log of the pseudo-count and so cannot be interpreted on the scale of log-counts.

Parameters
a  Whether to use an effective pseudo-count that avoids breaking sparsity.
Returns
A reference to this LogNormCounts object.
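The relationship between the two approaches follows from the identity log(x/p + 1) = log(x + p) - log(p). A minimal sketch, using hypothetical function names and assuming a base-2 logarithm:

```cpp
#include <cassert>
#include <cmath>

// Naive approach: add the pseudo-count directly. A zero input maps to
// log2(pseudo_count), which is non-zero unless the pseudo-count is 1.
double log_naive(double normalized, double pseudo_count) {
    assert(pseudo_count > 0);
    return std::log2(normalized + pseudo_count);
}

// Sparsity-preserving approach: divide by the pseudo-count, then apply log1p.
// A zero input always maps to zero.
double log_sparse(double normalized, double pseudo_count) {
    assert(pseudo_count > 0);
    return std::log1p(normalized / pseudo_count) / std::log(2.0);
}
```

The two results differ by the constant log2(pseudo_count), which is why the sparsity-preserving values are no longer on the log-count scale.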

◆ set_center()

LogNormCounts & scran::LogNormCounts::set_center ( bool  c = Defaults::center)
inline

Specify whether to center the size factors in run(). If true, we center the size factors across cells so that their average is equal to 1; this ensures that the normalized values can still be interpreted on the same scale as the input counts.

If false, no further centering is performed. This is more efficient when size factors are already centered; it may also be useful for re-using this class to compute other normalized values like log-CPMs.

Parameters
c  Whether to center the size factors.
Returns
A reference to this LogNormCounts object.
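Centering amounts to dividing every size factor by the mean. The sketch below uses a hypothetical helper and assumes that all size factors are positive and finite, so it does not reproduce the special handling of zeros or non-finite values.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: scales the size factors in place so that their mean is 1.
// Assumes all values are positive and finite.
void center_size_factors(std::vector<double>& size_factors) {
    if (size_factors.empty()) {
        return;
    }
    double mean = std::accumulate(size_factors.begin(), size_factors.end(), 0.0)
        / static_cast<double>(size_factors.size());
    assert(mean > 0);
    for (auto& s : size_factors) {
        s /= mean;
    }
}
```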

◆ set_block_mode()

LogNormCounts & scran::LogNormCounts::set_block_mode ( CenterSizeFactors::BlockMode  b = CenterSizeFactors::Defaults::block_mode)
inline
Parameters
b  Blocking mode; see CenterSizeFactors::set_block_mode() for details.
Returns
A reference to this LogNormCounts object.

◆ set_handle_zeros()

LogNormCounts & scran::LogNormCounts::set_handle_zeros ( bool  z = Defaults::handle_zeros)
inline

Specify whether to handle zero size factors. If false, size factors of zero will raise an error; otherwise, they will be automatically set to the smallest non-zero size factor after centering (or 1, if all size factors are zero). Setting this to true ensures that any all-zero cells are represented by all-zero columns in the normalized matrix, which is a reasonable outcome if those cells cannot be filtered out during upstream quality control. Note that the centering process ignores zeros, see CenterSizeFactors::set_ignore_zeros() for more details.

Parameters
z  Whether to replace zero size factors with the smallest non-zero size factor.
Returns
A reference to this LogNormCounts object.
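The replacement rule can be sketched as follows. The helper name is hypothetical, and the sketch assumes that the input has already been centered and contains only finite, non-negative values.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: replaces zero size factors in place with the smallest
// non-zero size factor, or with 1 if every size factor is zero.
void replace_zero_size_factors(std::vector<double>& size_factors) {
    double smallest = 1;  // fallback when all size factors are zero
    bool found = false;
    for (double s : size_factors) {
        assert(s >= 0);
        if (s > 0 && (!found || s < smallest)) {
            smallest = s;
            found = true;
        }
    }
    for (auto& s : size_factors) {
        if (s == 0) {
            s = smallest;
        }
    }
}
```

An all-zero cell divided by any positive replacement value remains all-zero, which is how such cells end up as all-zero columns in the normalized matrix.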

◆ set_handle_non_finite()

LogNormCounts & scran::LogNormCounts::set_handle_non_finite ( bool  n = Defaults::handle_non_finite)
inline

Specify whether to handle non-finite size factors. If false, non-finite size factors will raise an error. Otherwise, size factors of infinity will be automatically set to the largest finite size factor after centering (or 1, if all size factors are non-finite). Missing (i.e., NaN) size factors will be automatically set to 1 so that scaling is a no-op. Note that the centering process ignores non-finite factors, see CenterSizeFactors for more details.

Parameters
n  Whether to replace non-finite size factors: infinities with the largest finite size factor, and NaNs with 1.
Returns
A reference to this LogNormCounts object.
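A minimal sketch of the replacement rules described above. The helper name is hypothetical, and the sketch assumes that centering has already been applied and that no size factor is negative.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: replaces infinite size factors in place with the largest
// finite size factor (or 1 if none is finite), and NaN size factors with 1.
void replace_non_finite_size_factors(std::vector<double>& size_factors) {
    double largest = 1;  // fallback when no size factor is finite
    bool found = false;
    for (double s : size_factors) {
        assert(!(s < 0));  // negative values are outside the scope of this sketch
        if (std::isfinite(s) && (!found || s > largest)) {
            largest = s;
            found = true;
        }
    }
    for (auto& s : size_factors) {
        if (std::isnan(s)) {
            s = 1;  // scaling by 1 is a no-op
        } else if (std::isinf(s)) {
            s = largest;
        }
    }
}
```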

◆ set_num_threads()

LogNormCounts & scran::LogNormCounts::set_num_threads ( int  n = Defaults::num_threads)
inline
Parameters
n  Number of threads to use.
Returns
A reference to this LogNormCounts object.

Parallelization is only performed to compute size factors, so this method only has an effect if size_factors are not passed to run().

◆ set_choose_pseudo_count()

LogNormCounts & scran::LogNormCounts::set_choose_pseudo_count ( bool  c = Defaults::choose_pseudo_count)
inline
Parameters
c  Whether to automatically choose an appropriate pseudo-count based on the (centered) size factors. See ChoosePseudoCount for details.
Returns
A reference to this LogNormCounts object.

◆ set_max_bias()

LogNormCounts & scran::LogNormCounts::set_max_bias ( double  m = ChoosePseudoCount::Defaults::max_bias)
inline
Parameters
m  See ChoosePseudoCount::set_max_bias() for details.
Returns
A reference to this LogNormCounts object.

◆ set_quantile()

LogNormCounts & scran::LogNormCounts::set_quantile ( double  q = ChoosePseudoCount::Defaults::quantile)
inline
Parameters
q  See ChoosePseudoCount::set_quantile() for details.
Returns
A reference to this LogNormCounts object.

◆ set_min_value()

LogNormCounts & scran::LogNormCounts::set_min_value ( double  m = ChoosePseudoCount::Defaults::min_value)
inline
Parameters
m  See ChoosePseudoCount::set_min_value() for details.
Returns
A reference to this LogNormCounts object.
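Because each setter returns a reference to the object, calls can be chained. The pattern is sketched below with a hypothetical stand-in class; SketchNormalizer and its default values are assumptions for illustration and do not reflect the actual members or defaults of LogNormCounts.

```cpp
#include <cassert>

// Hypothetical stand-in that mimics the setter convention on this page.
// The default values here are illustrative only.
struct SketchNormalizer {
    double pseudo_count = 1;
    bool center = true;
    int num_threads = 1;

    SketchNormalizer& set_pseudo_count(double p) {
        assert(p > 0);
        pseudo_count = p;
        return *this;  // returning *this is what enables chaining
    }

    SketchNormalizer& set_center(bool c) {
        center = c;
        return *this;
    }

    SketchNormalizer& set_num_threads(int n) {
        num_threads = n;
        return *this;
    }
};
```

With this convention, a configuration such as `normalizer.set_pseudo_count(2).set_center(false).set_num_threads(4)` can be written as a single expression.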

◆ run() [1/2]

template<class MAT , class V >
std::shared_ptr< MAT > scran::LogNormCounts::run ( std::shared_ptr< MAT >  mat,
V  size_factors 
) const
inline

Compute log-normalized expression values from an input matrix. To avoid copying the data, this is done in a delayed manner using the DelayedIsometricOp class from the tatami package.

Template Parameters
MAT  A tatami matrix class, most typically a tatami::NumericMatrix.
V  A vector class supporting size(), random access via operator[], begin(), end() and data().
Parameters
mat  Pointer to an input count matrix, with features in the rows and cells in the columns.
size_factors  A vector of positive size factors, of length equal to the number of columns in mat.
Returns
A pointer to a matrix of log-transformed and normalized values.

◆ run_blocked() [1/2]

template<class MAT , class V , typename B >
std::shared_ptr< MAT > scran::LogNormCounts::run_blocked ( std::shared_ptr< MAT >  mat,
V  size_factors,
const B *  block 
) const
inline

Compute log-normalized expression values from an input matrix with blocking. Specifically, centering of size factors is performed within each block. This allows users to easily mimic normalization of different blocks of cells (e.g., from different samples) in the same matrix.

Template Parameters
MAT  A tatami matrix class, most typically a tatami::NumericMatrix.
V  A vector class supporting size(), random access via operator[], begin(), end() and data().
B  An integer type, to hold the block IDs.
Parameters
mat  Pointer to an input count matrix, with features in the rows and cells in the columns.
size_factors  A vector of size factors, of length equal to the number of columns in mat.
[in] block  Pointer to an array of block identifiers. If provided, the array should be of length equal to the number of columns in mat. Values should be integer IDs in $[0, N)$ where $N$ is the number of blocks. This can also be NULL, in which case all cells are assumed to belong to the same block.
Returns
A pointer to a matrix of log-transformed and normalized values.
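Centering within blocks can be sketched as below, where each block's size factors are scaled to have a mean of 1. This is only one possible scheme; the behavior of this class depends on the CenterSizeFactors::BlockMode setting, which is not described on this page. The helper name is hypothetical, and all size factors are assumed to be positive and finite.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: scales size factors in place so that the mean within each
// block is 1. A null 'block' pointer places all cells in a single block.
template<typename B>
void center_per_block(std::vector<double>& size_factors, const B* block) {
    std::vector<double> sums;
    std::vector<std::size_t> counts;

    // First pass: accumulate the sum and number of cells for each block.
    for (std::size_t i = 0; i < size_factors.size(); ++i) {
        std::size_t b = (block == nullptr ? 0 : static_cast<std::size_t>(block[i]));
        if (b >= sums.size()) {
            sums.resize(b + 1, 0.0);
            counts.resize(b + 1, 0);
        }
        sums[b] += size_factors[i];
        ++counts[b];
    }

    // Second pass: divide each size factor by the mean of its block.
    for (std::size_t i = 0; i < size_factors.size(); ++i) {
        std::size_t b = (block == nullptr ? 0 : static_cast<std::size_t>(block[i]));
        assert(sums[b] > 0);
        size_factors[i] /= sums[b] / static_cast<double>(counts[b]);
    }
}
```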

◆ run() [2/2]

template<class MAT >
std::shared_ptr< MAT > scran::LogNormCounts::run ( std::shared_ptr< MAT >  mat) const
inline

Compute log-normalized expression values from an input matrix. The size factor for each cell is defined as its total count, i.e., the sum of counts across all features.

Template Parameters
MAT  A tatami matrix class, most typically a tatami::NumericMatrix.
Parameters
mat  Pointer to an input count matrix, with features in the rows and cells in the columns.
Returns
A pointer to a matrix of log-transformed and normalized values.
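Deriving size factors from per-cell totals corresponds to taking column sums of the count matrix. A minimal sketch for a dense column-major matrix, using a hypothetical helper rather than the tatami-based calculation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: computes the total count for each cell (column) of a
// dense column-major matrix with features in rows and cells in columns.
std::vector<double> total_counts(const std::vector<double>& counts,
                                 std::size_t num_features,
                                 std::size_t num_cells) {
    assert(counts.size() == num_features * num_cells);
    std::vector<double> totals(num_cells, 0.0);
    for (std::size_t c = 0; c < num_cells; ++c) {
        for (std::size_t r = 0; r < num_features; ++r) {
            totals[c] += counts[c * num_features + r];
        }
    }
    return totals;
}
```

These totals would typically be centered before use so that the normalized values remain on the scale of the input counts.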

◆ run_blocked() [2/2]

template<class MAT , typename B >
std::shared_ptr< MAT > scran::LogNormCounts::run_blocked ( std::shared_ptr< MAT >  mat,
const B *  block 
) const
inline

Compute log-normalized expression values from an input matrix with blocking; see the run_blocked() overload that accepts size factors for details. The size factor for each cell is defined as its total count, i.e., the sum of counts across all features.

Template Parameters
MAT  A tatami matrix class, most typically a tatami::NumericMatrix.
B  An integer type, to hold the block IDs.
Parameters
mat  Pointer to an input count matrix, with features in the rows and cells in the columns.
[in] block  Pointer to an array of block identifiers, see run_blocked() for details.
Returns
A pointer to a matrix of log-transformed and normalized values.
