Compute log-normalized expression values.

#include <LogNormCounts.hpp>

Classes
struct	Defaults
	Default parameter settings.

Public Member Functions
LogNormCounts &	set_pseudo_count (double p=Defaults::pseudo_count)

LogNormCounts &	set_sparse_addition (bool a=Defaults::sparse_addition)

LogNormCounts &	set_center (bool c=Defaults::center)

LogNormCounts &	set_block_mode (CenterSizeFactors::BlockMode b=CenterSizeFactors::Defaults::block_mode)

LogNormCounts &	set_handle_zeros (bool z=Defaults::handle_zeros)

LogNormCounts &	set_handle_non_finite (bool n=Defaults::handle_non_finite)

LogNormCounts &	set_num_threads (int n=Defaults::num_threads)

LogNormCounts &	set_choose_pseudo_count (bool c=Defaults::choose_pseudo_count)

LogNormCounts &	set_max_bias (double m=ChoosePseudoCount::Defaults::max_bias)

LogNormCounts &	set_quantile (double q=ChoosePseudoCount::Defaults::quantile)

LogNormCounts &	set_min_value (double m=ChoosePseudoCount::Defaults::min_value)

template<class MAT , class V >
std::shared_ptr< MAT >	run (std::shared_ptr< MAT > mat, V size_factors) const

template<class MAT , class V , typename B >
std::shared_ptr< MAT >	run_blocked (std::shared_ptr< MAT > mat, V size_factors, const B *block) const

template<class MAT >
std::shared_ptr< MAT >	run (std::shared_ptr< MAT > mat) const

template<class MAT , typename B >
std::shared_ptr< MAT >	run_blocked (std::shared_ptr< MAT > mat, const B *block) const

Detailed Description

Compute log-normalized expression values.

Given a count matrix and a set of size factors, compute log-transformed normalized expression values. Each cell's counts are divided by the cell's size factor, to account for differences in capture efficiency and sequencing depth across cells. The normalized values are then log-transformed so that downstream analyses focus on the relative rather than absolute differences in expression; this process also provides some measure of variance stabilization. These operations are done in a delayed manner using the DelayedIsometricOp class from the tatami library.
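
The transformation described above can be sketched on a plain dense matrix. This is only an illustration under stated assumptions: `log_normalize_dense` is a hypothetical helper that is not part of scran, it works eagerly on a column-major `std::vector` rather than through a delayed tatami operation, and it assumes a base-2 logarithm.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran: eagerly log-normalize a dense
// column-major count matrix (features in rows, cells in columns).
std::vector<double> log_normalize_dense(
    const std::vector<double>& counts,
    std::size_t num_features,
    const std::vector<double>& size_factors,
    double pseudo_count = 1.0)
{
    assert(counts.size() == num_features * size_factors.size());
    std::vector<double> output(counts.size());
    for (std::size_t c = 0; c < size_factors.size(); ++c) {
        for (std::size_t r = 0; r < num_features; ++r) {
            const std::size_t i = c * num_features + r;
            // Divide by the cell's size factor, add the pseudo-count,
            // then log-transform (base 2 is an assumption of this sketch).
            output[i] = std::log2(counts[i] / size_factors[c] + pseudo_count);
        }
    }
    return output;
}
```

For a 2 x 2 matrix with counts `{0, 3, 2, 14}` and size factors `{1, 2}`, this yields `{0, 2, 1, 3}`.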

Member Function Documentation

◆ set_pseudo_count()

LogNormCounts & scran::LogNormCounts::set_pseudo_count ( double p = Defaults::pseudo_count )

inline

Set the pseudo-count to add to the normalized expression values prior to the log-transformation. Larger pseudo-counts shrink the differences in log-expression between cells towards zero, most strongly at low abundances, so that the dataset variance is driven more by high-abundance genes; this is occasionally useful to mitigate biases introduced by the log-transformation at low counts. See also set_choose_pseudo_count().

Parameters

p	Pseudo-count, should be a positive number.

Returns: A reference to this LogNormCounts object.
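
To make the shrinkage effect concrete, the sketch below compares the log-difference between two normalized values under different pseudo-counts. `log_difference` is a hypothetical helper introduced for illustration, and the base-2 logarithm is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: difference in log2-expression between two normalized
// values after adding the same pseudo-count to both.
double log_difference(double x, double y, double pseudo_count) {
    assert(pseudo_count > 0.0);
    return std::log2(x + pseudo_count) - std::log2(y + pseudo_count);
}
```

Comparing normalized values of 3 and 0, the difference is 2 with a pseudo-count of 1 but only 1 with a pseudo-count of 3. Comparing 300 and 100, the difference barely changes (it stays slightly above 1.5 in both cases), which is why high-abundance genes contribute relatively more to the variance as the pseudo-count grows.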

◆ set_sparse_addition()

LogNormCounts & scran::LogNormCounts::set_sparse_addition ( bool a = Defaults::sparse_addition )

inline

Naive addition of a non-unity pseudo-count will break sparsity, as zeros are no longer mapped to zero after the log-transformation. This can be avoided by instead dividing the normalized expression values by the pseudo-count and then applying the usual log1p transformation. However, the resulting values cannot be interpreted on the scale of log-counts.

Parameters

a	Whether to use an effective pseudo-count that avoids breaking sparsity.

Returns: A reference to this LogNormCounts object.
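
The two strategies differ only by a constant offset, since log(1 + x/p) = log(x + p) - log(p). A minimal sketch, assuming a base-2 logarithm and using the hypothetical helpers `naive_log` and `sparse_log`:

```cpp
#include <cassert>
#include <cmath>

// Naive approach: a zero maps to log2(p), which is non-zero unless p == 1.
double naive_log(double normalized, double pseudo_count) {
    assert(pseudo_count > 0.0);
    return std::log2(normalized + pseudo_count);
}

// Sparsity-preserving approach: divide by the pseudo-count, then apply log1p.
// A zero maps to exactly zero, so sparse storage can be retained.
double sparse_log(double normalized, double pseudo_count) {
    assert(pseudo_count > 0.0);
    return std::log1p(normalized / pseudo_count) / std::log(2.0);
}
```

With a pseudo-count of 4, a zero becomes 2 under the naive approach but stays 0 under the sparsity-preserving one; for any input, the two results differ by log2(4) = 2.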

◆ set_center()

LogNormCounts & scran::LogNormCounts::set_center ( bool c = Defaults::center )

inline

Specify whether to center the size factors in run(). If true, we center the size factors across cells so that their average is equal to 1; this ensures that the normalized values can still be interpreted on the same scale as the input counts.

If false, no further centering is performed. This is more efficient when size factors are already centered; it may also be useful for re-using this class to compute other normalized values like log-CPMs.

Parameters

c	Whether to center the size factors.

Returns: A reference to this LogNormCounts object.
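
Centering amounts to dividing every size factor by their mean. The sketch below uses a hypothetical helper, `center_size_factors`, and assumes that all size factors are positive and finite; it does not reproduce the library's handling of zero or non-finite values.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: scale size factors in place so that their mean is 1.
// Returns the mean of the original size factors.
double center_size_factors(std::vector<double>& size_factors) {
    assert(!size_factors.empty());
    const double total = std::accumulate(size_factors.begin(), size_factors.end(), 0.0);
    const double mean = total / static_cast<double>(size_factors.size());
    assert(mean > 0.0);
    for (double& s : size_factors) {
        s /= mean;
    }
    return mean;
}
```

For example, `{2, 4, 6}` becomes `{0.5, 1, 1.5}`. When computing something like log-CPMs, one would instead supply size factors that are already on the desired scale (for instance, library sizes divided by one million) and skip this step.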

◆ set_block_mode()

LogNormCounts & scran::LogNormCounts::set_block_mode ( CenterSizeFactors::BlockMode b = CenterSizeFactors::Defaults::block_mode )

inline

Parameters

b	Blocking mode, see `CenterSizeFactors::set_block_mode()` for details.

Returns: A reference to this LogNormCounts object.

◆ set_handle_zeros()

LogNormCounts & scran::LogNormCounts::set_handle_zeros ( bool z = Defaults::handle_zeros )

inline

Specify whether to handle zero size factors. If false, size factors of zero will raise an error; otherwise, they will be automatically set to the smallest non-zero size factor after centering (or 1, if all size factors are zero). Setting this to true ensures that any all-zero cells are represented by all-zero columns in the normalized matrix, which is a reasonable outcome if those cells cannot be filtered out during upstream quality control. Note that the centering process ignores zeros, see CenterSizeFactors::set_ignore_zeros() for more details.

Parameters

z	Whether to replace zero size factors with the smallest non-zero size factor.

Returns: A reference to this LogNormCounts object.
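
The replacement rule can be sketched as follows. `replace_zero_size_factors` is a hypothetical helper, not the library's implementation, and it assumes that centering has already been applied to the input.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: replace zero size factors with the smallest positive
// size factor, or with 1 if no positive size factor exists.
void replace_zero_size_factors(std::vector<double>& size_factors) {
    bool found = false;
    double smallest = 1.0; // fallback when every size factor is zero
    for (double s : size_factors) {
        assert(!(s < 0.0)); // negative size factors are not meaningful
        if (s > 0.0 && (!found || s < smallest)) {
            smallest = s;
            found = true;
        }
    }
    for (double& s : size_factors) {
        if (s == 0.0) {
            s = smallest;
        }
    }
}
```

Here `{0, 0.5, 2, 0}` becomes `{0.5, 0.5, 2, 0.5}` and `{0, 0}` becomes `{1, 1}`. An all-zero cell divided by any positive replacement remains an all-zero column.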

◆ set_handle_non_finite()

LogNormCounts & scran::LogNormCounts::set_handle_non_finite ( bool n = Defaults::handle_non_finite )

inline

Specify whether to handle non-finite size factors. If false, non-finite size factors will raise an error. Otherwise, size factors of infinity will be automatically set to the largest finite size factor after centering (or 1, if all size factors are non-finite). Missing (i.e., NaN) size factors will be automatically set to 1 so that scaling is a no-op. Note that the centering process ignores non-finite factors, see CenterSizeFactors for more details.

Parameters

n	Whether to replace infinite size factors with the largest finite size factor, and missing size factors with 1.

Returns: A reference to this LogNormCounts object.
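
A minimal sketch of this replacement rule is shown below. `replace_non_finite_size_factors` is a hypothetical helper, not the library's implementation, and it assumes that centering has already been applied to the input.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: replace infinite size factors with the largest finite
// size factor (or 1 if none is finite), and NaN size factors with 1.
void replace_non_finite_size_factors(std::vector<double>& size_factors) {
    bool found = false;
    double largest = 1.0; // fallback when no size factor is finite
    for (double s : size_factors) {
        assert(!(s < 0.0)); // negative size factors are not meaningful
        if (std::isfinite(s) && (!found || s > largest)) {
            largest = s;
            found = true;
        }
    }
    for (double& s : size_factors) {
        if (std::isnan(s)) {
            s = 1.0; // scaling by 1 is a no-op
        } else if (std::isinf(s)) {
            s = largest;
        }
    }
}
```

Under these rules, `{0.5, Inf, NaN, 3}` becomes `{0.5, 3, 1, 3}`, and `{Inf, NaN}` becomes `{1, 1}`.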

◆ set_num_threads()

LogNormCounts & scran::LogNormCounts::set_num_threads ( int n = Defaults::num_threads )

inline

Parameters

n	Number of threads to use.

Returns: A reference to this LogNormCounts object.

Parallelization is only used when computing size factors, so this setting only has an effect if `size_factors` is not passed to run().

◆ set_choose_pseudo_count()

LogNormCounts & scran::LogNormCounts::set_choose_pseudo_count ( bool c = Defaults::choose_pseudo_count )

inline

Parameters

c	Whether to automatically choose an appropriate pseudo-count based on the (centered) size factors. See `ChoosePseudoCount` for details.

Returns: A reference to this LogNormCounts object.

◆ set_max_bias()

LogNormCounts & scran::LogNormCounts::set_max_bias ( double m = ChoosePseudoCount::Defaults::max_bias )

inline

Parameters

m	See `ChoosePseudoCount::set_max_bias()` for details.

Returns: A reference to this LogNormCounts object.

◆ set_quantile()

LogNormCounts & scran::LogNormCounts::set_quantile ( double q = ChoosePseudoCount::Defaults::quantile )

inline

Parameters

q	See `ChoosePseudoCount::set_quantile()` for details.

Returns: A reference to this LogNormCounts object.

◆ set_min_value()

LogNormCounts & scran::LogNormCounts::set_min_value ( double m = ChoosePseudoCount::Defaults::min_value )

inline

Parameters

m	See `ChoosePseudoCount::set_min_value()` for details.

Returns: A reference to this LogNormCounts object.

◆ run() [1/2]

template<class MAT , class V >

std::shared_ptr< MAT > scran::LogNormCounts::run	(	std::shared_ptr< MAT >	mat,
		V	size_factors
	)		const

inline

Compute log-normalized expression values from an input matrix. To avoid copying the data, this is done in a delayed manner using the DelayedIsometricOp class from the tatami library.

Template Parameters

MAT	A tatami matrix class, most typically a `tatami::NumericMatrix`.
V	A vector class supporting `size()`, random access via `[`, `begin()`, `end()` and `data()`.

Parameters

mat	Pointer to an input count matrix, with features in the rows and cells in the columns.
size_factors	A vector of positive size factors, of length equal to the number of columns in `mat`.

Returns: A pointer to a matrix of log-transformed and normalized values.

◆ run_blocked() [1/2]

template<class MAT , class V , typename B >

std::shared_ptr< MAT > scran::LogNormCounts::run_blocked	(	std::shared_ptr< MAT >	mat,
		V	size_factors,
		const B *	block
	)		const

inline

Compute log-normalized expression values from an input matrix with blocking. Specifically, centering of size factors is performed within each block. This allows users to easily mimic normalization of different blocks of cells (e.g., from different samples) in the same matrix.

Template Parameters

MAT	A tatami matrix class, most typically a `tatami::NumericMatrix`.
V	A vector class supporting `size()`, random access via `[`, `begin()`, `end()` and `data()`.
B	An integer type, to hold the block IDs.

Parameters

	mat	Pointer to an input count matrix, with features in the rows and cells in the columns.
	size_factors	A vector of size factors, of length equal to the number of columns in `mat`.
[in]	block	Pointer to an array of block identifiers. If provided, the array should be of length equal to the number of columns in `mat`. Values should be integer IDs in [0, N), where N is the number of blocks. This can also be `NULL`, in which case all cells are assumed to belong to the same block.

Returns: A pointer to a matrix of log-transformed and normalized values.
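
Centering within each block can be sketched as below. `center_per_block` is a hypothetical helper that scales each block to a mean of 1; the library's actual behavior depends on the blocking mode described in `CenterSizeFactors::set_block_mode()`, which this sketch does not attempt to reproduce. Size factors are assumed to be positive and finite, and block IDs are assumed to be non-negative integers.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: center size factors separately within each block so
// that every block has a mean size factor of 1. A null 'block' pointer means
// that all cells belong to a single block.
template<typename B>
void center_per_block(std::vector<double>& size_factors, const B* block) {
    const std::size_t n = size_factors.size();

    // Infer the number of blocks from the largest ID.
    std::size_t num_blocks = 1;
    if (block != nullptr) {
        for (std::size_t i = 0; i < n; ++i) {
            num_blocks = std::max(num_blocks, static_cast<std::size_t>(block[i]) + 1);
        }
    }

    // Accumulate the sum and number of cells for each block.
    std::vector<double> sums(num_blocks, 0.0);
    std::vector<std::size_t> sizes(num_blocks, 0);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t b = (block != nullptr ? static_cast<std::size_t>(block[i]) : 0);
        sums[b] += size_factors[i];
        ++sizes[b];
    }

    // Divide each size factor by the mean of its own block.
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t b = (block != nullptr ? static_cast<std::size_t>(block[i]) : 0);
        size_factors[i] /= sums[b] / static_cast<double>(sizes[b]);
    }
}
```

With size factors `{1, 3, 10, 30}` and block IDs `{0, 0, 1, 1}`, the block means are 2 and 20, so the result is `{0.5, 1.5, 0.5, 1.5}`.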

◆ run() [2/2]

template<class MAT >

std::shared_ptr< MAT > scran::LogNormCounts::run ( std::shared_ptr< MAT > mat ) const

inline

Compute log-normalized expression values from an input matrix. Size factors are defined as the total count for each cell, i.e., the sum of counts across all features in that cell.

Template Parameters

MAT	A tatami matrix class, most typically a `tatami::NumericMatrix`.

Parameters

mat	Pointer to an input count matrix, with features in the rows and cells in the columns.

Returns: A pointer to a matrix of log-transformed and normalized values.
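
Deriving size factors from total counts is a column sum over the count matrix. The sketch below uses a hypothetical helper, `total_counts_per_cell`, on a dense column-major `std::vector`; it does not use the tatami interface or any parallelization.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: compute the total count for each cell (column) of a
// dense column-major matrix with features in rows and cells in columns.
std::vector<double> total_counts_per_cell(
    const std::vector<double>& counts,
    std::size_t num_features,
    std::size_t num_cells)
{
    assert(counts.size() == num_features * num_cells);
    std::vector<double> totals(num_cells, 0.0);
    for (std::size_t c = 0; c < num_cells; ++c) {
        for (std::size_t r = 0; r < num_features; ++r) {
            totals[c] += counts[c * num_features + r];
        }
    }
    return totals;
}
```

A cell with no counts at all yields a total of zero, which is the situation addressed by set_handle_zeros().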

◆ run_blocked() [2/2]

template<class MAT , typename B >

std::shared_ptr< MAT > scran::LogNormCounts::run_blocked	(	std::shared_ptr< MAT >	mat,
		const B *	block
	)		const

inline

Compute log-normalized expression values from an input matrix with blocking, see the other run_blocked() overload for details. Size factors are defined as the total count for each cell.

Template Parameters

MAT	A tatami matrix class, most typically a `tatami::NumericMatrix`.
B	An integer type, to hold the block IDs.

Parameters

	mat	Pointer to an input count matrix, with features in the rows and cells in the columns.
[in]	block	Pointer to an array of block identifiers, see `run_blocked()` for details.

Returns: A pointer to a matrix of log-transformed and normalized values.

The documentation for this class was generated from the following file:

scran/normalization/LogNormCounts.hpp
