Compute median-based size factors to handle composition bias.

#include <MedianSizeFactors.hpp>

Classes
struct	Defaults
	Default parameter settings.

struct	Results
	Result of the size factor calculation.

Public Member Functions
MedianSizeFactors &	set_center (bool c=Defaults::center)

MedianSizeFactors &	set_prior_count (double p=Defaults::prior_count)

MedianSizeFactors &	set_num_threads (int n=Defaults::num_threads)

template<typename T , typename IDX , typename Ref , typename Out >
void	run (const tatami::Matrix< T, IDX > *mat, const Ref *ref, Out *output) const

template<typename T , typename IDX , typename Out >
void	run_with_mean (const tatami::Matrix< T, IDX > *mat, Out *output) const

template<typename Out = double, typename T , typename IDX , typename Ref >
Results< Out >	run (const tatami::Matrix< T, IDX > *mat, const Ref *ref) const

template<typename Out = double, typename T , typename IDX >
Results< Out >	run_with_mean (const tatami::Matrix< T, IDX > *mat) const

Detailed Description

Compute median-based size factors to handle composition bias.

This is roughly equivalent to the DESeq2-based approach where the size factor for each library is defined as the median ratio against a reference profile. The aim is to account for composition biases from differential expression between libraries, which would not be handled properly by library size normalization. The main differences from DESeq2 are:

The row means are used as the default reference, instead of the geometric mean. This avoids problems with reference values of zero in sparse data.
The median-based size factors are slightly shrunk towards the library size-derived factors. This ensures that the reported factors are never zero.

In practice, this approach tends to work poorly on actual single-cell data because of its sparsity. Nonetheless, we provide it here because it can be helpful for removing composition biases between clusters based on their averaged pseudo-bulk profiles.
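The median-ratio idea described above can be sketched without the library. The following is a minimal standalone illustration, not the class's implementation: the function names (`median_of`, `row_means`, `median_ratio`), the vector-of-columns layout, and the choice to skip genes whose reference value is zero are all assumptions made for the example. No shrinkage or centering is applied here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Median of a non-empty vector (taken by value because nth_element reorders it).
double median_of(std::vector<double> x) {
    assert(!x.empty());
    std::size_t mid = x.size() / 2;
    std::nth_element(x.begin(), x.begin() + mid, x.end());
    double upper = x[mid];
    if (x.size() % 2 == 1) {
        return upper;
    }
    // Even length: average the two middle values.
    double lower = *std::max_element(x.begin(), x.begin() + mid);
    return (lower + upper) / 2;
}

// Row means of a matrix stored as a vector of columns of equal length.
std::vector<double> row_means(const std::vector<std::vector<double>>& columns) {
    assert(!columns.empty());
    std::vector<double> means(columns.front().size(), 0.0);
    for (const auto& col : columns) {
        assert(col.size() == means.size());
        for (std::size_t g = 0; g < col.size(); ++g) {
            means[g] += col[g];
        }
    }
    for (auto& m : means) {
        m /= columns.size();
    }
    return means;
}

// Unshrunk median-ratio size factor of one column against a reference profile.
// Assumption: genes with a zero reference value are skipped, as their ratios
// are undefined.
double median_ratio(const std::vector<double>& column, const std::vector<double>& ref) {
    assert(column.size() == ref.size());
    std::vector<double> ratios;
    for (std::size_t g = 0; g < ref.size(); ++g) {
        if (ref[g] > 0) {
            ratios.push_back(column[g] / ref[g]);
        }
    }
    return median_of(ratios);
}
```

For a column `{10, 20, 30, 1000}` and a reference `{10, 20, 30, 40}`, the median ratio is 1 even though the column's total count is 10.6 times that of the reference. The single strongly upregulated gene inflates the library size but leaves the median untouched, which is the composition bias that this approach is meant to avoid.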

Member Function Documentation

◆ set_center()

MedianSizeFactors & scran::MedianSizeFactors::set_center ( bool c = Defaults::center )

inline

Parameters

c	Whether to center the size factors to have a mean of unity. This is usually desirable for interpretation of relative values.

Returns: A reference to this MedianSizeFactors object.

For more control over centering, this can be set to false and the resulting size factors can be passed to CenterSizeFactors.
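As a rough picture of what centering to a mean of unity involves, the sketch below rescales a vector of size factors in place. The helper name `center_to_unit_mean` is hypothetical and is not part of this class or of CenterSizeFactors, which may handle edge cases differently.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Scale size factors in place so that their mean is 1.
// Ratios between factors are preserved; only the overall scale changes.
void center_to_unit_mean(std::vector<double>& factors) {
    assert(!factors.empty());
    double mean = std::accumulate(factors.begin(), factors.end(), 0.0) / factors.size();
    assert(mean > 0);
    for (auto& f : factors) {
        f /= mean;
    }
}
```

Applied to `{1, 2, 3}`, this yields `{0.5, 1, 1.5}`: the relative differences between columns are unchanged, and normalized values remain on roughly the same scale as the original counts.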

◆ set_prior_count()

MedianSizeFactors & scran::MedianSizeFactors::set_prior_count ( double p = Defaults::prior_count )

inline

Parameters

p	Prior count to use for shrinking median-based size factors towards their library size-based counterparts. Larger values result in more shrinkage, while a value of zero will disable shrinkage altogether.

Returns: A reference to this MedianSizeFactors object.

When using shrinkage, we add a scaled version of the reference profile to each expression profile before computing the ratios. The scaling factor differs between profiles and is proportional to the (relative) total count of each profile. This implicitly pushes the median-based size factor towards a value that is proportional to the library size of the profile, because the median ratio of a scaled copy of the reference against the reference itself is just the scaling factor, i.e., the relative library size.

The amount of shrinkage depends on the magnitude of the reference scaling. The prior count should be interpreted as the number of extra reads from the reference profile that are added to each profile. For example, the default of 10 means that the equivalent of 10 reads is added to each profile, distributed according to the reference profile. Increasing the prior count strengthens the shrinkage, as the reference profile makes a greater contribution to the ratios.
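One way to realize the shrinkage described above is sketched below. This is an illustration under stated assumptions, not the library's exact arithmetic: the function name `shrunken_median_ratio` is hypothetical, the relative library size is taken to be the column total divided by the reference total, and the reference is also augmented by its own prior so that the result is a weighted average of the raw median ratio and the library size factor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Shrunken median-ratio size factor for one column (illustrative arithmetic only).
//   column:      counts for one profile
//   ref:         reference profile of the same length
//   prior_count: number of pseudo-reads drawn from the reference
double shrunken_median_ratio(const std::vector<double>& column,
                             const std::vector<double>& ref,
                             double prior_count) {
    assert(column.size() == ref.size());
    double col_total = std::accumulate(column.begin(), column.end(), 0.0);
    double ref_total = std::accumulate(ref.begin(), ref.end(), 0.0);
    assert(ref_total > 0);

    // Library size of this column relative to the reference (assumed definition).
    double lib_factor = col_total / ref_total;
    // Fraction of the reference that amounts to 'prior_count' reads.
    double w = prior_count / ref_total;

    std::vector<double> ratios;
    for (std::size_t g = 0; g < ref.size(); ++g) {
        if (ref[g] > 0) {
            double num = column[g] + w * lib_factor * ref[g]; // column plus scaled reference
            double den = ref[g] * (1 + w);                    // reference plus its own prior (assumption)
            ratios.push_back(num / den);
        }
    }
    assert(!ratios.empty());

    // Median: for even length, average the two middle values.
    std::sort(ratios.begin(), ratios.end());
    std::size_t n = ratios.size();
    return (n % 2 == 1) ? ratios[n / 2] : (ratios[n / 2 - 1] + ratios[n / 2]) / 2;
}
```

Under these assumptions each ratio equals `(column[g] / ref[g] + w * lib_factor) / (1 + w)`, so the median is pulled towards `lib_factor` as `w` grows. For a column `{0, 0, 0, 100}` against a reference `{25, 25, 25, 25}`, a prior count of 0 gives a size factor of exactly 0, a prior count of 10 gives about 0.09, and a prior count of 100 gives 0.5, moving towards the library size factor of 1.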

◆ set_num_threads()

MedianSizeFactors & scran::MedianSizeFactors::set_num_threads ( int n = Defaults::num_threads )

inline

Parameters

n	Number of threads to use.

Returns: A reference to this MedianSizeFactors object.
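Because each setter returns a reference to the object, calls can be chained. The sketch below demonstrates the pattern on a hypothetical stand-in, `SketchSizeFactors`, that mimics only the setter signatures documented here so that the example compiles without the library. Apart from the prior count of 10, the default values shown are placeholders rather than the library's actual defaults.

```cpp
#include <cassert>

// Hypothetical stand-in that mimics only the documented setter signatures.
struct SketchSizeFactors {
    bool center = true;       // placeholder default, not taken from the library
    double prior_count = 10;  // default stated in the set_prior_count() notes
    int num_threads = 1;      // placeholder default, not taken from the library

    SketchSizeFactors& set_center(bool c) {
        center = c;
        return *this;
    }

    SketchSizeFactors& set_prior_count(double p) {
        assert(p >= 0);
        prior_count = p;
        return *this;
    }

    SketchSizeFactors& set_num_threads(int n) {
        assert(n > 0);
        num_threads = n;
        return *this;
    }
};

// Configure all three settings in a single chained expression.
inline SketchSizeFactors configured_sketch() {
    SketchSizeFactors runner;
    runner.set_center(false).set_prior_count(5).set_num_threads(4);
    return runner;
}
```

The same chaining style should apply to a MedianSizeFactors object, given that its setters are documented as returning a reference to the object.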

◆ run() [1/2]

template<typename T , typename IDX , typename Ref , typename Out >

void scran::MedianSizeFactors::run	(	const tatami::Matrix< T, IDX > *	mat,
		const Ref *	ref,
		Out *	output
	)		const

inline

Compute per-column size factors against a user-supplied reference profile.

Template Parameters

T	Numeric data type of the input matrix.
IDX	Integer index type of the input matrix.
Ref	Numeric data type of the reference profile.
Out	Numeric data type of the output vector.

Parameters

	mat	Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[in]	ref	Pointer to an array containing the reference expression profile to normalize against. This should be of length equal to the number of rows in `mat` and should contain non-negative values.
[out]	output	Pointer to an array to use to store the output size factors. This should be of length equal to the number of columns in `mat`.

◆ run_with_mean() [1/2]

template<typename T , typename IDX , typename Out >

void scran::MedianSizeFactors::run_with_mean	(	const tatami::Matrix< T, IDX > *	mat,
		Out *	output
	)		const

inline

Compute per-column size factors against an average pseudo-sample constructed from the row means of the input matrix.

Template Parameters

T	Numeric data type of the input matrix.
IDX	Integer index type of the input matrix.
Out	Numeric data type of the output vector.

Parameters

	mat	Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[out]	output	Pointer to an array to use to store the output size factors. This should be of length equal to the number of columns in `mat`.

◆ run() [2/2]

template<typename Out = double, typename T , typename IDX , typename Ref >

Results< Out > scran::MedianSizeFactors::run	(	const tatami::Matrix< T, IDX > *	mat,
		const Ref *	ref
	)		const

inline

Compute per-column size factors against a user-supplied reference profile.

Template Parameters

Out	Numeric type for the size factors.
T	Numeric data type of the input matrix.
IDX	Integer index type of the input matrix.
Ref	Numeric data type of the reference profile.

Parameters

	mat	Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[in]	ref	Pointer to an array containing the reference expression profile to normalize against. This should be of length equal to the number of rows in `mat` and should contain non-negative values.

Returns: A Results object containing the size factors for each column in `mat`.

◆ run_with_mean() [2/2]

template<typename Out = double, typename T , typename IDX >

Results< Out > scran::MedianSizeFactors::run_with_mean ( const tatami::Matrix< T, IDX > * mat ) const

inline

Compute per-column size factors against an average pseudo-sample constructed from the row means of the input matrix.

Template Parameters

Out	Numeric type for the size factors.
T	Numeric data type of the input matrix.
IDX	Integer index type of the input matrix.

Parameters

mat	Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.

Returns: A Results object containing the size factors for each column in `mat`.

The documentation for this class was generated from the following file:

scran/normalization/MedianSizeFactors.hpp
