scran
C++ library for basic single-cell RNA-seq analyses
scran::MedianSizeFactors Class Reference

Compute median-based size factors to handle composition bias.

#include <MedianSizeFactors.hpp>

Classes

struct  Defaults
 Default parameter settings.
 
struct  Results
 Result of the size factor calculation.
 

Public Member Functions

MedianSizeFactors & set_center (bool c=Defaults::center)
 
MedianSizeFactors & set_prior_count (double p=Defaults::prior_count)
 
MedianSizeFactors & set_num_threads (int n=Defaults::num_threads)
 
template<typename T , typename IDX , typename Ref , typename Out >
void run (const tatami::Matrix< T, IDX > *mat, const Ref *ref, Out *output) const
 
template<typename T , typename IDX , typename Out >
void run_with_mean (const tatami::Matrix< T, IDX > *mat, Out *output) const
 
template<typename Out = double, typename T , typename IDX , typename Ref >
Results< Out > run (const tatami::Matrix< T, IDX > *mat, const Ref *ref) const
 
template<typename Out = double, typename T , typename IDX >
Results< Out > run_with_mean (const tatami::Matrix< T, IDX > *mat) const
 

Detailed Description

Compute median-based size factors to handle composition bias.

This is roughly equivalent to the DESeq2-based approach where the size factor for each library is defined as the median ratio against a reference profile. The aim is to account for composition biases from differential expression between libraries, which would not be handled properly by library size normalization. The main differences from DESeq2 are:

In practice, this tends to work poorly for actual single-cell data due to its sparsity. Nonetheless, we provide it here because it can be helpful for removing composition biases between clusters based on their averaged pseudo-bulk profiles.
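The median-ratio idea described above can be sketched without the library. The following is a minimal illustration, not scran's implementation: the names median_of and median_ratio are hypothetical, profiles are plain std::vector<double> rather than tatami::Matrix columns, and skipping genes with a zero reference value is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a vector (taken by value so the caller's copy is untouched).
// Returns 0 for an empty input.
inline double median_of(std::vector<double> v) {
    if (v.empty()) {
        return 0.0;
    }
    auto mid = v.begin() + v.size() / 2;
    std::nth_element(v.begin(), mid, v.end());
    double upper = *mid;
    if (v.size() % 2 == 1) {
        return upper;
    }
    // Even length: average the two middle values.
    double lower = *std::max_element(v.begin(), mid);
    return (lower + upper) / 2.0;
}

// Size factor of one profile: median over genes of count / reference.
// Genes with a zero reference value are skipped (assumption of this sketch).
inline double median_ratio(const std::vector<double>& profile,
                           const std::vector<double>& ref) {
    assert(profile.size() == ref.size());
    std::vector<double> ratios;
    ratios.reserve(profile.size());
    for (std::size_t g = 0; g < profile.size(); ++g) {
        if (ref[g] > 0) {
            ratios.push_back(profile[g] / ref[g]);
        }
    }
    return median_of(std::move(ratios));
}
```

For example, a profile {2, 4, 6, 100} against a reference {1, 2, 3, 4} has ratios 2, 2, 2 and 25, so the median-based factor is 2 and is unaffected by the single strongly upregulated gene.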

Member Function Documentation

◆ set_center()

MedianSizeFactors & scran::MedianSizeFactors::set_center ( bool  c = Defaults::center)
inline
Parameters
c  Whether to center the size factors to have a mean of unity. This is usually desirable for interpretation of relative values.
Returns
A reference to this MedianSizeFactors object.

For more control over centering, this can be set to false and the resulting size factors can be passed to CenterSizeFactors.
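Centering to a mean of unity amounts to dividing every size factor by their average. The helper below is a hypothetical stand-alone sketch of that step (center_to_unit_mean is not a scran function, and CenterSizeFactors may offer options that are not reproduced here).

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Scale the size factors in place so that their mean is 1.
// Ratios between any two factors are left unchanged.
inline void center_to_unit_mean(std::vector<double>& factors) {
    if (factors.empty()) {
        return;
    }
    double total = std::accumulate(factors.begin(), factors.end(), 0.0);
    double mean = total / static_cast<double>(factors.size());
    assert(mean > 0); // all-zero factors cannot be centered
    for (auto& f : factors) {
        f /= mean;
    }
}
```

Applied to the factors {1, 2, 3}, this yields {0.5, 1, 1.5}: the mean becomes 1 while the relative differences between profiles are preserved.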

◆ set_prior_count()

MedianSizeFactors & scran::MedianSizeFactors::set_prior_count ( double  p = Defaults::prior_count)
inline
Parameters
p  Prior count to use for shrinking median-based size factors towards their library size-based counterparts. Larger values result in more shrinkage, while a value of zero will disable shrinkage altogether.
Returns
A reference to this MedianSizeFactors object.

When using shrinkage, we add a scaled version of the reference profile to each expression profile before computing the ratios. The scaling of the reference profile varies for each profile and is proportional to the (relative) total count of that profile. This implicitly pushes the median-based size factor towards a value that is proportional to the library size of the profile, because the median ratio of a scaled copy of the reference against the reference itself is just the scaling factor, which is in turn proportional to the library size.

The amount of shrinkage depends on the magnitude of the reference scaling. The prior count should be interpreted as the number of extra reads from the reference profile that are added to each profile. For example, the default of 10 means that the equivalent of 10 reads is added to each profile, distributed according to the reference profile. Increasing the prior count strengthens the shrinkage, as the reference profile makes a greater contribution to the ratios.
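One way to read the shrinkage described above is sketched here. This is an assumption-laden illustration rather than the library's code: the function name shrunken_median_ratio is hypothetical, the added pseudo-profile is taken to be prior_count * relative_lib_size * ref[g] / sum(ref), and genes with a zero reference value are skipped. The exact scaling used by scran may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Median-based size factor for one profile after adding a scaled copy of
// the reference. 'relative_lib_size' is the profile's total count divided
// by the average total count across profiles (assumed scaling).
inline double shrunken_median_ratio(const std::vector<double>& profile,
                                    const std::vector<double>& ref,
                                    double prior_count,
                                    double relative_lib_size) {
    assert(profile.size() == ref.size());
    double ref_total = std::accumulate(ref.begin(), ref.end(), 0.0);
    assert(ref_total > 0);

    // Total number of pseudo-reads added is prior_count * relative_lib_size,
    // distributed across genes in proportion to the reference profile.
    double scale = prior_count * relative_lib_size / ref_total;

    std::vector<double> ratios;
    for (std::size_t g = 0; g < profile.size(); ++g) {
        if (ref[g] > 0) {
            ratios.push_back((profile[g] + scale * ref[g]) / ref[g]);
        }
    }
    if (ratios.empty()) {
        return 0.0;
    }

    // Plain sort-based median of the ratios.
    std::sort(ratios.begin(), ratios.end());
    std::size_t n = ratios.size();
    if (n % 2 == 1) {
        return ratios[n / 2];
    }
    return (ratios[n / 2 - 1] + ratios[n / 2]) / 2.0;
}
```

With a sparse profile {0, 0, 8} and reference {1, 1, 2}, a prior count of zero gives a median ratio of 0, whereas a prior count of 4 at a relative library size of 1 gives 1. Under this formulation the shrunken factor is never zero for a profile with a non-zero total count.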

◆ set_num_threads()

MedianSizeFactors & scran::MedianSizeFactors::set_num_threads ( int  n = Defaults::num_threads)
inline
Parameters
n  Number of threads to use.
Returns
A reference to this MedianSizeFactors object.

◆ run() [1/2]

template<typename T , typename IDX , typename Ref , typename Out >
void scran::MedianSizeFactors::run ( const tatami::Matrix< T, IDX > *  mat,
const Ref *  ref,
Out *  output 
) const
inline

Compute per-column size factors against a user-supplied reference profile.

Template Parameters
T  Numeric data type of the input matrix.
IDX  Integer index type of the input matrix.
Ref  Numeric data type of the reference profile.
Out  Numeric data type of the output vector.
Parameters
mat  Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[in]  ref  Pointer to an array containing the reference expression profile to normalize against. This should be of length equal to the number of rows in mat and should contain non-negative values.
[out]  output  Pointer to an array in which to store the output size factors. This should be of length equal to the number of columns in mat.

◆ run_with_mean() [1/2]

template<typename T , typename IDX , typename Out >
void scran::MedianSizeFactors::run_with_mean ( const tatami::Matrix< T, IDX > *  mat,
Out *  output 
) const
inline

Compute per-column size factors against an average pseudo-sample constructed from the row means of the input matrix.

Template Parameters
T  Numeric data type of the input matrix.
IDX  Integer index type of the input matrix.
Out  Numeric data type of the output vector.
Parameters
mat  Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[out]  output  Pointer to an array in which to store the output size factors. This should be of length equal to the number of columns in mat.
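The average pseudo-sample used by run_with_mean() is the vector of row means. A minimal stand-alone sketch of that construction follows; it uses nested std::vector columns instead of a tatami::Matrix, and the name row_means is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compute the per-gene mean across profiles.
// 'columns[j]' holds profile j, with one entry per gene (row).
inline std::vector<double> row_means(
        const std::vector<std::vector<double>>& columns) {
    std::vector<double> means;
    if (columns.empty()) {
        return means;
    }
    means.assign(columns.front().size(), 0.0);
    for (const auto& col : columns) {
        assert(col.size() == means.size()); // all profiles share the same genes
        for (std::size_t g = 0; g < col.size(); ++g) {
            means[g] += col[g];
        }
    }
    for (auto& m : means) {
        m /= static_cast<double>(columns.size());
    }
    return means;
}
```

The resulting vector has one entry per gene and could then serve as the reference profile for the median-ratio calculation, in the same role as the ref argument of run().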

◆ run() [2/2]

template<typename Out = double, typename T , typename IDX , typename Ref >
Results< Out > scran::MedianSizeFactors::run ( const tatami::Matrix< T, IDX > *  mat,
const Ref *  ref 
) const
inline

Compute per-column size factors against a user-supplied reference profile.

Template Parameters
Out  Numeric type for the size factors.
T  Numeric data type of the input matrix.
IDX  Integer index type of the input matrix.
Ref  Numeric data type of the reference profile.
Parameters
mat  Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
[in]  ref  Pointer to an array containing the reference expression profile to normalize against. This should be of length equal to the number of rows in mat and should contain non-negative values.
Returns
A Results object containing the size factors for each column in mat.

◆ run_with_mean() [2/2]

template<typename Out = double, typename T , typename IDX >
Results< Out > scran::MedianSizeFactors::run_with_mean ( const tatami::Matrix< T, IDX > *  mat) const
inline

Compute per-column size factors against an average pseudo-sample constructed from the row means of the input matrix.

Template Parameters
Out  Numeric type for the size factors.
T  Numeric data type of the input matrix.
IDX  Integer index type of the input matrix.
Parameters
mat  Matrix containing non-negative expression data, usually counts. Rows should be genes; columns may be cells, but are more typically some kind of aggregated pseudo-bulk profile.
Returns
A Results object containing the size factors for each column in mat.
