scran
C++ library for basic single-cell RNA-seq analyses
scran::SuggestRnaQcFilters Class Reference

Create filters to identify low-quality cells from RNA-derived QC metrics.

#include <SuggestRnaQcFilters.hpp>

Classes

struct  Defaults
 Default parameters.
 
struct  Thresholds
 Thresholds to define outliers on each metric.
 

Public Member Functions

SuggestRnaQcFilters &  set_detected_num_mads (double n=Defaults::num_mads)
 
SuggestRnaQcFilters &  set_sums_num_mads (double n=Defaults::num_mads)
 
SuggestRnaQcFilters &  set_subset_num_mads (double n=Defaults::num_mads)
 
SuggestRnaQcFilters &  set_num_mads (double n=Defaults::num_mads)
 
template<typename Float , typename Integer >
Thresholds run (size_t n, const PerCellRnaQcMetrics::Buffers< Float, Integer > &buffers) const
 
Thresholds run (const PerCellRnaQcMetrics::Results &metrics) const
 
template<typename Block , typename Float , typename Integer >
Thresholds run_blocked (size_t n, const Block *block, const PerCellRnaQcMetrics::Buffers< Float, Integer > &buffers) const
 
template<typename Block >
Thresholds run_blocked (const PerCellRnaQcMetrics::Results &metrics, const Block *block) const
 

Detailed Description

Create filters to identify low-quality cells from RNA-derived QC metrics.

Use an outlier-based approach on common QC metrics on the RNA data (see the PerCellRnaQcMetrics class) to identify low-quality cells. Specifically, low-quality cells are defined as those with low total counts, low numbers of detected features, or high subset proportions.

Outliers are defined on each metric by counting the number of MADs from the median value across all cells. This assumes that most cells in the experiment are of high (or at least acceptable) quality; any anomalies are indicative of low-quality cells that should be filtered out. See the ChooseOutlierFilters class for implementation details.
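The median-and-MAD rule described above can be sketched in a few lines of standalone C++. This is an illustrative sketch only, not the ChooseOutlierFilters implementation: the names `median`, `scaled_mad`, `Bounds` and `outlier_bounds` are hypothetical, and the 1.4826 scaling factor (which makes the MAD comparable to a standard deviation for normal data) is an assumption about how the MAD is scaled.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of the input, so the caller's vector is left untouched.
double median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : 0.5 * (x[n / 2 - 1] + x[n / 2]);
}

// Median absolute deviation from the median.
// ASSUMPTION: scaled by 1.4826, the usual consistency constant for normal data.
double scaled_mad(const std::vector<double>& x) {
    const double med = median(x);
    std::vector<double> dev;
    dev.reserve(x.size());
    for (double v : x) {
        dev.push_back(std::abs(v - med));
    }
    return 1.4826 * median(dev);
}

// Hypothetical container for the two bounds.
struct Bounds {
    double lower;
    double upper;
};

// Values further than 'num_mads' MADs from the median are treated as outliers.
Bounds outlier_bounds(const std::vector<double>& x, double num_mads) {
    assert(num_mads >= 0);
    const double med = median(x);
    const double delta = num_mads * scaled_mad(x);
    return Bounds{ med - delta, med + delta };
}
```

Calling `outlier_bounds` on a metric and comparing each cell against `lower` (for metrics where low values are bad) or `upper` (for metrics where high values are bad) gives a per-cell keep/discard decision.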

For the total counts and number of detected features, the outliers are defined after log-transformation of the metrics. This improves resolution at low values and ensures that the defined threshold is not negative. Note that all thresholds are still reported on the original scale, so no further exponentiation is required.
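To illustrate why the log-transformation keeps the threshold positive, the sketch below computes a lower threshold on the log scale and exponentiates it before returning. It is a simplified stand-in rather than the library's code: `median` and `log_lower_threshold` are hypothetical names, the 1.4826 MAD scaling is an assumption, and the sketch requires strictly positive inputs, whereas a real implementation would need a policy for zeros.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of the input.
double median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : 0.5 * (x[n / 2 - 1] + x[n / 2]);
}

// Lower threshold computed on the log scale, reported on the original scale.
// ASSUMPTION: all values are strictly positive (zeros are not handled here).
double log_lower_threshold(const std::vector<double>& values, double num_mads) {
    std::vector<double> logged;
    logged.reserve(values.size());
    for (double v : values) {
        assert(v > 0);
        logged.push_back(std::log(v));
    }

    const double med = median(logged);
    std::vector<double> dev;
    dev.reserve(logged.size());
    for (double v : logged) {
        dev.push_back(std::abs(v - med));
    }
    const double mad = 1.4826 * median(dev); // assumed scaling constant

    // exp() of any finite number is positive, so the threshold cannot be negative.
    return std::exp(med - num_mads * mad);
}
```

Because the exponentiation happens inside the function, the returned value can be compared directly against the raw total counts or detected-feature counts.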

For datasets with multiple blocks, SuggestRnaQcFilters::run_blocked() will compute block-specific thresholds for each metric. This assumes that differences in the metric distributions between blocks are driven by uninteresting technical causes (e.g., differences in sequencing depth); variable thresholds can adapt to each block's distribution for effective removal of outliers. However, if the differences in the distributions between blocks are primarily driven by biology, it may be preferable to ignore the blocking factor and use SuggestRnaQcFilters::run() on the entire dataset instead. This ensures that the same filter thresholds are consistently used for easier comparisons across blocks.
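The idea of block-specific thresholds can be sketched as follows: cells are grouped by their block assignment and a separate lower threshold is derived from each group. This is a hedged illustration, not the run_blocked() implementation. The function names are hypothetical, block assignments are assumed to be integers in [0, num_blocks), every block is assumed to contain at least one cell, and the 1.4826 MAD scaling is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a copy of the input.
double median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : 0.5 * (x[n / 2 - 1] + x[n / 2]);
}

// One lower threshold per block, each computed only from that block's cells.
std::vector<double> blocked_lower_thresholds(
    const std::vector<double>& values,
    const std::vector<int>& block,
    int num_blocks,
    double num_mads)
{
    assert(values.size() == block.size());

    // Split the metric by block assignment.
    std::vector<std::vector<double>> by_block(num_blocks);
    for (std::size_t i = 0; i < values.size(); ++i) {
        assert(block[i] >= 0 && block[i] < num_blocks);
        by_block[block[i]].push_back(values[i]);
    }

    // Apply the median - num_mads * MAD rule within each block.
    std::vector<double> thresholds;
    thresholds.reserve(by_block.size());
    for (const auto& vals : by_block) {
        const double med = median(vals);
        std::vector<double> dev;
        dev.reserve(vals.size());
        for (double v : vals) {
            dev.push_back(std::abs(v - med));
        }
        thresholds.push_back(med - num_mads * 1.4826 * median(dev));
    }
    return thresholds;
}
```

A cell in block `b` would then be compared against `thresholds[b]` rather than a single dataset-wide value, which is what lets the filter adapt to blocks with different sequencing depth.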

Member Function Documentation

◆ set_detected_num_mads()

SuggestRnaQcFilters & scran::SuggestRnaQcFilters::set_detected_num_mads ( double  n = Defaults::num_mads)
inline
Parameters
n  Number of MADs below the median, to define the threshold for outliers in the number of detected features. This should be non-negative.
Returns
Reference to this SuggestRnaQcFilters object.

◆ set_sums_num_mads()

SuggestRnaQcFilters & scran::SuggestRnaQcFilters::set_sums_num_mads ( double  n = Defaults::num_mads)
inline
Parameters
n  Number of MADs below the median, to define the threshold for outliers in the total count per cell. This should be non-negative.
Returns
Reference to this SuggestRnaQcFilters object.

◆ set_subset_num_mads()

SuggestRnaQcFilters & scran::SuggestRnaQcFilters::set_subset_num_mads ( double  n = Defaults::num_mads)
inline
Parameters
n  Number of MADs above the median, to define the threshold for outliers in the subset proportions. This should be non-negative.
Returns
Reference to this SuggestRnaQcFilters object.

◆ set_num_mads()

SuggestRnaQcFilters & scran::SuggestRnaQcFilters::set_num_mads ( double  n = Defaults::num_mads)
inline
Parameters
n  Number of MADs from the median, applied to all metrics. This overrides any previous calls to set_detected_num_mads(), set_sums_num_mads() and set_subset_num_mads(). This should be non-negative.
Returns
Reference to this SuggestRnaQcFilters object.
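Since each setter returns a reference to the object, calls can be chained, and the order of calls matters when set_num_mads() is involved. The sketch below mimics this behaviour with a hypothetical stand-in class, `MockFilters`, so that it compiles without the library; it is not the SuggestRnaQcFilters source, its public fields are purely for illustration, and the default of 3 MADs is an assumption about Defaults::num_mads.

```cpp
#include <cassert>

// Hypothetical stand-in that mirrors the documented setter interface.
class MockFilters {
public:
    // ASSUMPTION: a default of 3 MADs; check Defaults::num_mads for the real value.
    static constexpr double default_num_mads = 3;

    double detected_num_mads = default_num_mads;
    double sums_num_mads = default_num_mads;
    double subset_num_mads = default_num_mads;

    MockFilters& set_detected_num_mads(double n = default_num_mads) {
        assert(n >= 0);
        detected_num_mads = n;
        return *this;
    }

    MockFilters& set_sums_num_mads(double n = default_num_mads) {
        assert(n >= 0);
        sums_num_mads = n;
        return *this;
    }

    MockFilters& set_subset_num_mads(double n = default_num_mads) {
        assert(n >= 0);
        subset_num_mads = n;
        return *this;
    }

    // Sets all three values at once, discarding any earlier per-metric setting.
    MockFilters& set_num_mads(double n = default_num_mads) {
        assert(n >= 0);
        detected_num_mads = n;
        sums_num_mads = n;
        subset_num_mads = n;
        return *this;
    }
};
```

With this stand-in, `f.set_sums_num_mads(5).set_num_mads(2)` leaves every metric at 2, whereas `f.set_num_mads(2).set_subset_num_mads(4)` keeps 2 for the sums and detected features but uses 4 for the subset proportions. Calling the general setter first and the per-metric setters afterwards is therefore the order that preserves per-metric overrides.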

◆ run() [1/2]

template<typename Float , typename Integer >
Thresholds scran::SuggestRnaQcFilters::run ( size_t  n,
const PerCellRnaQcMetrics::Buffers< Float, Integer > &  buffers 
) const
inline
Template Parameters
Float  Floating-point type for the metrics.
Integer  Integer type for the metrics.
Parameters
n  Number of cells.
[in]  buffers  Pointers to arrays of length n, containing the per-cell RNA-derived metrics.
Returns
Filtering thresholds for each metric.

◆ run() [2/2]

Thresholds scran::SuggestRnaQcFilters::run ( const PerCellRnaQcMetrics::Results &  metrics) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
metrics  Collection of arrays of length equal to the number of cells, containing the per-cell RNA-derived metrics.
Returns
Filtering thresholds for each metric.

◆ run_blocked() [1/2]

template<typename Block , typename Float , typename Integer >
Thresholds scran::SuggestRnaQcFilters::run_blocked ( size_t  n,
const Block *  block,
const PerCellRnaQcMetrics::Buffers< Float, Integer > &  buffers 
) const
inline
Template Parameters
Block  Integer type for the block assignments.
Float  Floating-point type for the metrics.
Integer  Integer type for the metrics.
Parameters
n  Number of cells.
[in]  block  Pointer to an array of length n, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
[in]  buffers  Pointers to arrays of length n, containing the per-cell RNA-derived metrics.
Returns
Filtering thresholds for each metric in each block.

◆ run_blocked() [2/2]

template<typename Block >
Thresholds scran::SuggestRnaQcFilters::run_blocked ( const PerCellRnaQcMetrics::Results &  metrics,
const Block *  block 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Template Parameters
Block  Integer type for the block assignments.
Parameters
metrics  Collection of arrays of length equal to the number of cells, containing the per-cell RNA-derived metrics.
[in]  block  Pointer to an array of length equal to the number of cells, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
Returns
Filtering thresholds for each metric in each block.

The documentation for this class was generated from the following file: