scran
C++ library for basic single-cell RNA-seq analyses
Create filters to identify low-quality cells from RNA-derived QC metrics.
#include <SuggestRnaQcFilters.hpp>
Classes
struct | Defaults | Default parameters.
struct | Thresholds | Thresholds to define outliers on each metric.
Public Member Functions
SuggestRnaQcFilters & | set_detected_num_mads (double n=Defaults::num_mads)
SuggestRnaQcFilters & | set_sums_num_mads (double n=Defaults::num_mads)
SuggestRnaQcFilters & | set_subset_num_mads (double n=Defaults::num_mads)
SuggestRnaQcFilters & | set_num_mads (double n=Defaults::num_mads)
template<typename Float, typename Integer>
Thresholds | run (size_t n, const PerCellRnaQcMetrics::Buffers<Float, Integer> &buffers) const
Thresholds | run (const PerCellRnaQcMetrics::Results &metrics) const
template<typename Block, typename Float, typename Integer>
Thresholds | run_blocked (size_t n, const Block *block, const PerCellRnaQcMetrics::Buffers<Float, Integer> &buffers) const
template<typename Block>
Thresholds | run_blocked (const PerCellRnaQcMetrics::Results &metrics, const Block *block) const
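This class uses an outlier-based approach on common QC metrics computed from the RNA data (see the PerCellRnaQcMetrics class) to identify low-quality cells. Specifically, low-quality cells are defined as those with low total counts, low numbers of detected features, or high subset proportions.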
Outliers are defined on each metric by counting the number of median absolute deviations (MADs) from the median value across all cells. This assumes that most cells in the experiment are of high (or at least acceptable) quality, so that any anomalies indicate low-quality cells that should be filtered out. See the ChooseOutlierFilters class for implementation details.
For the total counts and number of detected features, the outliers are defined after log-transformation of the metrics. This improves resolution at low values and ensures that the defined threshold is not negative. Note that all thresholds are still reported on the original scale, so no further exponentiation is required.
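As a rough illustration of the log-scale approach described above, the following standalone sketch computes a lower threshold for a strictly positive metric and reports it on the original scale. It does not use the scran API: the function names `sketch_median` and `lower_threshold_on_log_scale` are hypothetical, the MAD is left unscaled, and zeros are not handled. The implementation in ChooseOutlierFilters may differ on all of these points.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty vector (taken by value so that it can be sorted).
double sketch_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Lower outlier threshold: the median minus 'num_mads' MADs, computed on
// log-transformed values and exponentiated back to the original scale.
// Assumption: every value is strictly positive and the MAD is unscaled.
double lower_threshold_on_log_scale(const std::vector<double>& metric, double num_mads) {
    std::vector<double> logged;
    logged.reserve(metric.size());
    for (double v : metric) {
        assert(v > 0);
        logged.push_back(std::log(v));
    }

    double med = sketch_median(logged);

    std::vector<double> deviations;
    deviations.reserve(logged.size());
    for (double v : logged) {
        deviations.push_back(std::abs(v - med));
    }
    double mad = sketch_median(deviations);

    // Exponentiating guarantees a positive threshold on the original scale.
    return std::exp(med - num_mads * mad);
}
```

Because the threshold is the exponential of a finite value, it cannot be negative, no matter how many MADs are subtracted.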
For datasets with multiple blocks, SuggestRnaQcFilters::run_blocked() will compute block-specific thresholds for each metric. This assumes that differences in the metric distributions between blocks are driven by uninteresting technical causes (e.g., differences in sequencing depth); variable thresholds can adapt to each block's distribution for effective removal of outliers. However, if the differences in the distributions between blocks are primarily driven by biology, it may be preferable to ignore the blocking factor and use SuggestRnaQcFilters::run() on the entire dataset instead. This ensures that the same filter thresholds are consistently used for easier comparisons across blocks.
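To make the idea of block-specific thresholds concrete, here is a minimal standalone sketch that computes one lower threshold per block on the log scale. It is not the scran implementation: the names `block_median` and `blocked_lower_thresholds` are hypothetical, block identifiers are assumed to be consecutive integers starting at 0 with at least one cell each, the MAD is unscaled, and all metric values are assumed to be strictly positive. A null `block` pointer is treated as a single block, mirroring the behavior documented for run_blocked().

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Median of a non-empty vector (taken by value so that it can be sorted).
double block_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// One lower threshold per block, computed on the log scale and reported
// on the original scale. 'block' may be null, meaning a single block.
std::vector<double> blocked_lower_thresholds(
    std::size_t n, const int* block, const double* metric, double num_mads)
{
    assert(n > 0);
    std::size_t nblocks = 1;
    if (block != nullptr) {
        nblocks = static_cast<std::size_t>(*std::max_element(block, block + n)) + 1;
    }

    // Collect the log-transformed values separately for each block.
    std::vector<std::vector<double>> by_block(nblocks);
    for (std::size_t i = 0; i < n; ++i) {
        assert(metric[i] > 0);
        std::size_t b = (block != nullptr) ? static_cast<std::size_t>(block[i]) : 0;
        by_block[b].push_back(std::log(metric[i]));
    }

    std::vector<double> thresholds(nblocks);
    for (std::size_t b = 0; b < nblocks; ++b) {
        double med = block_median(by_block[b]);
        std::vector<double> deviations;
        for (double v : by_block[b]) {
            deviations.push_back(std::abs(v - med));
        }
        double mad = block_median(deviations);
        thresholds[b] = std::exp(med - num_mads * mad);
    }
    return thresholds;
}
```

With two blocks that differ only in depth, each block receives its own threshold, so cells in the shallower block are not discarded merely for belonging to that block.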
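SuggestRnaQcFilters & set_detected_num_mads (double n=Defaults::num_mads) | inline
Parameters
n | Number of MADs below the median, to define the threshold for outliers in the number of detected features. This should be non-negative.
Returns
A reference to this SuggestRnaQcFilters object.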
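SuggestRnaQcFilters & set_sums_num_mads (double n=Defaults::num_mads) | inline
Parameters
n | Number of MADs below the median, to define the threshold for outliers in the total count per cell. This should be non-negative.
Returns
A reference to this SuggestRnaQcFilters object.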
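SuggestRnaQcFilters & set_subset_num_mads (double n=Defaults::num_mads) | inline
Parameters
n | Number of MADs above the median, to define the threshold for outliers in the subset proportions. This should be non-negative.
Returns
A reference to this SuggestRnaQcFilters object.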
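SuggestRnaQcFilters & set_num_mads (double n=Defaults::num_mads) | inline
Parameters
n | Number of MADs from the median for all metrics, overriding previous calls to set_sums_num_mads() and its counterparts. This should be non-negative.
Returns
A reference to this SuggestRnaQcFilters object.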
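Because each setter returns a reference to the object, calls can be chained, and the order of calls determines which value is kept. The sketch below imitates this pattern with a hypothetical stand-in class, `MockFilterSettings`. It is not part of scran, and the placeholder default of 3 is an assumption rather than the documented value of Defaults::num_mads.

```cpp
#include <cassert>

// Hypothetical stand-in that mirrors the setter names documented above.
struct MockFilterSettings {
    // Placeholder default; the real value lives in Defaults::num_mads.
    static constexpr double placeholder_default = 3.0;

    double detected_num_mads = placeholder_default;
    double sums_num_mads = placeholder_default;
    double subset_num_mads = placeholder_default;

    MockFilterSettings& set_detected_num_mads(double n = placeholder_default) {
        assert(n >= 0);
        detected_num_mads = n;
        return *this;
    }

    MockFilterSettings& set_sums_num_mads(double n = placeholder_default) {
        assert(n >= 0);
        sums_num_mads = n;
        return *this;
    }

    MockFilterSettings& set_subset_num_mads(double n = placeholder_default) {
        assert(n >= 0);
        subset_num_mads = n;
        return *this;
    }

    // Sets all three values at once, replacing any earlier per-metric choice.
    MockFilterSettings& set_num_mads(double n = placeholder_default) {
        assert(n >= 0);
        detected_num_mads = n;
        sums_num_mads = n;
        subset_num_mads = n;
        return *this;
    }
};
```

In this mock, `settings.set_sums_num_mads(2).set_num_mads(4)` leaves every metric at 4, whereas calling `set_num_mads(4)` first and `set_sums_num_mads(2)` afterwards keeps the metric-specific value of 2 for the total counts.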
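template<typename Float, typename Integer>
Thresholds run (size_t n, const PerCellRnaQcMetrics::Buffers<Float, Integer> &buffers) const | inline
Template Parameters
Float | Floating point type for the metrics.
Integer | Integer type for the metrics.
Parameters
n | Number of cells.
[in] | buffers | Pointers to arrays of length n, containing the per-cell RNA-derived metrics.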
Thresholds run (const PerCellRnaQcMetrics::Results &metrics) const | inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Parameters
metrics | Collection of arrays of length equal to the number of cells, containing the per-cell RNA-derived metrics.
template<typename Block, typename Float, typename Integer>
Thresholds run_blocked (size_t n, const Block *block, const PerCellRnaQcMetrics::Buffers<Float, Integer> &buffers) const | inline
Template Parameters
Block | Integer type for the block assignments.
Float | Floating point type for the metrics.
Integer | Integer type for the metrics.
Parameters
n | Number of cells.
[in] | block | Pointer to an array of length n, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
[in] | buffers | Pointers to arrays of length n, containing the per-cell RNA-derived metrics.
template<typename Block>
Thresholds run_blocked (const PerCellRnaQcMetrics::Results &metrics, const Block *block) const | inline
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Template Parameters
Block | Integer type for the block assignments.
Parameters
metrics | Collection of arrays of length equal to the number of cells, containing the per-cell RNA-derived metrics.
[in] | block | Pointer to an array of length equal to the number of cells, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.