scran
C++ library for basic single-cell RNA-seq analyses
Create filters to identify low-quality cells from ADT-derived QC metrics.
#include <SuggestAdtQcFilters.hpp>
Classes
struct Defaults | Default parameters.
struct Thresholds | Thresholds to define outliers on each metric.
Public Member Functions
SuggestAdtQcFilters & set_detected_num_mads (double n=Defaults::num_mads)
SuggestAdtQcFilters & set_subset_num_mads (double n=Defaults::num_mads)
SuggestAdtQcFilters & set_num_mads (double n=Defaults::num_mads)
SuggestAdtQcFilters & set_min_detected_drop (double m=Defaults::min_detected_drop)
template<typename Float, typename Integer> Thresholds run (size_t n, const PerCellAdtQcMetrics::Buffers<Float, Integer> &buffers) const
Thresholds run (const PerCellAdtQcMetrics::Results &metrics) const
template<typename Block, typename Float, typename Integer> Thresholds run_blocked (size_t n, const Block *block, const PerCellAdtQcMetrics::Buffers<Float, Integer> &buffers) const
template<typename Block> Thresholds run_blocked (const PerCellAdtQcMetrics::Results &metrics, const Block *block) const
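In antibody-derived tag (ADT) count matrices, the QC filtering decisions are slightly different from those for RNA count matrices (see SuggestRnaQcFilters for the latter). Here, low-quality cells are defined as those with a low number of detected features or a high total count in a feature subset (for example, the total IgG count).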
We define a threshold on each metric based on a certain number of MADs from the median. This assumes that most cells in the experiment are of high (or at least acceptable) quality; any outliers are indicative of low-quality cells that should be filtered out. See the ChooseOutlierFilters class for implementation details.
For the number of detected features and the total IgG counts, the outliers are defined after log-transformation of the metrics. This improves resolution at low values and ensures that the defined threshold is not negative. Note that all thresholds are still reported on the original scale, so no further exponentiation is required.
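As a rough illustration of why the log-transformation keeps the threshold positive, the following sketch computes a lower threshold from the median and MAD of the log-values and then exponentiates it. The helper names `median_of` and `lower_threshold_log` are hypothetical, the 1.4826 scaling of the MAD is an assumption, and all inputs are assumed to be strictly positive; this is not the ChooseOutlierFilters implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a copy of the input, which must be non-empty.
inline double median_of(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Hypothetical helper: threshold at `num_mads` MADs below the median of the
// log-values, reported on the original scale. Assumes all values are > 0.
inline double lower_threshold_log(const std::vector<double>& metric, double num_mads) {
    std::vector<double> logged;
    for (double v : metric) {
        logged.push_back(std::log(v));
    }
    double med = median_of(logged);

    std::vector<double> deviations;
    for (double v : logged) {
        deviations.push_back(std::abs(v - med));
    }
    // Assumption: scale the MAD to be comparable to a standard deviation.
    double mad = 1.4826 * median_of(deviations);

    // exp() of a finite value is always positive, so the threshold cannot go below zero.
    return std::exp(med - num_mads * mad);
}
```

For the values 1, 2, 4, 8 and 16 with three MADs, this gives a threshold of roughly 0.18, whereas the same calculation on the untransformed values would be negative.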
For the number of detected features, we supplement the MAD-based threshold with a minimum drop. Cells are only considered to be low quality if the difference in the number of detected features from the median is greater than a certain percentage. By default, the number must drop by at least 10% from the median. This avoids overly aggressive filtering when the MAD is zero due to the discrete nature of this statistic in datasets with few tags.
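One way to read the minimum drop is as a cap on the MAD-based threshold. The sketch below assumes that cells are flagged when their number of detected features falls below the threshold; the function name `apply_min_drop` is hypothetical and the exact way the library combines the two criteria is not shown here.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: cap a MAD-based lower threshold so that a cell is only
// flagged if its value is at least a fraction `min_drop` below the median.
inline double apply_min_drop(double mad_threshold, double median, double min_drop) {
    assert(min_drop >= 0.0 && min_drop < 1.0);
    return std::min(mad_threshold, (1.0 - min_drop) * median);
}
```

With a median of 20 detected features and a MAD of zero, the MAD-based threshold is also 20, so any cell with 19 features would be flagged. Applying a minimum drop of 0.1 lowers the threshold to 18, and the cell with 19 features is retained.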
For datasets with multiple blocks, SuggestAdtQcFilters::run_blocked() will compute block-specific thresholds for each metric. See comments in SuggestRnaQcFilters for more details.
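The idea of block-specific thresholds can be sketched as follows: cells are grouped by their block assignment, and a separate upper threshold is computed within each group. The names `median_of` and `blocked_upper_thresholds` are hypothetical, the 1.4826 MAD scaling is an assumption, and the sketch assumes positive values, zero-based consecutive block identifiers and at least one cell per block. It is not the run_blocked() implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a copy of the input, which must be non-empty.
inline double median_of(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Hypothetical helper: one upper threshold per block, computed on the log-scale
// and reported on the original scale. A null `block` puts all cells in one block.
inline std::vector<double> blocked_upper_thresholds(
    const std::vector<double>& metric, const int* block, double num_mads)
{
    std::size_t n = metric.size();
    assert(n > 0);

    int num_blocks = 1;
    if (block != nullptr) {
        num_blocks = *std::max_element(block, block + n) + 1;
    }

    // Collect the log-values of the cells in each block.
    std::vector<std::vector<double>> by_block(num_blocks);
    for (std::size_t i = 0; i < n; ++i) {
        by_block[block != nullptr ? block[i] : 0].push_back(std::log(metric[i]));
    }

    std::vector<double> thresholds;
    for (const auto& logged : by_block) {
        double med = median_of(logged);
        std::vector<double> deviations;
        for (double v : logged) {
            deviations.push_back(std::abs(v - med));
        }
        double mad = 1.4826 * median_of(deviations); // assumed scaling
        thresholds.push_back(std::exp(med + num_mads * mad));
    }
    return thresholds;
}
```

Computing thresholds within each block means that a block with systematically higher counts does not cause all of its cells to be flagged as outliers relative to the other blocks.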
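SuggestAdtQcFilters & set_detected_num_mads (double n=Defaults::num_mads) [inline]
Parameters
n | Number of MADs below the median, to define the threshold for outliers in the number of detected features. This should be non-negative.
Returns
A reference to this SuggestAdtQcFilters object.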
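SuggestAdtQcFilters & set_subset_num_mads (double n=Defaults::num_mads) [inline]
Parameters
n | Number of MADs above the median, to define the threshold for outliers in the total count for each subset. This should be non-negative.
Returns
A reference to this SuggestAdtQcFilters object.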
SuggestAdtQcFilters & set_num_mads (double n=Defaults::num_mads) [inline]
Parameters
n | Number of MADs from the median, overriding previous calls to set_detected_num_mads() and set_subset_num_mads(). This should be non-negative.
Returns
A reference to this SuggestAdtQcFilters object.
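SuggestAdtQcFilters & set_min_detected_drop (double m=Defaults::min_detected_drop) [inline]
Parameters
m | Minimum drop in the number of detected features from the median, in order to consider a cell to be of low quality. This should lie in $[0, 1)$.
Returns
A reference to this SuggestAdtQcFilters object.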
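template<typename Float, typename Integer> Thresholds run (size_t n, const PerCellAdtQcMetrics::Buffers<Float, Integer> &buffers) const [inline]
Template Parameters
Float | Floating point type for the metrics.
Integer | Integer type for the metrics.
Parameters
n | Number of cells.
[in] | buffers | Pointers to arrays of length n, containing the per-cell ADT-derived metrics.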
Thresholds run (const PerCellAdtQcMetrics::Results &metrics) const [inline]
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Parameters
metrics | Collection of arrays of length equal to the number of cells, containing the per-cell ADT-derived metrics.
template<typename Block, typename Float, typename Integer> Thresholds run_blocked (size_t n, const Block *block, const PerCellAdtQcMetrics::Buffers<Float, Integer> &buffers) const [inline]
Template Parameters
Block | Integer type for the block assignments.
Float | Floating point type for the metrics.
Integer | Integer type for the metrics.
Parameters
n | Number of cells.
[in] | block | Pointer to an array of length n, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
[in] | buffers | Pointers to arrays of length n, containing the per-cell ADT-derived metrics. Only detected and subset_totals are used; sums does not need to be set.
template<typename Block> Thresholds run_blocked (const PerCellAdtQcMetrics::Results &metrics, const Block *block) const [inline]
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Template Parameters
Block | Integer type for the block assignments.
Parameters
metrics | Collection of arrays of length equal to the number of cells, containing the per-cell ADT-derived metrics. Only detected and subset_totals are used; sums does not need to be set.
[in] | block | Pointer to an array of length equal to the number of cells, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.