scran
C++ library for basic single-cell RNA-seq analyses
scran::SuggestCrisprQcFilters Class Reference

Create filters to identify low-quality cells from CRISPR-derived QC metrics.

#include <SuggestCrisprQcFilters.hpp>

Classes

struct  Defaults
 Default parameters.
 
struct  Thresholds
 Thresholds to define outliers on each metric.
 

Public Member Functions

SuggestCrisprQcFilters & set_num_mads (double n=Defaults::num_mads)
 
template<typename Float , typename Integer >
Thresholds run (size_t n, const PerCellCrisprQcMetrics::Buffers< Float, Integer > &buffers) const
 
Thresholds run (const PerCellCrisprQcMetrics::Results &metrics) const
 
template<typename Block , typename Float , typename Integer >
Thresholds run_blocked (size_t n, const Block *block, const PerCellCrisprQcMetrics::Buffers< Float, Integer > &buffers) const
 
template<typename Block >
Thresholds run_blocked (const PerCellCrisprQcMetrics::Results &metrics, const Block *block) const
 

Detailed Description

Create filters to identify low-quality cells from CRISPR-derived QC metrics.

In CRISPR guide count matrices, the QC filtering decisions are somewhat different from those for the other modalities. Here, low-quality cells are defined as those with a low count for the most abundant guide, i.e., a low maximum count.

Directly defining a threshold on the maximum count is somewhat tricky as unsuccessful transfection is not uncommon. This often results in a large subpopulation with low maximum counts, inflating the MAD and compromising the threshold calculation. Instead, we use the following approach:

  1. Compute the median of the proportion of counts in the most abundant guide (i.e., the maximum proportion),
  2. Subset the cells to only those with maximum proportions above the median,
  3. Define a threshold for low outliers on the log-transformed maximum count within the subset.

This assumes that over 50% of cells were successfully transfected with a single guide construct and have high maximum proportions. In contrast, unsuccessful transfections will be dominated by ambient contamination and have low proportions. By taking the subset above the median proportion, we remove all of the unsuccessful transfections and enrich for mostly-high-quality cells. From there, we can apply the usual outlier detection methods on the maximum count, with log-transformation to avoid a negative threshold.
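The three steps above can be sketched in standalone C++ as follows. This is a minimal illustration rather than the library's implementation: the helper names `median_of` and `suggest_max_count_threshold` are hypothetical, the maximum count is assumed to be the product of the per-cell sum and the maximum proportion, the default of 3 MADs and the 1.4826 scaling of the MAD are assumed conventions, and cells exactly at the median proportion are assumed to be kept in the subset.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Median of a copy of the input; returns NaN for an empty input.
inline double median_of(std::vector<double> x) {
    if (x.empty()) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size(), half = n / 2;
    return (n % 2 == 1) ? x[half] : (x[half - 1] + x[half]) / 2.0;
}

// Hypothetical helper (not part of scran): lower threshold on the maximum count.
inline double suggest_max_count_threshold(
    const std::vector<double>& sums,           // total guide count per cell
    const std::vector<double>& max_proportion, // proportion in the top guide per cell
    double num_mads = 3.0)                     // assumed default, for illustration only
{
    assert(sums.size() == max_proportion.size());

    // Step 1: median of the maximum proportion across all cells.
    const double med_prop = median_of(max_proportion);

    // Step 2: keep cells at or above the median (ties kept by assumption),
    // and record the log-transformed maximum count for each of them.
    // Zero counts give -inf here; this sketch does not special-case them.
    std::vector<double> log_max;
    for (std::size_t i = 0; i < sums.size(); ++i) {
        if (max_proportion[i] >= med_prop) {
            log_max.push_back(std::log(sums[i] * max_proportion[i]));
        }
    }

    // Step 3: low-outlier threshold as median minus num_mads scaled MADs,
    // computed on the log scale and transformed back to a count.
    const double med = median_of(log_max);
    std::vector<double> dev;
    for (double v : log_max) {
        dev.push_back(std::abs(v - med));
    }
    const double mad = 1.4826 * median_of(dev); // assumed normal-consistency scaling
    return std::exp(med - num_mads * mad);
}
```

Because the threshold is computed on the log scale and then exponentiated, it is always positive, which is the point of the log-transformation mentioned above.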

Keep in mind that the maximum proportion is only used to define the subset for threshold calculation. Once the maximum count threshold is computed, it is applied to all cells, regardless of their maximum proportions. This allows us to recover good cells that would have been filtered out by our aggressive median subset. It also ensures that we do not remove cells transfected with multiple guides - such cells are not necessarily uninteresting, e.g., for examining interaction effects, so we will err on the side of caution and leave them in.
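To illustrate this, the sketch below applies a maximum count threshold to every cell without consulting the maximum proportion as a filter. The helper name `flag_low_max_count` is hypothetical, the maximum count is again assumed to be the per-cell sum multiplied by the maximum proportion, and treating only counts strictly below the threshold as low quality is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of scran): returns 1 for cells to discard, 0 otherwise.
inline std::vector<std::uint8_t> flag_low_max_count(
    const std::vector<double>& sums,           // total guide count per cell
    const std::vector<double>& max_proportion, // proportion in the top guide per cell
    double threshold)                          // lower bound on the maximum count
{
    assert(sums.size() == max_proportion.size());
    std::vector<std::uint8_t> discard(sums.size());
    for (std::size_t i = 0; i < sums.size(); ++i) {
        // The proportion only serves to recover the maximum count;
        // it is not itself compared against any cutoff.
        const double max_count = sums[i] * max_proportion[i];
        discard[i] = (max_count < threshold) ? 1 : 0;
    }
    return discard;
}
```

With a threshold of 40, a cell with 100 total counts split evenly between two guides has a maximum count of 50 and is retained despite its maximum proportion of only 0.5.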

For datasets with multiple blocks, SuggestCrisprQcFilters::run_blocked() will compute block-specific thresholds for the maximum count. See comments in SuggestRnaQcFilters for more details.
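A rough sketch of what block-specific thresholds could look like is given below. It is not the library's code: the helper names are hypothetical, block assignments are assumed to be 0-based integers below `num_blocks`, and it assumes that every step (median proportion, subsetting, median and MAD of the log-transformed maximum count) is carried out separately within each block, with the MAD scaled by 1.4826.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Median of a copy of the input; returns NaN for an empty input.
inline double block_median(std::vector<double> x) {
    if (x.empty()) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size(), half = n / 2;
    return (n % 2 == 1) ? x[half] : (x[half - 1] + x[half]) / 2.0;
}

// Hypothetical helper (not part of scran): one maximum-count threshold per block.
inline std::vector<double> suggest_blocked_thresholds(
    const std::vector<double>& sums,
    const std::vector<double>& max_proportion,
    const std::vector<int>& block, // assumed 0-based block ID per cell
    int num_blocks,
    double num_mads)
{
    assert(sums.size() == max_proportion.size());
    assert(sums.size() == block.size());

    // Median of the maximum proportion within each block.
    std::vector<std::vector<double>> props(num_blocks);
    for (std::size_t i = 0; i < block.size(); ++i) {
        props[block[i]].push_back(max_proportion[i]);
    }
    std::vector<double> med_prop(num_blocks);
    for (int b = 0; b < num_blocks; ++b) {
        med_prop[b] = block_median(props[b]);
    }

    // Log-transformed maximum counts for cells at or above their block's median.
    std::vector<std::vector<double>> log_max(num_blocks);
    for (std::size_t i = 0; i < block.size(); ++i) {
        if (max_proportion[i] >= med_prop[block[i]]) {
            log_max[block[i]].push_back(std::log(sums[i] * max_proportion[i]));
        }
    }

    // Median minus num_mads scaled MADs per block; empty blocks yield NaN.
    std::vector<double> thresholds(num_blocks);
    for (int b = 0; b < num_blocks; ++b) {
        const double med = block_median(log_max[b]);
        std::vector<double> dev;
        for (double v : log_max[b]) {
            dev.push_back(std::abs(v - med));
        }
        thresholds[b] = std::exp(med - num_mads * 1.4826 * block_median(dev));
    }
    return thresholds;
}
```

Each cell would then be compared against the threshold of its own block, so that a block with systematically deeper sequencing does not dictate the cutoff for the others.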

Member Function Documentation

◆ set_num_mads()

SuggestCrisprQcFilters & scran::SuggestCrisprQcFilters::set_num_mads ( double  n = Defaults::num_mads)
inline
Parameters
n  Number of MADs below the median, to define the threshold for outliers in the maximum count. This should be non-negative.
Returns
Reference to this SuggestCrisprQcFilters object.

◆ run() [1/2]

template<typename Float , typename Integer >
Thresholds scran::SuggestCrisprQcFilters::run ( size_t  n,
const PerCellCrisprQcMetrics::Buffers< Float, Integer > &  buffers 
) const
inline
Template Parameters
Float  Floating point type for the metrics.
Integer  Integer type for the metrics.
Parameters
n  Number of cells.
[in] buffers  Pointers to arrays of length n, containing the per-cell CRISPR-derived metrics.
Returns
Filtering thresholds for each metric.

◆ run() [2/2]

Thresholds scran::SuggestCrisprQcFilters::run ( const PerCellCrisprQcMetrics::Results &  metrics ) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Parameters
metrics  Collection of arrays of length equal to the number of cells, containing the per-cell CRISPR-derived metrics.
Returns
Filtering thresholds for each metric.

◆ run_blocked() [1/2]

template<typename Block , typename Float , typename Integer >
Thresholds scran::SuggestCrisprQcFilters::run_blocked ( size_t  n,
const Block *  block,
const PerCellCrisprQcMetrics::Buffers< Float, Integer > &  buffers 
) const
inline
Template Parameters
Block  Integer type for the block assignments.
Float  Floating point type for the metrics.
Integer  Integer type for the metrics.
Parameters
n  Number of cells.
[in] block  Pointer to an array of length n, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
[in] buffers  Pointers to arrays of length n, containing the per-cell CRISPR-derived metrics. Only max_proportion and sums are used; detected is ignored and does not need to be set.
Returns
Filtering thresholds for each metric in each block.

◆ run_blocked() [2/2]

template<typename Block >
Thresholds scran::SuggestCrisprQcFilters::run_blocked ( const PerCellCrisprQcMetrics::Results &  metrics,
const Block *  block 
) const
inline

This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.

Template Parameters
Block  Integer type for the block assignments.
Parameters
metrics  Collection of arrays of length equal to the number of cells, containing the per-cell CRISPR-derived metrics. Only max_proportion and sums are used; detected is ignored and does not need to be set.
[in] block  Pointer to an array of length equal to the number of cells, containing the block assignments for each cell. This may be NULL, in which case all cells are assumed to belong to the same block.
Returns
Filtering thresholds for each metric in each block.

The documentation for this class was generated from the following file: