|
ModelGeneVariances & | set_span (double s=FitVarianceTrend::Defaults::span) |
|
ModelGeneVariances & | set_minimum_mean (double m=FitVarianceTrend::Defaults::minimum_mean) |
|
ModelGeneVariances & | set_use_fixed_width (bool u=FitVarianceTrend::Defaults::use_fixed_width) |
|
ModelGeneVariances & | set_fixed_width (double f=FitVarianceTrend::Defaults::fixed_width) |
|
ModelGeneVariances & | set_minimum_window_count (int c=FitVarianceTrend::Defaults::minimum_window_count) |
|
ModelGeneVariances & | set_block_weight_policy (WeightPolicy w=Defaults::block_weight_policy) |
|
ModelGeneVariances & | set_variable_block_weight_parameters (VariableBlockWeightParameters v=Defaults::variable_block_weight_parameters) |
|
ModelGeneVariances & | set_compute_average (bool a=Defaults::compute_average) |
|
ModelGeneVariances & | set_num_threads (int n=Defaults::num_threads) |
|
template<typename Value_ , typename Index_ , typename Stat_ > |
void | run (const tatami::Matrix< Value_, Index_ > *mat, Stat_ *means, Stat_ *variances, Stat_ *fitted, Stat_ *residuals) const |
|
template<typename Value_ , typename Index_ , typename Block_ , typename Stat_ > |
void | run_blocked (const tatami::Matrix< Value_, Index_ > *mat, const Block_ *block, std::vector< Stat_ * > means, std::vector< Stat_ * > variances, std::vector< Stat_ * > fitted, std::vector< Stat_ * > residuals, Stat_ *ave_means, Stat_ *ave_variances, Stat_ *ave_fitted, Stat_ *ave_residuals) const |
|
template<typename Value_ , typename Index_ , typename Block_ , typename Stat_ > |
void | run_blocked (const tatami::Matrix< Value_, Index_ > *mat, const Block_ *block, std::vector< Stat_ * > means, std::vector< Stat_ * > variances, std::vector< Stat_ * > fitted, std::vector< Stat_ * > residuals) const |
|
template<typename Value_ , typename Index_ > |
Results | run (const tatami::Matrix< Value_, Index_ > *mat) const |
|
template<typename Value_ , typename Index_ , typename Block_ > |
BlockResults | run_blocked (const tatami::Matrix< Value_, Index_ > *mat, const Block_ *block) const |
|
Compute and model the per-gene variances in log-expression data.
This scans through a log-transformed normalized expression matrix (e.g., from LogNormCounts) and computes per-feature means and variances. It then fits a trend to the variances with respect to the means using FitVarianceTrend. We assume that most genes at any given abundance are not highly variable, such that the fitted value of the trend is interpreted as the "uninteresting" variance - this is mostly attributed to technical variation like sequencing noise, but can also represent constitutive biological noise like transcriptional bursting. Under this assumption, the residual can be treated as a quantification of biologically interesting variation, and can be used to identify relevant features for downstream analyses.
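To make the relationship between these statistics concrete, here is a minimal standalone sketch. It is not the library's implementation: the helper names `row_means_and_variances` and `trend_residuals` are hypothetical, the matrix is assumed to be a dense row-major array rather than a tatami matrix, the variance is assumed to be the usual sample variance with an `n - 1` denominator, and the residual is assumed to be the variance minus the fitted value. The fitted values are taken as given, since the trend fit itself is delegated to FitVarianceTrend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of scran): per-row mean and sample variance
// of a dense row-major matrix with `nrow` features and `ncol` cells.
inline void row_means_and_variances(const std::vector<double>& values,
                                    std::size_t nrow, std::size_t ncol,
                                    std::vector<double>& means,
                                    std::vector<double>& variances) {
    assert(values.size() == nrow * ncol);
    assert(ncol > 1); // the sample variance needs at least two cells
    means.assign(nrow, 0.0);
    variances.assign(nrow, 0.0);
    for (std::size_t r = 0; r < nrow; ++r) {
        const double* row = values.data() + r * ncol;
        double sum = 0;
        for (std::size_t c = 0; c < ncol; ++c) {
            sum += row[c];
        }
        const double mean = sum / static_cast<double>(ncol);
        double squares = 0;
        for (std::size_t c = 0; c < ncol; ++c) {
            const double delta = row[c] - mean;
            squares += delta * delta;
        }
        means[r] = mean;
        variances[r] = squares / static_cast<double>(ncol - 1);
    }
}

// Hypothetical helper: residual of each feature from the trend, taken here
// as the observed variance minus the fitted ("uninteresting") variance.
inline std::vector<double> trend_residuals(const std::vector<double>& variances,
                                           const std::vector<double>& fitted) {
    assert(variances.size() == fitted.size());
    std::vector<double> residuals(variances.size());
    for (std::size_t i = 0; i < variances.size(); ++i) {
        residuals[i] = variances[i] - fitted[i];
    }
    return residuals;
}
```

Under this reading, a feature with a large positive residual is more variable than other features of similar abundance, which is what makes the residual a candidate ranking statistic for feature selection.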
template<typename Value_ , typename Index_ , typename Block_ , typename Stat_ >
void scran::ModelGeneVariances::run_blocked (const tatami::Matrix< Value_, Index_ > *mat, const Block_ *block, std::vector< Stat_ * > means, std::vector< Stat_ * > variances, std::vector< Stat_ * > fitted, std::vector< Stat_ * > residuals, Stat_ *ave_means, Stat_ *ave_variances, Stat_ *ave_fitted, Stat_ *ave_residuals) const | inline
Compute and model the per-feature variances from a log-expression matrix with blocking. The mean and variance of each gene are computed separately for the cells in each block, and a separate trend is fitted to each block to obtain residuals. This ensures that sample and batch effects do not confound the variance estimates.
We also compute the average of each statistic across blocks, using the weighting strategy described in weight_block(). The average residual is particularly useful for feature selection with ChooseHVGs.
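The averaging step can be pictured as a per-feature weighted mean over the block-specific arrays. The sketch below is only an illustration under stated assumptions: `average_across_blocks` is a hypothetical helper, and the weights are supplied by the caller rather than derived from the policy in weight_block(), whose exact rules are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of scran): weighted average of a per-block
// statistic for each feature. `per_block[b]` points to an array of length
// `nfeatures` for block `b`, and `weights[b]` is that block's weight.
inline std::vector<double> average_across_blocks(
        const std::vector<const double*>& per_block,
        const std::vector<double>& weights,
        std::size_t nfeatures) {
    assert(per_block.size() == weights.size());
    double total = 0;
    for (double w : weights) {
        assert(w >= 0);
        total += w;
    }
    // Assumption of this sketch: at least one block carries positive weight.
    assert(total > 0);

    std::vector<double> averaged(nfeatures, 0.0);
    for (std::size_t b = 0; b < per_block.size(); ++b) {
        for (std::size_t f = 0; f < nfeatures; ++f) {
            averaged[f] += weights[b] * per_block[b][f];
        }
    }
    for (double& value : averaged) {
        value /= total;
    }
    return averaged;
}
```

Giving every block the same weight reduces this to a plain mean across blocks, whereas weights that grow with block size let larger blocks contribute more to the average.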
Template Parameters
Value_ | Data type of the matrix. |
Index_ | Integer type for the row/column indices. |
Block_ | Integer type to hold the block IDs. |
Stat_ | Floating-point type for the output statistics. |
Parameters
| mat | Pointer to a feature-by-cell tatami matrix containing log-expression values. |
[in] | block | Pointer to an array of length equal to the number of cells, containing a 0-based block ID for each cell - see tabulate_ids() for more details. This can also be a nullptr, in which case all cells are assumed to belong to the same block. |
[out] | means | Vector of length equal to the number of blocks, containing pointers to output arrays of length equal to the number of rows in mat. Each array stores the mean of each feature in the corresponding block of cells. |
[out] | variances | Vector of length equal to the number of blocks, containing pointers to output arrays of length equal to the number of rows in mat. Each array stores the variance of each feature in the corresponding block of cells. |
[out] | fitted | Vector of length equal to the number of blocks, containing pointers to output arrays of length equal to the number of rows in mat. Each array stores the fitted value of the trend for each feature in the corresponding block of cells. |
[out] | residuals | Vector of length equal to the number of blocks, containing pointers to output arrays of length equal to the number of rows in mat. Each array stores the residual from the trend for each feature in the corresponding block of cells. |
[out] | ave_means | Pointer to an array of length equal to the number of rows in mat, storing the average mean across blocks for each gene. If nullptr, the average calculation is skipped. |
[out] | ave_variances | Pointer to an array of length equal to the number of rows in mat, storing the average variance across blocks for each gene. If nullptr, the average calculation is skipped. |
[out] | ave_fitted | Pointer to an array of length equal to the number of rows in mat, storing the average fitted value across blocks for each gene. If nullptr, the average calculation is skipped. |
[out] | ave_residuals | Pointer to an array of length equal to the number of rows in mat, storing the average residual across blocks for each gene. If nullptr, the average calculation is skipped. |
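The output arguments expect one pre-allocated array per block, passed as a vector of raw pointers. A caller might prepare such buffers roughly as follows. This is a sketch only: `count_blocks` and `BlockBuffers` are hypothetical helpers that do not exist in the library, `double` and `int` stand in for `Stat_` and `Block_`, and the number of blocks is assumed to be one more than the largest 0-based block ID.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of scran): number of blocks implied by a
// set of 0-based block IDs, i.e., the largest ID plus one.
inline std::size_t count_blocks(const std::vector<int>& block) {
    std::size_t nblocks = 0;
    for (int b : block) {
        assert(b >= 0);
        const std::size_t candidate = static_cast<std::size_t>(b) + 1;
        if (candidate > nblocks) {
            nblocks = candidate;
        }
    }
    return nblocks;
}

// Hypothetical holder for one per-block statistic: owns one array of length
// `nrows` per block and exposes the raw pointers in a separate vector.
struct BlockBuffers {
    std::vector<std::vector<double> > storage;
    std::vector<double*> pointers;

    BlockBuffers(std::size_t nblocks, std::size_t nrows)
        : storage(nblocks, std::vector<double>(nrows)) {
        pointers.reserve(nblocks);
        for (auto& s : storage) {
            pointers.push_back(s.data());
        }
    }

    // Copying would leave `pointers` referring to the source object's
    // storage, so copies are disabled in this sketch.
    BlockBuffers(const BlockBuffers&) = delete;
    BlockBuffers& operator=(const BlockBuffers&) = delete;
};
```

With one `BlockBuffers` object each for the means, variances, fitted values and residuals, the `pointers` members would be the values passed as the `means`, `variances`, `fitted` and `residuals` arguments, and the results could be read back from `storage` once the call returns.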