Computes precision weights that account for heteroscedasticity in RNA-seq count data based on non-parametric local linear regression estimates.
Arguments
- y
a numeric matrix of size G x n containing the raw RNA-seq counts or preprocessed expression from n samples for G genes.
- x
a numeric matrix of size n x p containing the model covariate(s) from n samples (design matrix).
- phi
a numeric design matrix of size n x K containing the K variable(s) of interest (e.g. bases of time).
- use_phi
a logical flag indicating whether conditional means should be conditioned on phi and on covariate(s) x, or on x alone. Default is TRUE, in which case conditional means are estimated conditionally on both x and phi.
- preprocessed
a logical flag indicating whether the expression data have already been preprocessed (e.g. log2 transformed). Default is FALSE, in which case y is assumed to contain raw counts and is normalized into log-counts per million.
- gene_based
a logical flag indicating whether to estimate weights at the gene level. Default is FALSE, in which case weights are estimated at the observation level.
- bw
a character string indicating the smoothing bandwidth selection method to use. See bandwidth for details. Possible values are 'ucv', 'SJ', 'bcv', 'nrd' or 'nrd0'. Default is 'nrd'.
- kernel
a character string indicating which kernel should be used. Possibilities are 'gaussian', 'epanechnikov', 'rectangular', 'triangular', 'biweight', 'tricube', 'cosine', 'optcosine'. Default is 'gaussian' (NB: the 'tricube' kernel corresponds to the loess method).
- transform
a logical flag indicating whether values should be transformed to uniform for the purpose of local linear smoothing. This may be helpful if tail observations are sparse and the specified bandwidth gives suboptimal performance there. Default is TRUE.
- verbose
a logical flag indicating whether informative messages are printed during the computation. Default is TRUE.
- na.rm
a logical flag indicating whether missing values (including NA and NaN) should be omitted from the calculations. Default is FALSE.
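The values accepted by bw share their names with the bandwidth selectors in base R's stats package (bw.ucv, bw.SJ, bw.bcv, bw.nrd, bw.nrd0). The sketch below compares them on simulated data. It is an illustration only, not the function's internal code: the simulated counts, the log2-cpm offsets (0.5 and 1), and the choice to select the bandwidth on gene-level means are all assumptions made for this example.

```r
# Illustration only: simulated counts and an ASSUMED log2-cpm convention.
set.seed(42)
G <- 300  # number of genes
n <- 10   # number of samples
y <- matrix(rnbinom(G * n, mu = 100, size = 2), nrow = G, ncol = n)

# Assumed normalization: log2 counts per million with small offsets.
# t(y) is n x G, so dividing by the length-n library sizes scales each sample.
lib_size <- colSums(y)
y_lcpm <- log2(t((t(y) + 0.5) / (lib_size + 1)) * 1e6)

# Gene-level mean expression, used here as the smoothing variable (assumption).
mu <- rowMeans(y_lcpm)

# Base R selectors matching the documented values of 'bw'.
selectors <- list(ucv = bw.ucv, SJ = bw.SJ, bcv = bw.bcv,
                  nrd = bw.nrd, nrd0 = bw.nrd0)
bandwidths <- vapply(selectors,
                     function(f) suppressWarnings(f(mu)),
                     numeric(1))
print(round(bandwidths, 4))
```

Comparing the selectors side by side on a given dataset can help decide whether the default 'nrd' is smoothing too much or too little for that mean distribution.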
Value
a list containing the following components:
- weights: a matrix n x G containing the computed precision weights
- plot_utilities: a list containing the following elements:
  - reverse_trans: a function encoding the reverse function used for smoothing the observations before computing the weights
  - method: the weight computation method ("loclin")
  - smth: the vector of the smoothed values computed
  - gene_based: a logical indicating whether the computed weights are based on average at the gene level or on individual observations
  - mu: the transformed observed counts or averages
  - v: the observed variability estimates
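To make the relationship between mu, v, smth and weights concrete, here is a minimal base R sketch of a gene-level mean-variance fit by local linear regression. It is a sketch under stated assumptions, not the function's implementation: it assumes a Gaussian kernel, an 'nrd' bandwidth, no uniform transformation, and that a precision weight is the inverse of the smoothed variance. The helper loclin and the simulated data are hypothetical.

```r
# Illustration only: gene-level local linear smoothing of variance on mean.
set.seed(1)
G <- 200  # number of genes
n <- 12   # number of samples

# Simulated expression (G x n) whose spread decreases with the mean.
gene_mean <- runif(G, 2, 10)
gene_sd <- 0.3 + 2 / gene_mean
expr <- matrix(rnorm(G * n, mean = gene_mean, sd = gene_sd),
               nrow = G, ncol = n)

mu <- rowMeans(expr)       # gene-level averages
v <- apply(expr, 1, var)   # observed variability estimates
h <- bw.nrd(mu)            # bandwidth, as with bw = 'nrd'

# Hypothetical helper: local linear estimate at x0 with a Gaussian kernel.
# The intercept of the kernel-weighted linear fit centred at x0 is the estimate.
loclin <- function(x0, x, y, h) {
  w <- dnorm((x - x0) / h)
  fit <- lm.wfit(cbind(1, x - x0), y, w)
  unname(fit$coefficients[1])
}

smth <- vapply(mu, function(m) loclin(m, mu, v, h), numeric(1))

# Assumed definition: precision weight = 1 / smoothed variance.
# Gene-level weights are repeated across samples to give an n x G matrix.
weights <- matrix(1 / smth, nrow = n, ncol = G, byrow = TRUE)
```

Plotting v against mu and overlaying smth (ordered by mu) gives a quick visual check that the smoothed curve tracks the mean-variance trend.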
See also
bandwidth, density