Association testing using Han & Lahiri estimating equations and jackknife approach
Source:R/test_han2018.R
test_han2018.Rd
Association testing using Han & Lahiri estimating equations and jackknife approach
Arguments
- match_prob
matching probabilities matrix (e.g. obtained through
recordLink
) of dimensionsn1 x n2
.- y
response variable of length
n1
. Only binary or gaussian phenotypes are supported at the moment.- x
a
matrix
or adata.frame
of predictors of dimensionsn2 x p
. An intercept is automatically added within the function.- covar_y
a
matrix
or adata.frame
of predictors of dimensionsn1 x q1
. An intercept is automatically added within the function.- covar_x
a
matrix
or adata.frame
of predictors of dimensionsn2 x q2
. An intercept is automatically added within the function.- jackknife_nrep
the number of jackknife repetitions. Default is 100 (from Han et al.).
- jackknife_blocksize
the number of observations to remove in each jackknife.
- methods
a character vector which must be a subset of
("F", "M", "M2")
indicating which estimator from Han et al. 2018 should be computed. Default is all 3.- dist_family
a character string indicating the distribution family for the glm. Currently, only
'gaussian'
and'binomial'
are supported. Default is'gaussian'
.
Value
a list containing the following for each estimator in methods
:
beta
a vector containing thep
estimated coefficientsvarcov
thep x p
variance-covariancematrix
of thebeta
coefficientszscores
a vector containing thep
Z-scorespval
the corresponding Gaussian assumption p-values
References
Han, Y., and Lahiri, P. (2019) Statistical Analysis with Linked Data. International Statistical Review, 87: S139– S157. doi:10.1111/insr.12295 .
Examples
# rm(list=ls())
# n_sims <- 500
# res <- pbapply::pblapply(1:n_sims, function(n){
# nx <- 99
# ny <- 103
# x <- matrix(ncol=2, nrow=ny, stats::rnorm(n=ny*2))
#
# #plot(density(rbeta(n=1000, 1,2)))
# match_prob <- diag(ny)[, 1:nx]#matrix(rbeta(n=ny*nx, 1, 2), nrow=ny, ncol=99)
#
# covar_y <- matrix(rnorm(n=ny, 1, 0.5), ncol=1)
# covar_x <- matrix(ncol=3, nrow=ny, stats::rnorm(n=ny*3))
#
# #y <- rnorm(n=ny, mean = x %*% c(2,-3) + covar_x %*% rep(0.2, ncol(covar_x)) + 0.5*covar_y, 0.5)
# y <- rbinom(n=ny, 1, prob=expit(x %*% c(2,-3) + covar_x %*%
# rep(0.2, ncol(covar_x)) + 0.5*covar_y))
# #glm(y~0+x+covar_y+covar_x, family = "binomial")
# return(
# #test_han2018(match_prob, y, x, jackknife_blocksize = 10, covar_x = NULL, covar_y = NULL)
# test_han2018(match_prob, y[1:ny], x[1:nx, ], dist_family = "binomial",
# jackknife_blocksize = 10, covar_x = covar_x[1:nx, ],
# covar_y = covar_y[1:ny, , drop=FALSE])
# )
# }, cl=parallel::detectCores()-1)
# pvals_F <- sapply(lapply(res, "[[", "F"), "[[", "beta")
# pvals_M <- sapply(lapply(res, "[[", "M"), "[[", "beta")
# pvals_M2 <- sapply(lapply(res, "[[", "M2"), "[[", "beta")
# quantile(pvals_F)
# quantile(pvals_M)
# quantile(pvals_M2)
# rowMeans(pvals_F<0.05)
# rowMeans(pvals_M<0.05)
# rowMeans(pvals_M2<0.05)