This function computes p-values from score tests of genetic pathway risk association in 5 different models.
Arguments
- data
a data.frame of N rows, set up as the output from sim_SCR_data, with columns:
  - XR: time to recurrence / death / censoring
  - XD: time to death / censoring
  - DeltaR: indicator of censoring (0), recurrence (1), or death (2) for this earliest time XR
  - DeltaD: indicator of censoring (0) or death (1)
  - XPFS: time to recurrence / death / censoring (=XR)
  - DeltaPFS: indicator of censoring (0) or recurrence or death, whichever came first (1)
  - Z_1,...,Z_P: genomic variables
- ind_gene
column indices of the genes in the pathway of interest. Default is 7:ncol(data).
- num_perts
number of perturbations used. Default is 1000.
- Ws
optional user-supplied perturbations; should be a vector of length N x num_perts containing i.i.d. realizations of a random variable with mean 0 and variance 1.
- rho
a vector of rhos, such as one created from the range returned by findRhoInterval, used for tuning a non-linear kernel. Only used if kernel is not "linear". Default is NA. Currently not available for user-defined kernels.
- kernel
a character string indicating which kernel is used. Possible values (currently implemented) are "linear", "gaussian" or "poly". Otherwise, this can also be a user-defined kernel function. See genericKernelEval.
- d
if kernel is "poly", the polynomial power. Default is 2 (quadratic kernel).
- pca_thres
a number between 0 and 1 giving the threshold to be used for PCA. Default is 0.9. If NULL, no PCA is performed.
- get_ptb_pvals
a logical flag indicating whether perturbed p-values should be returned as part of the results. Default is FALSE.
- ...
extra parameters to be passed to a user-defined kernel.
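As an illustration of the inputs described above, the following base-R sketch builds a toy data.frame with the documented column layout and a perturbation vector of the documented length. All values are invented for display, the object names (toy, Z) are hypothetical, and standard normal draws are only one possible choice of a mean-0, variance-1 distribution for Ws; in practice the data would come from sim_SCR_data or from a real study.

```r
# Toy illustration of the documented input layout; every value is made up.
set.seed(1)
N <- 6          # number of subjects (rows)
P <- 3          # number of genomic variables
num_perts <- 4  # kept tiny here; the documented default is 1000

XR     <- c(2.1, 5.0, 3.3, 7.2, 1.4, 6.0)  # time to recurrence / death / censoring
DeltaR <- c(1L, 0L, 2L, 1L, 0L, 2L)        # 0 = censored, 1 = recurrence, 2 = death
XD     <- c(4.0, 5.0, 3.3, 9.1, 1.4, 6.0)  # time to death / censoring
DeltaD <- c(1L, 0L, 1L, 0L, 0L, 1L)        # 0 = censored, 1 = death

toy <- data.frame(
  XR = XR, XD = XD, DeltaR = DeltaR, DeltaD = DeltaD,
  XPFS = XR,                         # same time as XR
  DeltaPFS = as.integer(DeltaR > 0)  # 1 if recurrence or death came first
)

# Genomic variables Z_1, ..., Z_P appended after the six outcome columns
Z <- matrix(rnorm(N * P), nrow = N, dimnames = list(NULL, paste0("Z_", 1:P)))
toy <- cbind(toy, Z)

# With this layout the genes start at column 7, matching the default ind_gene
ind_gene <- 7:ncol(toy)

# Perturbations: a vector of length N x num_perts with mean 0 and variance 1
# (assumption: standard normal draws)
Ws <- rnorm(N * num_perts)
```

With six outcome columns placed first, the default ind_gene of 7:ncol(data) picks up exactly the Z_ columns; a different column order would require passing ind_gene explicitly.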
Value
either a vector of p-values for 5 different models with names:
"SCR": Semi-Competing Risks
"PFS": Progression Free Survival
"CR": Competing Risks
"OS": Overall Survival
"SCR_alt": SCR allowing different tuning parameters for the two event time processes
or else if get_ptb_pvals is TRUE, a list with 2 elements:
"obs_pvals": a vector containing the observed p-values for each of the 5 models as described above
"null_pvals_perts": a matrix of dimensions num_perts x 5 containing the corresponding perturbed p-values
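Because the return type depends on get_ptb_pvals, downstream code may need to handle both shapes. The sketch below uses a placeholder object (res) that only mimics the documented structure; its numbers are random and carry no meaning, and whether the columns of null_pvals_perts are named is not stated here, so the sketch does not rely on column names.

```r
# Placeholder mimicking the documented return value when get_ptb_pvals = TRUE.
# The p-values below are random numbers, not results of any analysis.
set.seed(2)
model_names <- c("SCR", "PFS", "CR", "OS", "SCR_alt")
num_perts <- 10
res <- list(
  obs_pvals = setNames(runif(5), model_names),
  null_pvals_perts = matrix(runif(num_perts * 5), nrow = num_perts, ncol = 5)
)

# Works for both return shapes: a bare named vector or the 2-element list
pvals <- if (is.list(res)) res$obs_pvals else res

pvals["SCR"]                # observed p-value for the semi-competing risks model
dim(res$null_pvals_perts)   # num_perts rows, one column per model
```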
References
Neykov M, Hejblum BP, Sinnott JA, Kernel Machine Score Test for Pathway Analysis in the Presence of Semi-Competing Risks, submitted, 2016.
Examples
## First generate some data
feat_m_fun <- function(X){
  sin(X[,1]+X[,2]^2)-1
}
feat_d_fun <- function(X){
  (X[,4]-X[,5])^2/8
}
mydata <- sim_SCR_data(data_size = 400, ncol_gene_mat = 20, feat_m = feat_m_fun,
                       feat_d = feat_d_fun, mu_cen = 40, cov=0.5)
#initial range
ind_gene <- c(7:ncol(mydata))
my_rho_init <- seq(0.01, 20, length=300)*length(ind_gene)
range(my_rho_init)
#> [1] 0.2 400.0
if(interactive()){
  # compute the interval for rho
  rho_set <- findRhoInterval(tZ=t(mydata[,ind_gene]), rho_init = my_rho_init, kernel="gaussian")
  rho_set
  # good to check that the interval produced here is strictly contained in
  # the range of rho_init; otherwise, expand rho_init and rerun
  range(my_rho_init)
  # log-spaced grid of rhos over the interval found above
  rhos <- exp(seq(log(rho_set[1]),log(rho_set[2]), length=50))
  # run the tests with Gaussian kernel
  compute_all_tests(data = mydata, num_perts=1000, rho=rhos, kernel="gaussian")
  # run the tests with linear kernel
  compute_all_tests(data=mydata, num_perts=1000, kernel="linear")
}