This function computes p-values from score tests of genetic pathway risk association in 5 different models.
Arguments
- data
a data.frame of N rows, set up as the output from sim_SCR_data, with columns:
XR: time to recurrence / death / censoring
XD: time to death / censoring
DeltaR: indicator of censoring (0), recurrence (1), or death (2) for this earliest time XR
DeltaD: indicator of censoring (0) or death (1)
XPFS: time to recurrence / death / censoring (= XR)
DeltaPFS: indicator of censoring (0) or recurrence or death, whichever came first (1)
Z_1,...,Z_P: genomic variables
- ind_gene
column indices of the genes in the pathway of interest. Default is 7:ncol(data).
- num_perts
number of perturbations used. Default is 1000.
- Ws
optional user-supplied perturbations: a vector of length N x num_perts containing i.i.d. realizations of a random variable with mean 0 and variance 1.
- rho
a vector of rhos, such as one created from the range returned by findRhoInterval, used for tuning a non-linear kernel. Only used if kernel is not "linear". Default is NA. Currently not available for user-defined kernels.
- kernel
a character string indicating which kernel is used. Currently implemented values are "linear", "gaussian" or "poly". Alternatively, this can be a user-defined kernel function. See genericKernelEval.
- d
if kernel is "poly", the polynomial power. Default is 2 (quadratic kernel).
- pca_thres
a number between 0 and 1 giving the threshold to be used for PCA. Default is 0.9. If NULL, no PCA is performed.
- get_ptb_pvals
a logical flag indicating whether perturbed p-values should be returned as part of the results. Default is FALSE.
- ...
extra parameters to be passed to a user-defined kernel.
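As a minimal sketch of what a valid Ws could look like, the snippet below draws a vector of the documented length from a standard normal distribution, which has mean 0 and variance 1. The sizes N and num_perts are hypothetical values chosen for illustration, and the standard normal is only one distribution meeting the stated requirement; what the function draws by default when Ws is not supplied is not specified here.

```r
# Hypothetical sizes, for illustration only
N <- 400           # number of rows in data
num_perts <- 1000  # number of perturbations

set.seed(123)      # make the draws reproducible

# Standard normal draws: mean 0, variance 1 (one possible choice, an assumption)
Ws <- rnorm(N * num_perts)

length(Ws)  # should equal N * num_perts
```

Such a vector would then be passed through the Ws argument together with the matching num_perts.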
Value
either a vector of p-values for 5 different models with names:
"SCR": Semi-Competing Risks
"PFS": Progression Free Survival
"CR": Competing Risks
"OS": Overall Survival
"SCR_alt": SCR allowing different tuning parameters for the two event time processes
or else, if get_ptb_pvals is TRUE, a list with 2 elements:
"obs_pvals": a vector containing the observed p-values for each of the 5 models as described above
"null_pvals_perts": a matrix of dimensions num_perts x 5 containing the corresponding perturbed p-values
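To make the shape of the list returned when get_ptb_pvals is TRUE concrete, the sketch below builds a mock object with the documented element names and dimensions and shows how its parts could be accessed. The object res and all of its numbers are random placeholders created for illustration; they are not output of compute_all_tests.

```r
# Mock object mimicking the documented return structure (get_ptb_pvals = TRUE).
# All values are random placeholders, NOT results of compute_all_tests.
model_names <- c("SCR", "PFS", "CR", "OS", "SCR_alt")
num_perts <- 1000
set.seed(1)

res <- list(
  obs_pvals = setNames(runif(5), model_names),
  null_pvals_perts = matrix(runif(num_perts * 5), nrow = num_perts, ncol = 5)
)

# Observed p-value for one model, accessed by name
res$obs_pvals["SCR"]

# One row per perturbation, one column per model
dim(res$null_pvals_perts)
```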
References
Neykov M, Hejblum BP, Sinnott JA, Kernel Machine Score Test for Pathway Analysis in the Presence of Semi-Competing Risks, submitted, 2016.
Examples
## First generate some Data
feat_m_fun <- function(X){
  sin(X[,1]+X[,2]^2)-1
}
feat_d_fun <- function(X){
  (X[,4]-X[,5])^2/8
}
mydata <- sim_SCR_data(data_size = 400, ncol_gene_mat = 20, feat_m = feat_m_fun,
                       feat_d = feat_d_fun, mu_cen = 40, cov=0.5)
#initial range
ind_gene <- c(7:ncol(mydata))
my_rho_init <- seq(0.01, 20, length=300)*length(ind_gene)
range(my_rho_init)
#> [1] 0.2 400.0
if(interactive()){
  # compute the interval for rho
  rho_set <- findRhoInterval(tZ=t(mydata[,ind_gene]), rho_init = my_rho_init,
                             kernel="gaussian")
  rho_set
  # good to check that the interval produced here is strictly contained in rho_init;
  # otherwise, expand rho_init and rerun
  range(my_rho_init)
  rhos <- exp(seq(log(rho_set[1]),log(rho_set[2]), length=50))
  # run the tests with Gaussian kernel
  compute_all_tests(data = mydata, num_perts=1000, rho=rhos, kernel="gaussian")
  # run the tests with linear kernel
  compute_all_tests(data=mydata, num_perts=1000, kernel="linear")
}
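The containment check mentioned in the comments of the example can be sketched in base R as follows. The values in rho_set are hypothetical and stand in for the output of findRhoInterval, and the factor 20 assumes the 20 genes simulated above; the name strictly_inside is introduced here for illustration only.

```r
# Candidate grid, built as in the example (20 genes assumed)
my_rho_init <- seq(0.01, 20, length.out = 300) * 20

# Hypothetical interval standing in for the output of findRhoInterval
rho_set <- c(1.5, 120)

# TRUE when the interval lies strictly inside the range of the initial grid
strictly_inside <- rho_set[1] > min(my_rho_init) && rho_set[2] < max(my_rho_init)
strictly_inside

if (!strictly_inside) {
  message("Interval reaches the grid boundary: expand rho_init and rerun")
}

# Log-spaced grid of 50 rhos spanning the interval
rhos <- exp(seq(log(rho_set[1]), log(rho_set[2]), length.out = 50))
range(rhos)
```

Spacing the grid on the log scale places more candidate values near the lower end of the interval than a linear grid would.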