Computes marginal average treatment effects of a binary point treatment on multi-dimensional outcomes, adjusting for baseline covariates, using Targeted Minimum Loss-Based Estimation (TMLE). A data-mining algorithm selects biomarkers before multiple testing to increase power.
adaptest(Y, A, W = NULL, n_top, n_fold, parameter_wrapper = rank_DE, learning_library = c("SL.glm", "SL.step", "SL.glm.interaction", "SL.gam", "SL.earth"), absolute = FALSE, negative = FALSE, p_cutoff = 0.05, q_cutoff = 0.05)
Argument | Description |
---|---|
Y | (numeric vector) - A |
A | (numeric vector) - binary treatment indicator: |
W | (numeric vector, numeric matrix, or numeric data.frame) - matrix of baseline covariates where each column corresponds to one baseline covariate and each row corresponds to one observation. |
n_top | (integer vector) - number of candidate covariates to generate using the data-adaptive estimation algorithm. |
n_fold | (integer vector) - number of cross-validation folds. |
parameter_wrapper | (function) - user-defined function that takes input (Y, A, W, absolute, negative) and outputs an integer vector containing the ranks of the biomarkers (outcome variables). For details, please refer to the documentation for |
learning_library | (character vector) - library of learning algorithms to be used in fitting the "Q" and "g" steps of the standard TMLE procedure. |
absolute | (logical) - whether or not to test for absolute effect size. If |
negative | (logical) - whether or not to test for negative effect size. If |
p_cutoff | (numeric) - p-value cutoff (default 0.05); p-values at or below it are considered significant. Used in the inference stage. |
q_cutoff | (numeric) - q-value cutoff (default 0.05); q-values at or below it are considered significant. Used in the multiple testing stage. |
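The parameter_wrapper contract described in the table (inputs Y, A, W, absolute, negative; output an integer vector of biomarker ranks) can be illustrated with a minimal sketch. The function name rank_mean_diff is hypothetical and is not part of the package. The sketch assumes that A is coded 0/1, that rank 1 marks the strongest effect, and that absolute and negative select the magnitude, the most negative, or the most positive effects; none of these details are confirmed by the text above, and whether adaptest accepts such a function unchanged is also an assumption. Unlike a TMLE-based ranking, it ignores W and uses a plain difference in means.

```r
# Hypothetical wrapper (not part of the package): ranks biomarkers by an
# unadjusted difference in means between treated and control units.
# Assumptions: A is coded 0/1; rank 1 = strongest effect; W is ignored.
rank_mean_diff <- function(Y, A, W, absolute = FALSE, negative = FALSE) {
  Y <- as.matrix(Y)
  # Per-biomarker (column) difference in means: treated minus control
  effect <- colMeans(Y[A == 1, , drop = FALSE]) -
    colMeans(Y[A == 0, , drop = FALSE])
  # Choose what "strong" means (assumed semantics of the two flags)
  score <- if (absolute) abs(effect) else if (negative) -effect else effect
  # Largest score receives rank 1
  as.integer(rank(-score, ties.method = "first"))
}

# Toy data: 20 observations, 4 biomarkers, with shifts planted in columns 2 and 3
set.seed(1)
A <- rep(c(0, 1), each = 10)
Y <- matrix(rnorm(20 * 4), nrow = 20)
Y[A == 1, 2] <- Y[A == 1, 2] + 5   # strong positive effect
Y[A == 1, 3] <- Y[A == 1, 3] - 8   # stronger negative effect

ranks_pos <- rank_mean_diff(Y, A, W = NULL)
ranks_abs <- rank_mean_diff(Y, A, W = NULL, absolute = TRUE)
ranks_neg <- rank_mean_diff(Y, A, W = NULL, negative = TRUE)
```

On this toy data, column 2 ranks first for positive effects and column 3 ranks first for absolute and negative effects.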
S4 object of class data_adapt, sub-classed from the container class SummarizedExperiment, with the following additional slots containing the data-mining selected biomarkers, their TMLE-based differential expression and inference, and the original call to this function (for user reference).
Slot | Description |
---|---|
top_index | (integer vector) - indices of the data-mining selected biomarkers |
top_colname | (character vector) - names of the data-mining selected biomarkers |
top_colname_significant_q | (character vector) - names of the data-mining selected biomarkers that are significant after the multiple testing stage |
DE | (numeric vector) - differential expression effect sizes for the biomarkers in top_colname |
p_value | (numeric vector) - p-values for the biomarkers in top_colname |
q_value | (numeric vector) - q-values for the biomarkers in top_colname |
significant_q | (integer vector) - indices of the entries of top_colname that are significant after the multiple testing stage |
mean_rank_top | (numeric vector) - average rank across cross-validation folds for the biomarkers in top_colname |
folds | (origami::folds class) - cross-validation object |
    library(adaptest)  # load the package so adaptest() can be called unqualified
    set.seed(1234)
    data(simpleArray)
    simulated_array <- simulated_array
    simulated_treatment <- simulated_treatment

    adaptest(Y = simulated_array,
             A = simulated_treatment,
             W = NULL,
             n_top = 5,
             n_fold = 3,
             learning_library = 'SL.glm',
             parameter_wrapper = adaptest::rank_DE,
             absolute = FALSE,
             negative = FALSE)
    #>
    #> [1] "The top covariates are"
    #>           3         5        10         4         6       840       128
    #> 1 0.3333333 0.3333333 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000
    #> 2 0.0000000 0.0000000 0.0000000 0.3333333 0.3333333 0.3333333 0.0000000
    #> 3 0.0000000 0.0000000 0.3333333 0.0000000 0.0000000 0.0000000 0.3333333
    #> 4 0.3333333 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
    #> 5 0.3333333 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
    #>         567       505         1       519
    #> 1 0.0000000 0.0000000 0.0000000 0.0000000
    #> 2 0.0000000 0.0000000 0.0000000 0.0000000
    #> 3 0.3333333 0.0000000 0.0000000 0.0000000
    #> 4 0.0000000 0.3333333 0.0000000 0.0000000
    #> 5 0.0000000 0.0000000 0.3333333 0.3333333
    #> [1] "The ATE estiamtes are"
    #> [1]  0.51253445  0.03880535 -0.04632754  0.60661133  0.51471525
    #> [1] "The raw p-values are"
    #> [1] 0.019986718 0.819997461 0.821458685 0.001454666 0.002002301
    #> [1] "The adjusted p-values are"
    #> [1] 0.033311197 0.821458685 0.821458685 0.005005752 0.005005752
    #> [1] "The top mean CV-rank are (the smaller the better)"
    #> [1]   3.333333   4.333333   5.000000   5.666667  12.000000  15.666667
    #> [7]  43.333333  64.333333 167.000000 194.000000 281.000000
    #> [1] "The percentage of appearing in top 5 are (the larger the better)"
    #> [1] 100.00000  66.66667  66.66667  33.33333  33.33333  33.33333  33.33333
    #> [8]  33.33333  33.33333  33.33333  33.33333
    #> [1] "The covariates still significant are"
    #> [1] 1 4 5
    #> [1] "Their compositions are"
    #>           3         5        10       505         1       519
    #> 1 0.3333333 0.3333333 0.3333333 0.0000000 0.0000000 0.0000000
    #> 4 0.3333333 0.3333333 0.0000000 0.3333333 0.0000000 0.0000000
    #> 5 0.3333333 0.0000000 0.0000000 0.0000000 0.3333333 0.3333333
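The multiple testing stage in the output above can be checked by hand. The sketch below uses only base R and the raw p-values printed in the example. Applying a Benjamini-Hochberg adjustment with p.adjust gives values that agree with the printed adjusted p-values, and comparing them against q_cutoff = 0.05 recovers the same three indices. This shows the printed numbers are consistent with a Benjamini-Hochberg correction; it does not by itself confirm which routine adaptest calls internally.

```r
# Raw p-values copied from the example output above
raw_p <- c(0.019986718, 0.819997461, 0.821458685, 0.001454666, 0.002002301)

# Benjamini-Hochberg adjustment (assumed here; consistent with the printed values)
adj_p <- p.adjust(raw_p, method = "BH")

# Biomarkers whose adjusted p-value is at or below the q-value cutoff
q_cutoff <- 0.05
still_significant <- which(adj_p <= q_cutoff)

print(round(adj_p, 9))
print(still_significant)
```

The indices returned are positions within the vector of top-ranked biomarkers, not column numbers of the original outcome matrix.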