Stability Selection for MDTL-Integrated Cox Elastic-Net Models
Performs stability selection for the Mahalanobis-distance–based transfer-learning
Cox elastic-net model (cox_MDTL_enet) by repeatedly refitting the model on
bootstrap or subsampled datasets and aggregating variable selection frequencies
across replicates. This procedure yields a more robust measure of variable
importance that is less sensitive to a single data split.
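The resampling-and-aggregation idea described above can be sketched in base R. This is only an illustration, not the package's implementation: `select_top_k` is a hypothetical stand-in selector (it keeps the `k` covariates most correlated with a continuous outcome), whereas the actual function refits cox_MDTL_enet on each replicate. The names `stability_sketch`, `k`, and the simulated data are likewise assumptions made for the example.

```r
# Hypothetical stand-in selector: flag the k covariates with the largest
# absolute correlation with y. The real procedure refits cox_MDTL_enet instead.
select_top_k <- function(z, y, k) {
  score <- abs(drop(cor(z, y)))
  selected <- rep(FALSE, ncol(z))
  selected[order(score, decreasing = TRUE)[seq_len(k)]] <- TRUE
  selected
}

# Draw B subsamples without replacement, run the selector on each,
# and return the per-variable selection frequency.
stability_sketch <- function(z, y, B = 50, fraction_sample = 0.5,
                             k = 3, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  n <- nrow(z)
  m <- floor(fraction_sample * n)   # subsample size per replicate
  counts <- numeric(ncol(z))
  for (b in seq_len(B)) {
    idx <- sample.int(n, size = m, replace = FALSE)
    counts <- counts + select_top_k(z[idx, , drop = FALSE], y[idx], k)
  }
  counts / B
}

# Simulated data: only the first two covariates carry signal.
set.seed(1)
n <- 200
p <- 10
z <- matrix(rnorm(n * p), n, p)
y <- 2 * z[, 1] - 2 * z[, 2] + rnorm(n)

freq <- stability_sketch(z, y, B = 50, fraction_sample = 0.5, k = 3, seed = 42)
round(freq, 2)
```

Variables with true signal should be selected in nearly every replicate, while noise variables are selected only occasionally; that contrast is what the selection frequencies are meant to expose.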
Usage
cox_MDTL_enet.StabSelect(
  z,
  delta,
  time,
  stratum = NULL,
  beta,
  vcov = NULL,
  etas = NULL,
  alpha = 1,
  lambda = NULL,
  nlambda = 100,
  lambda.min.ratio = 0.1,
  nfolds = 5,
  cv.criteria = c("V&VH", "LinPred", "CIndex_pooled", "CIndex_foldaverage"),
  c_index_stratum = NULL,
  message = FALSE,
  seed = NULL,
  B = 50,
  fraction_sample = 0.5,
  ncores = 1,
  ...
)
Arguments
- z
A numeric matrix or data frame of covariates (n x p).
- delta
A numeric vector of event indicators (1 = event, 0 = censored).
- time
A numeric vector of observed times.
- stratum
Optional numeric or factor vector indicating strata. If NULL, all subjects are assumed to be in the same stratum.
- beta
A numeric vector of external coefficients (length p).
- vcov
Optional numeric matrix (p x p) representing the weighting matrix \(Q\) for the Mahalanobis penalty. Typically the inverse covariance matrix. If NULL, defaults to the identity matrix.
- etas
A numeric vector of candidate eta values to be evaluated.
- alpha
The elastic-net mixing parameter, with \(0 \le \alpha \le 1\). alpha = 1 is the Lasso penalty, and alpha = 0 is the Ridge penalty. If NULL, defaults to 1 (Lasso).
- lambda
Optional user-supplied lambda sequence. If NULL (the typical usage), the program computes its own lambda sequence based on nlambda and lambda.min.ratio.
- nlambda
The number of lambda values. Default is 100.
- lambda.min.ratio
Smallest value for lambda, as a fraction of lambda.max. Default is 0.1.
- nfolds
Integer. Number of cross-validation folds. Default is 5.
- cv.criteria
Character string specifying the cross-validation criterion. Choices are:
  - "V&VH" (default): Verweij & Van Houwelingen partial likelihood loss.
  - "LinPred": Loss based on the prognostic performance of the linear predictor.
  - "CIndex_pooled": Harrell's C-index computed by pooling predictions across folds.
  - "CIndex_foldaverage": Harrell's C-index computed within each fold and averaged.
- c_index_stratum
Optional stratum vector. Required only when cv.criteria involves stratified C-index calculation but the model itself is unstratified.
- message
Logical. If TRUE, progress messages are printed. Default is FALSE.
- seed
Optional integer. Random seed for reproducible fold assignment.
- B
Integer. Number of bootstrap/subsampling replicates used for stability selection. Default is 50.
- fraction_sample
Numeric in (0, 1]. Fraction of the original sample size used in each replicate. Default is 0.5.
- ncores
Integer. Number of parallel cores. Default is 1 (sequential execution).
- ...
Additional arguments passed to the underlying fitting function.
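To make the roles of vcov, etas, nlambda, and lambda.min.ratio more concrete, here is a minimal sketch in base R. It rests on two assumptions that are not confirmed by this page: that the Mahalanobis penalty has the quadratic form eta * (b - beta)' Q (b - beta), possibly up to a scaling constant, and that the automatic lambda sequence is log-spaced from lambda.max down to lambda.max * lambda.min.ratio, as in glmnet. The helper names `mahalanobis_penalty` and `lambda_grid` are hypothetical.

```r
# Assumed penalty form: eta * (b - beta_ext)' Q (b - beta_ext).
# The package may use a different scaling constant.
mahalanobis_penalty <- function(b, beta_ext, eta, Q = NULL) {
  if (is.null(Q)) Q <- diag(length(beta_ext))  # identity, as when vcov = NULL
  d <- b - beta_ext
  eta * drop(crossprod(d, Q %*% d))
}

# Assumed glmnet-style grid: nlambda values, log-spaced and decreasing,
# from lambda_max to lambda_max * lambda.min.ratio.
lambda_grid <- function(lambda_max, nlambda = 100, lambda.min.ratio = 0.1) {
  exp(seq(log(lambda_max), log(lambda_max * lambda.min.ratio),
          length.out = nlambda))
}

beta_ext <- c(0.5, -0.2, 0)   # external coefficients (made-up values)
b <- c(0.4, 0, 0.1)           # candidate coefficients (made-up values)

pen <- mahalanobis_penalty(b, beta_ext, eta = 2)
grid <- lambda_grid(lambda_max = 1, nlambda = 5, lambda.min.ratio = 0.1)
```

Under this form, larger eta values pull the fitted coefficients more strongly toward the external beta, and eta = 0 removes the external information altogether.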
Value
An object of class "StabSelect" containing:
- stability_path: a numeric matrix storing selection probabilities for each variable–lambda pair across the B replicates.
- lambda: the global lambda sequence used for the underlying elastic-net fits.
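One common way to turn such a matrix into a set of stable variables is to take, for each variable, the maximum selection probability over the lambda sequence and compare it with a threshold. The sketch below uses a small made-up matrix and assumes rows index variables and columns index lambda values; the orientation of the actual stability_path component and the rule used by the plot method are not specified here, so treat both as assumptions.

```r
# Made-up selection probabilities for illustration only.
# Assumed orientation: rows = variables, columns = lambda values.
stability_path <- matrix(
  c(0.95, 0.90, 0.80,
    0.10, 0.30, 0.60,
    0.70, 0.85, 0.90,
    0.05, 0.05, 0.20),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("x", 1:4), paste0("lambda", 1:3))
)

# Maximum selection probability per variable across the lambda sequence.
max_prob <- apply(stability_path, 1, max)

# Keep variables whose maximum probability reaches the threshold.
threshold <- 0.75
stable_vars <- names(max_prob)[max_prob >= threshold]
stable_vars
```

Raising the threshold gives a smaller, more conservative set of variables; lowering it admits variables that are selected less consistently.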
Examples
if (FALSE) { # \dontrun{
  data(ExampleData_highdim)
  train_dat_highdim <- ExampleData_highdim$train
  beta_external_highdim <- ExampleData_highdim$beta_external
  eta_list <- generate_eta(method = "exponential", n = 10, max_eta = 10)

  mdtl.StabSelect <- cox_MDTL_enet.StabSelect(
    z = train_dat_highdim$z,
    delta = train_dat_highdim$status,
    time = train_dat_highdim$time,
    stratum = train_dat_highdim$stratum,
    beta = beta_external_highdim,
    vcov = NULL,
    etas = eta_list,
    cv.criteria = "CIndex_pooled",
    B = 20,
    message = TRUE
  )

  # Visualize selection with a chosen threshold
  plot(mdtl.StabSelect, threshold = 0.75)
} # }