Cross-Validation for Cox MDTL with Ridge Regularization
Performs k-fold cross-validation to simultaneously tune the hyperparameter eta
(transfer learning weight) and the regularization parameter lambda for the
Cox MDTL model with a Ridge penalty (L2-norm).
This function evaluates the model performance across a grid of eta and lambda
values. It is efficient for high-dimensional data where an Elastic Net penalty is not required,
focusing purely on Ridge regression to handle multicollinearity and overfitting.
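The roles of eta and lambda can be sketched with a small helper. This is only an illustration under stated assumptions: the quadratic form `(eta / 2) * (b - beta_ext)' Q (b - beta_ext)` for the transfer term, the `lambda / 2` scaling of the Ridge term, and the helper name `mdtl_ridge_penalty` are not taken from the package and may differ from its actual objective.

```r
# Hypothetical helper (not part of the package): combined penalty for a
# candidate coefficient vector b.
# Assumed form:
#   (eta / 2) * (b - beta_ext)' Q (b - beta_ext) + (lambda / 2) * sum(b^2)
mdtl_ridge_penalty <- function(b, beta_ext, eta, lambda, Q = diag(length(b))) {
  d <- b - beta_ext
  # Mahalanobis-type distance to the external coefficients, weighted by eta
  transfer <- 0.5 * eta * drop(crossprod(d, Q %*% d))
  # L2-norm (Ridge) shrinkage toward zero, weighted by lambda
  ridge <- 0.5 * lambda * sum(b^2)
  transfer + ridge
}

# Toy coefficient vectors, for illustration only
b        <- c(0.5, -0.2, 0.0)
beta_ext <- c(0.4, -0.1, 0.3)

ridge_only    <- mdtl_ridge_penalty(b, beta_ext, eta = 0, lambda = 1)
transfer_only <- mdtl_ridge_penalty(b, beta_ext, eta = 2, lambda = 0)
```

Under this assumed form, eta = 0 ignores the external coefficients entirely, while a larger eta pulls the estimate toward them; lambda shrinks the estimate toward zero independently of eta.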
Usage
cv.cox_MDTL_ridge(
  z,
  delta,
  time,
  stratum = NULL,
  beta = NULL,
  vcov = NULL,
  etas,
  lambda = NULL,
  nlambda = 100,
  lambda.min.ratio = ifelse(n_obs < n_vars, 0.01, 1e-04),
  nfolds = 5,
  cv.criteria = c("V&VH", "LinPred", "CIndex_pooled", "CIndex_foldaverage"),
  c_index_stratum = NULL,
  message = FALSE,
  seed = NULL,
  ...
)
Arguments
- z
A numeric matrix or data frame of covariates (n x p).
- delta
A numeric vector of event indicators (1 = event, 0 = censored).
- time
A numeric vector of observed times.
- stratum
Optional numeric or factor vector indicating strata. If NULL, all subjects are assumed to be in the same stratum.
- beta
A numeric vector of external coefficients (length p).
- vcov
Optional numeric matrix (p x p) representing the weighting matrix \(Q\) for the Mahalanobis penalty. Typically the inverse covariance matrix. If NULL, defaults to the identity matrix.
- etas
A numeric vector of candidate eta values to be evaluated.
- lambda
Optional user-supplied lambda sequence. If NULL, the function computes its own sequence based on nlambda.
- nlambda
The number of lambda values. Default is 100.
- lambda.min.ratio
Smallest value for lambda, as a fraction of lambda.max. The default depends on the sample size relative to the number of predictors: 0.01 if n_obs < n_vars, otherwise 1e-04.
- nfolds
Integer. Number of cross-validation folds. Default is 5.
- cv.criteria
Character string specifying the cross-validation criterion. Choices are:
  - "V&VH" (default): Verweij & Van Houwelingen partial likelihood loss.
  - "LinPred": Loss based on the prognostic performance of the linear predictor.
  - "CIndex_pooled": Harrell's C-index computed by pooling predictions across folds.
  - "CIndex_foldaverage": Harrell's C-index computed within each fold and averaged.
- c_index_stratum
Optional stratum vector. Required only when cv.criteria involves stratified C-index calculation but the model itself is unstratified.
- message
Logical. If TRUE, progress messages are printed.
- seed
Optional integer. Random seed for reproducible fold assignment.
- ...
Additional arguments passed to the underlying fitting function.
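The lambda, nlambda, lambda.min.ratio, nfolds and seed arguments can be illustrated with a short sketch. It assumes a log-spaced sequence running from lambda.max down to lambda.max * lambda.min.ratio (the convention used by glmnet) and a balanced random fold assignment. The helpers `lambda_path`, `default_ratio` and `assign_folds` are hypothetical names introduced here; the package may compute lambda.max, space the grid, or assign folds differently (for example, within strata).

```r
# Hypothetical helper: log-spaced lambda sequence (assumed, glmnet-style).
lambda_path <- function(lambda_max, nlambda = 100, lambda.min.ratio = 1e-04) {
  exp(seq(log(lambda_max),
          log(lambda_max * lambda.min.ratio),
          length.out = nlambda))
}

# Default ratio as written in the Usage signature.
default_ratio <- function(n_obs, n_vars) ifelse(n_obs < n_vars, 0.01, 1e-04)

# Hypothetical helper: balanced fold labels, reproducible when a seed is given.
assign_folds <- function(n, nfolds = 5, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  sample(rep(seq_len(nfolds), length.out = n))
}

# Toy settings: 50 observations, 200 covariates, so the ratio is 0.01.
lambdas <- lambda_path(lambda_max = 2, nlambda = 5,
                       lambda.min.ratio = default_ratio(50, 200))
folds <- assign_folds(n = 23, nfolds = 5, seed = 1)
fold_sizes <- table(folds)
```

Fixing the seed makes the fold labels, and therefore the selected eta and lambda, repeatable across runs.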
Value
An object of class "cv.cox_MDTL_ridge" containing:
- best
A list containing the optimal results:
  - best_eta: The selected eta value.
  - best_lambda: The selected lambda value.
  - best_beta: The coefficient vector corresponding to the optimal parameters.
  - criteria: The selection criterion used.
- integrated_stat.full_results
A data frame of performance metrics for all combinations of eta and lambda.
- integrated_stat.best_per_eta
A data frame summarizing the best lambda and performance metric for each eta.
- integrated_stat.betahat_best
A matrix of coefficients for the best lambda at each eta.
- criteria
The selection criterion used.
- nfolds
The number of folds used.
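The relationship between the full grid of results, the per-eta summary, and the overall optimum can be sketched on a toy data frame. The column names `eta`, `lambda` and `score` are hypothetical (the actual column names of the returned data frames are not documented here), and the scores are made up purely for illustration, with higher meaning better, as for a C-index.

```r
# Toy grid of results; column names are hypothetical, not the package's.
full_results <- expand.grid(eta = c(0, 1, 5), lambda = c(0.01, 0.1, 1))

# Made-up scores, for illustration only (higher is better).
full_results$score <- c(0.61, 0.66, 0.64,
                        0.63, 0.70, 0.65,
                        0.60, 0.62, 0.59)

# For each eta, keep the row with the best-scoring lambda.
best_per_eta <- do.call(
  rbind,
  lapply(split(full_results, full_results$eta),
         function(d) d[which.max(d$score), ])
)

# The overall optimum is the best row among the per-eta winners.
best <- best_per_eta[which.max(best_per_eta$score), ]
```

For a loss-type criterion, where lower is better, `which.max` would be replaced by `which.min`.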
Examples
if (FALSE) { # \dontrun{
data(ExampleData_highdim)
train_dat_highdim <- ExampleData_highdim$train
beta_external_highdim <- ExampleData_highdim$beta_external
eta_list <- generate_eta(method = "exponential", n = 50, max_eta = 10)
cv.cox_MDTL_ridge_est <- cv.cox_MDTL_ridge(
  z = train_dat_highdim$z,
  delta = train_dat_highdim$status,
  time = train_dat_highdim$time,
  stratum = train_dat_highdim$stratum,
  beta = beta_external_highdim,
  vcov = NULL,
  etas = eta_list,
  cv.criteria = "CIndex_pooled"
)
} # }
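The example above selects by "CIndex_pooled". How pooling differs from fold averaging can be sketched in base R on toy data. The function `harrell_c` is a simple illustrative implementation of Harrell's C-index that ignores tied times and stratification; it is not the package's internal routine, and the times, event indicators, out-of-fold linear predictors and fold labels below are made up.

```r
# Simple O(n^2) Harrell's C-index for right-censored data (illustrative only).
harrell_c <- function(time, delta, lp) {
  concordant <- 0
  usable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (delta[i] != 1) next            # a pair is usable only if the earlier time is an event
    for (j in seq_len(n)) {
      if (time[j] > time[i]) {         # subject i failed before subject j
        usable <- usable + 1
        if (lp[i] > lp[j]) {
          concordant <- concordant + 1 # higher risk score failed first
        } else if (lp[i] == lp[j]) {
          concordant <- concordant + 0.5
        }
      }
    }
  }
  concordant / usable
}

# Toy out-of-fold predictions for six subjects in two folds.
time  <- c(2, 5, 8, 3, 4, 9)
delta <- c(1, 1, 0, 1, 1, 1)
lp    <- c(1.0, 0.2, -0.5, 3.0, 3.5, 2.0)
fold  <- c(1, 1, 1, 2, 2, 2)

# Pooled: one C-index over all out-of-fold predictions together.
c_pooled <- harrell_c(time, delta, lp)

# Fold average: one C-index per fold, then the mean.
c_by_fold <- sapply(split(seq_along(time), fold),
                    function(idx) harrell_c(time[idx], delta[idx], lp[idx]))
c_foldaverage <- mean(c_by_fold)
```

In this toy case the pooled value is lower than the fold average, because pooling also compares linear predictors that come from models fitted on different folds and may sit on different scales.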