Fit a penalized discrete survival model (without provider information)
Source: R/DiscSurv.R
Main function for fitting a penalized discrete survival model without provider information.
Usage
DiscSurv(
data,
Event.char,
Z.char,
Time.char,
standardize = T,
lambda,
nlambda = 100,
lambda.min.ratio = 1e-04,
penalize.x = rep(1, length(Z.char)),
penalized.multiplier,
nvar.max = p,
stop.dev.ratio = 0.001,
bound = 10,
backtrack = FALSE,
tol = 1e-04,
max.each.iter = 10000,
max.total.iter = (max.each.iter * nlambda),
actSet = TRUE,
actIter = max.each.iter,
actVarNum = sum(penalize.x == 1),
actSetRemove = F,
returnX = FALSE,
trace.lambda = FALSE,
threads = 1,
MM = FALSE,
return.transform.data = FALSE,
...
)
Arguments
- data
a data frame or list object that contains the variables in the model.
- Event.char
name of the event indicator in data as a character string. The event indicator should be a binary variable, with 1 indicating that the event has occurred and 0 indicating (right) censoring.
- Z.char
names of the covariates in data as a vector of character strings.
- Time.char
name of the follow-up time in data as a character string.
- standardize
logical flag for standardizing the covariates before fitting the model sequence. The coefficients are always returned on the original scale. Default is standardize = TRUE.
- lambda
a user-supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio.
- nlambda
the number of lambda values. Default is 100.
- lambda.min.ratio
the smallest value of lambda, as a fraction of lambda.max (the smallest lambda for which all coefficients are zero); the lambda sequence is generated on the log scale. Default is 1e-04.
- penalize.x
a vector indicating whether each covariate is penalized. A value of 0 leaves the corresponding covariate unpenalized; any other value penalizes it. Default is a vector of 1's (all covariates are penalized).
- penalized.multiplier
A vector of values representing multiplicative factors by which each covariate's penalty is to be multiplied. Default is a vector of 1's.
- nvar.max
maximum number of selected variables. Default is the number of all covariates.
- bound
a positive number to avoid inflation of provider effect. Default is 10.
- backtrack
for updating the provider effect, whether to use the "backtracking line search" with the Newton method. Default is FALSE.
- tol
convergence threshold. For each lambda, the program stops when the maximum change in the covariate coefficients is smaller than tol. Default is 1e-4.
- max.each.iter
maximum number of iterations for each lambda. Default is 1e4.
- max.total.iter
maximum number of iterations for the entire path. Default is max.each.iter * nlambda.
- actSet
whether to use the active set method for variable selection. Default is TRUE.
- actIter
if actSet = TRUE, the maximum number of iterations for each newly updated active set. Default is max.each.iter (i.e., the current active set is updated until convergence).
- actSetRemove
if actSet = TRUE, whether to remove zero coefficients from the current active set. Default is FALSE.
- returnX
whether to return the standardized design matrix. Default is FALSE.
- trace.lambda
whether to display progress while fitting the entire path. Default is FALSE.
- threads
number of cores used for parallel computing. Default is 1.
- MM
whether to use the "Majorize-Minimization" algorithm to optimize the objective function. Default is FALSE.
- ...
extra arguments to be passed to function.
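The interplay of nlambda and lambda.min.ratio can be sketched in base R. This is a minimal illustration, not the package's internal code: make_lambda_path is a hypothetical helper, and it assumes the path is equally spaced on the log scale between lambda.max and lambda.max * lambda.min.ratio, as the lambda.min.ratio description suggests. The value 0.4261 is the first lambda shown in the Examples output.

```r
# Hypothetical helper (not part of the package): build a decreasing lambda
# path of length nlambda, equally spaced on the log scale.
make_lambda_path <- function(lambda.max, nlambda = 100, lambda.min.ratio = 1e-04) {
  exp(seq(log(lambda.max),
          log(lambda.max * lambda.min.ratio),
          length.out = nlambda))
}

# lambda.max is taken from the Examples output; in practice the fitting
# routine derives it from the data.
lambda.seq <- make_lambda_path(lambda.max = 0.4261, nlambda = 100)

# Consecutive values differ by a constant multiplicative factor.
head(lambda.seq, 5)
range(lambda.seq)
```

A sequence built this way could be passed through the lambda argument when a custom path is needed; otherwise the defaults let the program choose the path itself.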
Value
An object with S3 class DiscSurv.
- beta
the fitted matrix of covariate coefficients. The number of rows is equal to the number of coefficients, and the number of columns is equal to nlambda.
- alpha
the fitted value of logit-transformed baseline hazard.
- lambda
the sequence of lambda values in the path.
- df
the estimated effective number of selected variables at each point along the regularization path.
- iter
the number of iterations until convergence at each value of lambda.
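Because alpha is reported on the logit scale, it may help to map it back to probabilities. The sketch below uses only base R and assumes a logit-link discrete-time model, logit(hazard at time t) = alpha_t + Z'beta; that model form is an assumption inferred from the description of alpha, and eta is a hypothetical linear predictor introduced for illustration. The alpha values are the first column of fit$alpha in the Examples output.

```r
# Logit-scale baseline hazards at the first lambda, copied from the
# Examples output on this page (one value per discrete time point).
alpha <- c(-1.979501, -2.110213, -2.484907, -2.355695, -2.958691)

# Hypothetical linear predictor Z' beta for one subject; 0 gives the baseline.
eta <- 0

# Assumed model: logit(hazard_t) = alpha_t + eta, so invert with plogis().
hazard <- plogis(alpha + eta)

# Discrete-time survival: probability of no event through each time point.
survival <- cumprod(1 - hazard)

data.frame(alpha = alpha, hazard = hazard, survival = survival)
```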
References
K. He, J. Kalbfleisch, Y. Li, et al. (2013) Evaluating hospital readmission rates in dialysis facilities; adjusting for hospital effects. Lifetime Data Analysis, 19: 490-512.
Examples
data(DiscTime)
data <- DiscTime$data
Event.char <- DiscTime$Event.char
Z.char <- DiscTime$Z.char
Time.char <- DiscTime$Time.char
# Fit a discrete survival model without any provider information.
fit <- DiscSurv(data, Event.char, Z.char, Time.char)
fit$beta[, 1:5] # covariate coefficients
#> 0.4261 0.3882 0.3537 0.3223 0.2937
#> Z1 0 -0.1556065 -0.2988513 -0.4321738 -0.555182124
#> Z2 0 0.0000000 0.0000000 0.0000000 0.000000000
#> Z3 0 0.0000000 0.0000000 0.0000000 0.000000000
#> Z4 0 0.0000000 0.0000000 0.0000000 0.000000000
#> Z5 0 0.0000000 0.0000000 0.0000000 -0.003012663
fit$alpha[, 1:5] # time effects
#> 0.4261 0.3882 0.3537 0.3223 0.2937
#> [Time: 0.53] -1.979501 -1.970055 -1.977003 -1.996580 -2.025996
#> [Time: 1.03] -2.110213 -2.069229 -2.043936 -2.030857 -2.027110
#> [Time: 3.92] -2.484907 -2.418504 -2.367851 -2.329504 -2.300530
#> [Time: 6.74] -2.355695 -2.272311 -2.204520 -2.148985 -2.102809
#> [Time: 12.5] -2.958691 -2.863140 -2.784495 -2.719193 -2.664056
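To see which covariates enter the model as lambda decreases, the coefficient matrix can be summarized column by column. The sketch below rebuilds the five columns printed above as a plain matrix so that it runs without the package; with a fitted object, fit$beta would take the place of beta. The count of nonzero coefficients is only a simple summary and is not claimed to equal the df component.

```r
# First five columns of fit$beta, copied from the output above.
beta <- matrix(
  c(0, -0.1556065, -0.2988513, -0.4321738, -0.555182124,
    0,  0,          0,          0,          0,
    0,  0,          0,          0,          0,
    0,  0,          0,          0,          0,
    0,  0,          0,          0,         -0.003012663),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("Z", 1:5),
                  c("0.4261", "0.3882", "0.3537", "0.3223", "0.2937"))
)

# Number of nonzero coefficients at each lambda.
n.selected <- colSums(beta != 0)

# Names of the selected covariates at each lambda.
selected <- lapply(seq_len(ncol(beta)),
                   function(j) rownames(beta)[beta[, j] != 0])
names(selected) <- colnames(beta)

n.selected
selected
```

In these five columns, Z1 is the first covariate to enter the model and Z5 joins it at the fifth lambda value.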