Main function for fitting a penalized discrete survival model
Usage
pp.DiscSurv(
data,
Event.char,
prov.char,
Z.char,
Time.char,
lambda,
nlambda = 100,
lambda.min.ratio = 1e-04,
penalize.x = rep(1, length(Z.char)),
penalized.multiplier,
lambda.early.stop = FALSE,
nvar.max = p,
stop.dev.ratio = 0.001,
bound = 10,
backtrack = FALSE,
tol = 1e-04,
max.each.iter = 10000,
max.total.iter = (max.each.iter * nlambda),
actSet = TRUE,
actIter = max.each.iter,
actVarNum = sum(penalize.x == 1),
actSetRemove = FALSE,
returnX = FALSE,
trace.lambda = FALSE,
threads = 1,
MM = FALSE,
return.transform.data = FALSE,
...
)
Arguments
- data
a data frame or list object that contains the variables in the model.
- Event.char
name of the event indicator in data as a character string. The event indicator should be a binary variable, with 1 indicating that the event has occurred and 0 indicating (right) censoring.
- prov.char
name of the provider ID variable in data as a character string.
- Z.char
names of the covariates in data as a vector of character strings.
- Time.char
name of the follow-up time variable in data as a character string.
- lambda
a user-supplied lambda sequence. Typical usage is to have the program compute its own lambda sequence based on nlambda and lambda.min.ratio.
- nlambda
the number of lambda values. Default is 100.
- lambda.min.ratio
the smallest value for lambda, as a fraction of lambda.max (the smallest lambda for which all coefficients are zero), on the log scale. Default is 1e-04.
- penalize.x
a vector indicating whether each covariate is penalized. A value of 0 leaves the corresponding covariate unpenalized; any other value penalizes it. Default is a vector of 1's (all covariates are penalized).
- penalized.multiplier
A vector of values representing multiplicative factors by which each covariate's penalty is to be multiplied. Default is a vector of 1's.
- lambda.early.stop
whether the program stops before running the entire lambda sequence. Early stopping is based on the ratio of deviance for models under two successive lambda values. Default is FALSE.
- nvar.max
maximum number of selected variables. Default is the number of all covariates.
- stop.dev.ratio
if lambda.early.stop = TRUE, the ratio of deviance used for early stopping. Default is 1e-3.
- bound
a positive number to avoid inflation of provider effect. Default is 10.
- backtrack
for updating the provider effects, whether to use backtracking line search with Newton's method. Default is FALSE.
- tol
convergence threshold. For each lambda, the program stops when the maximum change in the covariate coefficients is smaller than tol. Default is 1e-4.
- max.each.iter
maximum number of iterations for each lambda. Default is 1e4.
- max.total.iter
maximum number of iterations for the entire path. Default is max.each.iter * nlambda.
- actSet
whether to use the active method for variable selection. Default is TRUE.
- actIter
if actSet = TRUE, the maximum number of iterations for a newly updated active set. Default is max.each.iter (i.e. the current active set is updated until convergence).
- actSetRemove
if actSet = TRUE, whether to remove the zero coefficients from the current active set. Default is FALSE.
- returnX
whether to return the standardized design matrix. Default is FALSE.
- trace.lambda
whether to display the progress of fitting the entire path. Default is FALSE.
- threads
number of cores used for parallel computing. Default is 1.
- MM
whether to use the "Majorize-Minimization" algorithm to optimize the objective function. Default is FALSE.
- ...
extra arguments to be passed to function.
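The relationship between lambda, nlambda, and lambda.min.ratio described above can be sketched in base R. This is an illustration only: it assumes a grid evenly spaced on the log scale, and the lambda.max value used here is made up. The package's internal computation may differ.

```r
# Hypothetical largest penalty (assumption). In practice this is the smallest
# lambda for which all coefficients are zero, and it is computed from the data.
lambda.max <- 0.16
nlambda <- 100
lambda.min.ratio <- 1e-04

# Decreasing sequence, evenly spaced on the log scale
lambda.seq <- exp(seq(log(lambda.max),
                      log(lambda.max * lambda.min.ratio),
                      length.out = nlambda))

range(lambda.seq)              # smallest and largest penalty values
lambda.seq[2] / lambda.seq[1]  # ratio between successive values is constant
```

A vector built this way could be passed as the lambda argument in place of the automatically generated sequence.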
Value
An object with S3 class ppDiscSurv.
- beta
the fitted matrix of covariate coefficients. The number of rows is equal to the number of coefficients, and the number of columns is equal to nlambda.
- alpha
the fitted value of logit-transformed baseline hazard.
- gamma
the fitted value of provider effects. The first provider is the reference group, so its effect is set to zero.
- lambda
the sequence of lambda values in the path.
- df
the estimated effective number of selected variables at each point along the regularization path.
- iter
the number of iterations until convergence at each value of lambda.
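Because alpha is reported on the logit scale, the inverse logit maps each value back to a probability. A minimal sketch, using values copied from the first lambda column of the fit$alpha output in the Examples section. Reading the result as a baseline hazard per time point is an interpretation of the description above, not something the package output states directly.

```r
# Logit-scale baseline values copied from the first column of the
# printed fit$alpha output (illustrative input, not a new result)
alpha.logit <- c(-1.668065, -1.480994, -1.622145, -1.251585, -1.781635)

# plogis() is the inverse logit: exp(x) / (1 + exp(x))
baseline.hazard <- plogis(alpha.logit)
round(baseline.hazard, 3)
```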
References
K. He, J. Kalbfleisch, Y. Li, et al. (2013) Evaluating hospital readmission rates in dialysis facilities; adjusting for hospital effects. Lifetime Data Analysis, 19: 490-512.
Examples
data(DiscTime)
data <- DiscTime$data
Event.char <- DiscTime$Event.char
prov.char <- DiscTime$prov.char
Z.char <- DiscTime$Z.char
Time.char <- DiscTime$Time.char
fit <- pp.DiscSurv(data, Event.char, prov.char, Z.char, Time.char)
fit$beta[, 1:5]
#> 0.1601 0.1458 0.1329 0.1211 0.1103
#> Z1 0 -0.1735757 -0.3325774 -0.4793679 -0.6155647
#> Z2 0 0.0000000 0.0000000 0.0000000 0.0000000
#> Z3 0 0.0000000 0.0000000 0.0000000 0.0000000
#> Z4 0 0.0000000 0.0000000 0.0000000 0.0000000
#> Z5 0 0.0000000 0.0000000 0.0000000 0.0000000
fit$alpha[, 1:5]
#> 0.1601 0.1458 0.1329 0.1211 0.1103
#> [Time: 0.53] -1.668065 -1.747753 -1.824209 -1.898663 -1.971909
#> [Time: 1.03] -1.480994 -1.526815 -1.571328 -1.615488 -1.659886
#> [Time: 3.92] -1.622145 -1.641556 -1.661848 -1.683563 -1.706991
#> [Time: 6.74] -1.251585 -1.257340 -1.264124 -1.272503 -1.282795
#> [Time: 12.5] -1.781635 -1.780095 -1.780526 -1.783340 -1.788771
fit$gamma[, 1:5] #effect of the first provider is set to be zero
#> 0.1601 0.1458 0.1329 0.1211 0.1103
#> 1 0.0000000 0.0000000 0.000000 0.0000000 0.0000000
#> 2 -4.5665147 -4.3516946 -4.162992 -3.9945779 -3.8425705
#> 3 -0.7478311 -0.7116142 -0.682481 -0.6580980 -0.6367834
#> 4 1.2412090 1.1456446 1.059286 0.9815198 0.9121089
#> 5 -2.3399410 -2.1966372 -2.070536 -1.9577270 -1.8555876
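One way to read a coefficient path like the one printed above is to count the nonzero coefficients at each lambda. The sketch below rebuilds the printed fit$beta[, 1:5] matrix by hand so that it runs without the package. The object names are illustrative, and the count is not claimed to be how the df component is computed.

```r
# Coefficient values copied from the printed fit$beta[, 1:5] output
beta <- matrix(0, nrow = 5, ncol = 5,
               dimnames = list(paste0("Z", 1:5),
                               c("0.1601", "0.1458", "0.1329", "0.1211", "0.1103")))
beta["Z1", ] <- c(0, -0.1735757, -0.3325774, -0.4793679, -0.6155647)

# Number of nonzero coefficients at each lambda
n.selected <- colSums(beta != 0)
n.selected

# Names of the covariates selected at the fifth lambda
rownames(beta)[beta[, 5] != 0]
```

With a real fit, the same two expressions can be applied to fit$beta directly.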