- Compute a Survival Curve for Censored Data
-
- Computes an estimate of a survival curve for censored data using
- either the Kaplan-Meier or the Fleming-Harrington method, or
- computes the predicted survivor function for a Cox proportional
- hazards model.
-
- survfit( object, data=sys.parent(), weights, subset, na.action,
- newdata, individual=F, conf.int=.95, se.fit=T,
- type=c("kaplan-meier","fleming-harrington", "fh2"),
- error=c("greenwood","tsiatis"),
- conf.type=c("log","log-log","plain","none"),
- conf.lower=c("usual", "peto", "modified"))
-
- Arguments:
-
- object:
- A formula object or a coxph object. If a formula
- object is supplied it must have a Surv object as the
- response on the left of the ~ operator and, if desired,
- terms separated by + operators on the right. One of
- the terms may be a strata object. For a single sur-
- vival curve the "~ 1" part of the formula is not
- required.
-
- data:
- a data.frame in which to interpret the variables named
- in the formula, or in the subset and the weights
- arguments.
-
- weights:
- The weights must be nonnegative, and it is strongly
- recommended that they be strictly positive: zero
- weights are ambiguous, whereas the subset argument
- removes observations explicitly.
-
- subset:
- expression saying that only a subset of the rows of the
- data should be used in the fit.
-
- na.action:
- a missing-data filter function, applied to the
- model.frame, after any subset argument has been used.
- Default is options()$na.action.
-
- newdata:
- a data.frame with the same variable names as those that
- appear in the coxph formula. The curve(s) produced
- will be representative of a cohort whose covariates
- correspond to the values in newdata. Default is the
- mean of the covariates used in the coxph fit.
-
- individual:
- a logical value indicating whether the data frame
- represents different time epochs for only one indivi-
- dual (T), or whether multiple rows indicate multiple
- individuals (F, the default). If the former, only one
- curve will be produced; if the latter, there will be one
- curve per row in newdata.
-
- conf.int:
- The level for a two-sided confidence interval on the
- survival curve(s). Default is 0.95.
-
- se.fit:
- a logical value indicating whether standard errors
- should be computed. Default is true.
-
- type:
- either "kaplan-meier", "fleming-harrington" or "fh2"
- (only the first two characters are necessary).
- The default is "fleming-harrington" when a coxph
- object is given, and it is "kaplan-meier" otherwise.
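-
- A minimal sketch of how the first two estimates differ, in base R
- and on a made-up table of risk-set sizes. The names n.risk and
- n.event are illustrative only and are not part of the survfit
- interface; unit weights and untied death times are assumed.

```r
# Hypothetical toy data: unit weights, one death per death time
n.risk  <- c(10, 8, 5, 2)   # number at risk just before each death time
n.event <- c(1, 1, 1, 1)    # number of deaths at each time

# Kaplan-Meier: running product of conditional survival probabilities
km <- cumprod(1 - n.event / n.risk)

# Fleming-Harrington: exp(-cumulative hazard), hazard increments d/n
fh <- exp(-cumsum(n.event / n.risk))

print(round(cbind(n.risk, n.event, km, fh), 4))
```

- Because exp(-x) exceeds 1 - x for any x > 0, the Fleming-Harrington
- curve in this sketch lies above the Kaplan-Meier curve at every
- death time.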
-
- error:
- either the string "greenwood" for the Greenwood formula
- or "tsiatis" for the Tsiatis formula (only the
- first character is necessary). The default is
- "tsiatis" when a coxph object is given, and it is
- "greenwood" otherwise.
-
- conf.type:
- One of "none", "plain", "log" (the default), or "log-log".
- Only enough of the string to uniquely identify it is
- necessary. "none" causes confidence intervals not to be
- generated. "plain" gives the standard intervals
- "curve +- k * se(curve)", where k is determined from
- conf.int. "log" calculates intervals based on the
- cumulative hazard or log(survival). "log-log" bases
- intervals on the log hazard or log(-log(survival)); these
- last will never extend past 0 or 1.
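-
- The three interval types can be sketched by hand for a single time
- point. This is an illustration under stated assumptions, not the
- code survfit runs: the function name ci.sketch and the input values
- are hypothetical, and the standard errors on the transformed scales
- are obtained here by the delta method from se(curve).

```r
# Hypothetical helper: intervals for one survival estimate and its standard error
ci.sketch <- function(surv, se, conf.int = 0.95) {
  k <- qnorm(1 - (1 - conf.int) / 2)          # multiplier determined from conf.int

  # "plain": curve +- k * se(curve)
  plain <- surv + c(-1, 1) * k * se

  # "log": interval for log(survival), then back-transformed
  se.log <- se / surv
  log.ci <- exp(log(surv) + c(-1, 1) * k * se.log)

  # "log-log": interval for log(-log(survival)), then back-transformed;
  # the transform is decreasing, so the signs are swapped to keep lower first
  se.ll <- se / (surv * abs(log(surv)))
  ll.ci <- exp(-exp(log(-log(surv)) + c(1, -1) * k * se.ll))

  out <- rbind(plain = plain, log = log.ci, "log-log" = ll.ci)
  colnames(out) <- c("lower", "upper")
  out
}

print(ci.sketch(0.60, 0.08))   # a mid-curve point (made-up values)
print(ci.sketch(0.95, 0.05))   # near 1: only "log-log" is guaranteed to stay inside (0, 1)
```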
-
- conf.lower:
- controls modified lower limits to the curve; the upper
- limit remains unchanged. One of "usual" (the default),
- "peto", or "modified". The modified lower limit is
- based on an "effective n" argument. The confidence
- bands agree with the usual calculation at each death
- time, but unlike the usual bands the confidence interval
- becomes wider at each censored observation. The extra
- width is obtained by multiplying the usual variance by a
- factor m/n, where n is the number currently at risk and
- m is the number at risk at the last death time. This is
- especially useful for survival curves with a long flat
- tail. The Peto lower limit is based on the same
- effective n argument as the modified limit, but also
- replaces the usual Greenwood variance term with a simple
- approximation. It is known to be conservative.
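-
- A rough sketch of the m/n adjustment described above, in base R on
- made-up data. It assumes unit weights, a log-type interval, and a
- death at the first time point; the variable names are illustrative
- and the calculation is not taken from the survfit source.

```r
# Hypothetical toy curve: one row per observation time, unit weights
n.risk  <- c(10, 9, 8, 7, 6)
n.event <- c( 1, 0, 0, 1, 0)     # 0 means only censoring occurred at that time

surv  <- cumprod(1 - n.event / n.risk)
# Greenwood sum of d/(n*(n-d)) terms, used here as the variance of log(surv)
var.h <- cumsum(n.event / (n.risk * (n.risk - n.event)))

# m: number at risk at the most recent death time, carried forward
# (assumes the first row is a death time, as in this toy data)
idx <- cummax(ifelse(n.event > 0, seq_along(n.risk), 0))
m   <- n.risk[idx]

k <- qnorm(0.975)
lower.usual    <- exp(log(surv) - k * sqrt(var.h))
lower.modified <- exp(log(surv) - k * sqrt(var.h * m / n.risk))

print(round(cbind(n.risk, n.event, m, surv, lower.usual, lower.modified), 4))
```

- In this sketch the two lower limits coincide at the death times
- (rows 1 and 4), and the modified limit drops at each censoring time
- while the usual limit stays flat.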
-
- Value:
-
- a survfit object. Methods defined for survfit objects
- are print, plot, lines, and points.
-
- Details:
-
- The estimates actually used are the Kalbfleisch-Prentice
- (Kalbfleisch and Prentice, 1980, p.86) and the
- Tsiatis/Link/Breslow estimates, which reduce to the
- Kaplan-Meier and Fleming-Harrington estimates,
- respectively, when the weights are unity.
-
- When curves are fit for a Cox model, subject weights of
- exp(sum(coef*(x-center))) are used, ignoring any value
- for weights input by the user. There is also an extra
- term in the variance of the curve, due to the variance of
- coef and hence variance in the computed weights.
-
- The Greenwood formula for the variance is a sum of terms
- d/(n*(n-m)), where d is the number of deaths at a given
- time point, n is the sum of weights for all individuals
- still at risk at that time, and m is the sum of weights
- for the deaths at that time. The justification is based
- on a binomial argument when weights are all equal to one;
- extension to the weighted case is ad hoc. Tsiatis (1981)
- proposes a sum of terms d/(n*n), based on a counting
- process argument which includes the weighted case.
-
- The two variants of the F-H estimate have to do with how
- ties are handled. If there were 3 deaths out of 10 at
- risk, then the first would increment the hazard by 3/10
- and the second by 1/10 + 1/9 + 1/8. For curves created
- after a Cox model these correspond to the Breslow and
- Efron estimates, respectively, and the proper choice is
- made automatically. The fh2 method will give results
- closer to the Kaplan-Meier.
-
- Based on the work of Link (1984), the log transform is
- expected to produce the most accurate confidence
- intervals. If there is heavy censoring, then based on the
- work of Dorey and Korn (1987) the modified estimate will
- give a more reliable confidence band for the tails of the
- curve.
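-
- The two variance formulas quoted above can be compared directly.
- The sketch below uses made-up risk-set counts and assumes unit
- weights, so that m equals d and n is simply the number at risk; the
- variable names are illustrative only.

```r
# Hypothetical toy data with unit weights (so m, the weight of the deaths, equals d)
n.risk  <- c(10, 8, 5, 2)   # n: number at risk at each death time
n.event <- c(1, 2, 1, 1)    # d: number of deaths at each death time

# Greenwood: running sum of d / (n * (n - m))
greenwood <- cumsum(n.event / (n.risk * (n.risk - n.event)))

# Tsiatis: running sum of d / (n * n)
tsiatis <- cumsum(n.event / n.risk^2)

print(round(cbind(n.risk, n.event, greenwood, tsiatis), 4))
```

- Each Greenwood term has the smaller denominator, so in this setting
- the Greenwood sum is the larger of the two at every death time.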
-
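- The tie-handling arithmetic described above (3 deaths out of 10 at
- risk) can be checked in a few lines of base R. The Kaplan-Meier
- step is put on the hazard scale as -log(1 - d/n) purely for
- comparison; that rescaling is an assumption of this sketch.

```r
d <- 3    # deaths tied at one time point
n <- 10   # number at risk

# First F-H variant: all tied deaths share the same risk set
fh <- d / n                        # 3/10

# Second variant (fh2): the risk set shrinks by one after each tied death
fh2 <- sum(1 / (n - 0:(d - 1)))    # 1/10 + 1/9 + 1/8

# Kaplan-Meier step expressed as a hazard increment, for comparison
km <- -log(1 - d / n)

print(c(fh = fh, fh2 = fh2, km = km))
```

- The fh2 increment falls between the other two, which is consistent
- with the statement that fh2 gives results closer to the
- Kaplan-Meier.
-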
- References:
-
- Terry Therneau, author of local function.
-
- Dorey, F. J. and Korn, E. L. (1987). Effective sample sizes for
- confidence intervals for survival probabilities. Statistics in
- Medicine 6, 679-87.
-
- Fleming, T. H. and Harrington, D. P. (1984). Nonparametric
- estimation of the survival distribution in censored data. Comm.
- in Statistics 13, 2469-86.
-
- Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical
- Analysis of Failure Time Data. Wiley, New York.
-
- Link, C. L. (1984). Confidence intervals for the survival
- function using Cox's proportional hazards model with covariates.
- Biometrics 40, 601-610.
-
- Tsiatis, A. (1981). A large sample study of the estimate for the
- integrated hazard function in Cox's regression model for survival
- data. Annals of Statistics 9, 93-108.
-
- See Also:
-
- print, plot, lines, coxph, Surv, strata.
-
- Examples:
-
- # Fit a Kaplan-Meier curve for each level of x and plot the curves
- fit <- survfit(Surv(time, status) ~ x, data=aml)
- plot(fit)
- # Plot only one of the two curves from above
- plot(fit[2])
- # Fit a Cox proportional hazards model and plot the
- # predicted survival curve
- fit <- coxph(Surv(admlfuhr, dead) ~ gcs.12, rochadm)
- plot(survfit(fit))
-
-