-
- Person Years
-
- This function computes the person-years of follow-up time
- contributed by a cohort of subjects, stratified into subgroups.
- It also computes the number of subjects who contribute to each
- cell of the output table, and optionally the number of events
- and/or expected number of events in each cell.
-
- pyears(formula, data, weights, subset, na.action, ratetable=survexp.us,
- scale=365.25, expect=c('event', 'pyears'), model=F, x=F, y=F)
-
- Arguments:
-
- formula:
- a formula object. The response variable will be a vector
- of follow-up times for each subject, or a Surv object
- containing the follow-up time and an event indicator.
- The predictors consist of optional grouping variables
- separated by + operators (exactly as in survfit),
- time-dependent grouping variables such as age (specified
- with tcut), and optionally a ratetable() term. This latter
- matches each subject to his/her expected cohort.
-
- data:
- a data frame in which to interpret the variables named
- in the formula.
-
- weights:
- case weights.
-
- subset:
- expression stating that only a subset of the rows
- should be used.
-
- na.action:
- missing data filter function.
-
- ratetable:
- a table of event rates, such as survexp.uswhite.
-
- scale:
- a scaling for the results. As most rate tables are in
- units/day, the default value of 365.25 causes the output
- to be reported in years.
-
- expect:
- whether the output table should include the expected
- number of events or the expected number of person-years
- of observation. This is only valid with a rate table.
-
- model, x, y:
- If any of these is true, then the model frame, the
- model matrix, and/or the vector of response times will
- be returned as components of the final result.
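
The effect of the scale argument can be illustrated by hand. The sketch below uses only base R and made-up follow-up times (the vector futime_days is hypothetical); it assumes, as the description of scale suggests, that the reported total is the accumulated time divided by scale.

```r
# Hypothetical follow-up times in days for three subjects
futime_days <- c(3652.5, 730.5, 1461)

# Total follow-up in the rate table's natural unit (days)
person_days <- sum(futime_days)

# Dividing by scale = 365.25 expresses the same total in years
person_years <- person_days / 365.25

# With scale = 1 the total would stay in days
print(c(days = person_days, years = person_years))
```

Here 5844 person-days correspond to 16 person-years.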
-
- Value:
-
- a list with components
-
- pyears:
- an array containing the person-years of exposure. (Or other
- units, depending on the rate table and the scale).
-
- n:
- an array containing the number of subjects who contribute
- time to each cell of the pyears array.
-
- event:
- an array containing the observed number of events. This
- will be present only if the response variable is a Surv
- object.
-
- expected:
- an array containing the expected number of events (or person
- years). This will be present only if there was a ratetable
- term.
-
- offtable:
- the number of person-years of exposure in the cohort that
- was not part of any cell in the pyears array. This is often
- useful as an error check; if there is a mismatch of units
- between two variables, nearly all the person years may be
- off table.
-
- summary:
- a summary of the rate-table matching. This is also useful
- as an error check.
-
- call:
- an image of the call to the function.
-
- na.action:
- the na.action attribute contributed by an na.action routine,
- if any.
-
- Because pyears may have several time variables, it is necessary
- that all of them be in the same units. For instance, in the call
-
- py <- pyears(futime ~ rx + ratetable(age=age, sex=sex, year=entry.dt))
-
- with a ratetable whose natural unit is days, it is important
- that futime, age and entry.dt all be in days. Given the wide
- range of possible inputs, it is difficult for the routine to
- do sanity checks of this aspect.
-
- A special function tcut is needed to specify time-dependent
- cutpoints. For instance, assume that age is in years, and that
- the desired final arrays have as one of their margins the age
- groups 0-2, 2-10, 10-25, and 25+. A subject who enters the
- study at age 4 and remains under observation for 10 years will
- contribute follow-up time to both the 2-10 and 10-25 subsets.
- If cut(age, c(0,2,10,25,100)) were used in the formula, the
- subject would be classified according to the starting age
- only. The tcut function has the same arguments as cut, but
- produces a different output object which allows the pyears
- function to correctly track the subject.
-
- The results of pyears() are normally used as input to further
- calculations.
- The example below is from a study of hip fracture rates from
- 1930 to 1990 in Rochester, Minnesota. Survival post hip
- fracture has increased over that time, but so has the survival
- of elderly subjects in the population at large. A model of
- relative survival helps to clarify what has happened: Poisson
- regression is used, with exposure time replaced by expected
- exposure (for an age and sex matched control). Death rates
- change with age, of course, so the result is carved into 1 year
- increments of time. Males and females were done separately.
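
The difference between cut and tcut described earlier can be checked on a toy cohort. The sketch below assumes the survival package is installed; the data frame toy and its values are made up for illustration, with the first subject being the one from the text (enters at age 4, followed for 10 years). It compares a by-hand overlap calculation with pyears fits that use cut and tcut, with scale=1 because all times are already in years.

```r
library(survival)

# Hypothetical toy cohort: age at entry and follow-up time, both in years
toy <- data.frame(age = c(4, 1, 30), futime = c(10, 3, 5))
breaks <- c(0, 2, 10, 25, 100)
labs <- c("0-2", "2-10", "10-25", "25+")

# By hand: overlap of each subject's age span (age to age + futime)
# with each age band, summed over subjects
lo <- breaks[-length(breaks)]
hi <- breaks[-1]
by_hand <- sapply(seq_along(lo), function(j) {
  sum(pmax(0, pmin(toy$age + toy$futime, hi[j]) - pmax(toy$age, lo[j])))
})
names(by_hand) <- labs

# cut() classifies each subject once, by starting age only
toy$agecut <- cut(toy$age, breaks, labels = labs)
fit_cut <- pyears(futime ~ agecut, data = toy, scale = 1)

# tcut() lets pyears move a subject between bands during follow-up
toy$agegrp <- tcut(toy$age, breaks, labs)
fit_tcut <- pyears(futime ~ agegrp, data = toy, scale = 1)

print(by_hand)
print(fit_cut$pyears)
print(fit_tcut$pyears)
```

The by-hand totals are 1, 8, 4 and 5 person-years for the four bands; the first subject alone accounts for 6 years in 2-10 and 4 years in 10-25. The tcut fit can be compared against these totals, while the cut fit assigns all 10 of that subject's years to the 2-10 band.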
-
- Examples:
-
- attach(malehips)
- temp1 <- tcut(dt.fracture, seq(from=mdy.date(1,1,30), by=365.25, length=61))
- temp2 <- tcut(age*365.25, 365.25*(0:105)) # max age was > 100!
- pfit <- pyears(Surv(futime, status) ~ temp1 + temp2 +
-                ratetable(age=age*365.25, year=dt.fracture, sex=1),
-                subset=(sex==1),
-                ratetable=survexp.minnwhite)
- cat(pfit$summary)
- # output of the cat() call:
- #   age ranges from 50.1 to 110.5 years
- #   male: 374 female: 1578
- #   date of entry from 29Jun29 to 18Dec92
- # now, convert the arrays into a data frame
- tdata <- data.frame( age = (0:105)[col(pfit$pyears)],
-                      yr = (1930:1990)[row(pfit$pyears)],
-                      y = c(pfit$event),
-                      time = c(pfit$expected))
- # fit the gam model
- gfit.m <- gam(y ~ s(age) + s(yr) + offset(log(time)), family=poisson,
-               data= tdata)
- plot(gfit.m, se=T)
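
The step that converts the result arrays into a data frame relies on row() and col() indexing. The base R sketch below shows the same idiom on a small made-up matrix; the object py and its values are hypothetical stand-ins for a two-way pyears array, with rows for calendar year and columns for age.

```r
# Hypothetical 2 x 3 array standing in for a two-way pyears result:
# rows index calendar year, columns index age
py <- matrix(c(10, 20, 30, 40, 50, 60), nrow = 2,
             dimnames = list(c("1930", "1931"), c("0", "1", "2")))
yrs <- 1930:1931
ages <- 0:2

# row() and col() give, for every cell, its row and column index.
# Indexing a label vector by them yields one label per cell, in the
# same column-major order that c() uses to flatten the array.
long <- data.frame(yr = yrs[row(py)],
                   age = ages[col(py)],
                   pyears = c(py))
print(long)
```

Each row of long then describes one cell of the array, which is the layout a regression function expects.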
-
- See Also:
-
- ratetable, survexp, Surv
-
-