-
- K-fold cross-validation
-
- crossval(x,y,theta.fit,theta.predict,...,ngroup=n)
-
- Arguments:
-
- x: a matrix containing the predictor (regressor)
- values. Each row corresponds to an observa-
- tion.
-
- y: a vector containing the response values
-
- theta.fit: function to be cross-validated. Takes x and y
- as arguments. See example below.
-
- theta.predict: function producing predicted values for
- theta.fit. Arguments are the fit object produced
- by theta.fit and a matrix x of predictors, in
- that order. See example below.
-
- ngroup: optional argument specifying the number of
- groups formed. Default is ngroup = sample size,
- corresponding to leave-one-out cross-validation.
-
- Values:
-
- list with the following components
-
- cv.fit: The cross-validated fit for each observation.
- The numbers 1 to n (the sample size) are partitioned
- into ngroup mutually disjoint groups of size
- "leave.out". leave.out, the number of observations
- in each group, is the integer part of n/ngroup.
- The groups are chosen at random if ngroup < n.
- (If n/ngroup is not an integer, the last group
- will contain more than leave.out observations.)
- Then theta.fit is applied with the kth group of
- observations deleted, for k = 1, 2, ..., ngroup.
- Finally, the fitted values are computed for the
- kth group using theta.predict.
-
- ngroup: The number of groups
-
- leave.out: The number of observations in each group
-
- groups: A list of length ngroup containing the indices of
- the observations in each group. Only returned if
- leave.out > 1.
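
The grouping and refitting steps described under cv.fit can be sketched in base R as follows. This is only an illustration of the description above, not the function's actual source: the names make_groups and cv_sketch are hypothetical, the ngroup guard is an assumption of the sketch, and unlike the documented function it always returns the groups.

```r
# Hypothetical helper: split 1..n into ngroup groups as described above.
make_groups <- function(n, ngroup) {
  stopifnot(ngroup >= 2, ngroup <= n)   # guard added for this sketch
  leave.out <- n %/% ngroup             # integer part of n/ngroup
  # Random order when ngroup < n; for leave-one-out the order is irrelevant
  o <- if (ngroup < n) sample(n) else seq_len(n)
  # The first ngroup-1 groups get leave.out indices each; the last gets the rest
  group_id <- pmin((seq_len(n) - 1) %/% leave.out + 1, ngroup)
  list(groups = unname(split(o, group_id)), leave.out = leave.out)
}

# Hypothetical driver: fit with group k deleted, then predict group k.
cv_sketch <- function(x, y, theta.fit, theta.predict, ngroup = length(y)) {
  x <- as.matrix(x)
  g <- make_groups(length(y), ngroup)
  cv.fit <- rep(NA_real_, length(y))
  for (k in seq_len(ngroup)) {
    held <- g$groups[[k]]
    fit <- theta.fit(x[-held, , drop = FALSE], y[-held])
    cv.fit[held] <- theta.predict(fit, x[held, , drop = FALSE])
  }
  list(cv.fit = cv.fit, ngroup = ngroup,
       leave.out = g$leave.out, groups = g$groups)
}

set.seed(1)
x <- rnorm(85)
y <- 2 * x + 0.5 * rnorm(85)
res <- cv_sketch(x, y,
                 function(x, y) lsfit(x, y),
                 function(fit, x) cbind(1, x) %*% fit$coef,
                 ngroup = 6)
lengths(res$groups)   # group sizes: 14 14 14 14 14 15
```

With n = 85 and ngroup = 6, leave.out is 14 and the single leftover observation falls into the last group.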
-
- References:
-
- Stone, M. (1974). Cross-validatory choice and assessment
- of statistical predictions. Journal of the Royal
- Statistical Society, Series B, 36, 111-147.
-
- Efron, B. and Tibshirani, R. (1993). An Introduction to
- the Bootstrap. Chapman and Hall, New York, London.
-
- Examples:
-
- # cross-validation of least squares regression
- # note that crossval is not very efficient, and being a
- # general purpose function, it does not use the
- # Sherman-Morrison identity for this special case
- x <- rnorm(85)
- y <- 2*x +.5*rnorm(85)
- theta.fit <- function(x,y) lsfit(x,y)
- # predicted values: intercept plus slope times x
- theta.predict <- function(fit,x) cbind(1,x) %*% fit$coef
-
- results <- crossval(x,y,theta.fit,theta.predict,ngroup=6)
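
The comment at the top of this example alludes to a closed-form shortcut for least squares. As a sketch of that kind of shortcut, assuming an ordinary linear fit, the leave-one-out prediction for observation i can be recovered from a single fit as y[i] - e[i]/(1 - h[i]), where e is the ordinary residual and h the diagonal of the hat matrix. The code below uses only base R (lm, hatvalues, lsfit); the variable names are illustrative, and the brute-force loop is there only to check the identity.

```r
set.seed(1)
x <- rnorm(85)
y <- 2 * x + 0.5 * rnorm(85)

fit <- lm(y ~ x)
h <- hatvalues(fit)                        # diagonal of the hat matrix
loo_fast <- y - residuals(fit) / (1 - h)   # leave-one-out predictions from one fit

# Brute-force check: refit with each observation deleted in turn
loo_slow <- vapply(seq_along(y), function(i) {
  f <- lsfit(x[-i], y[-i])
  sum(c(1, x[i]) * f$coef)                 # intercept + slope * x[i]
}, numeric(1))

all.equal(unname(loo_fast), loo_slow)      # the two agree up to rounding
mean((y - loo_fast)^2)                     # leave-one-out estimate of prediction error
```

The same mean squared difference, computed from y and the cv.fit component of the returned list, gives the cross-validated prediction error for any ngroup.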
-
-