Let's fit a simple regression model to the bicycle data introduced in an
earlier section. The dependent variable is separation and the independent
variable is travel-space. To form a regression model use the
regression-model function:

> (regression-model travel-space separation)

Least Squares Estimates:

Constant                 -2.182472   (1.056688)
Variable 0:              0.6603419   (0.06747931)

R Squared:                0.922901
Sigma hat:               0.5821083
Number of cases:                10
Degrees of freedom:              8

#<Object: 1966006, prototype = REGRESSION-MODEL-PROTO>
>

The basic syntax for the regression-model function is

(regression-model x y)

For a simple regression x can be a single list or vector. For a multiple regression x can be a list of lists or vectors or a matrix. The regression-model function also takes three optional keyword arguments, :intercept, :print, and :weights. Both :intercept and :print are T by default. To get a model without an intercept use the expression

(regression-model x y :intercept nil)

To form a weighted regression model use the expression

(regression-model x y :weights w)

where w is a list or vector of weights the same length as y. The variances of the errors are assumed to be inversely proportional to the weights w.

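The assumption about the weights can be written out in symbols. The notation here is mine rather than the original text's: beta is the coefficient vector, sigma squared is an unknown variance scale, and x_i is the i-th row of the design matrix. This is the standard weighted least squares formulation, which I take to be what the :weights argument requests.

```latex
\[
  y_i = x_i^{T}\beta + \varepsilon_i, \qquad
  \operatorname{Var}(\varepsilon_i) = \frac{\sigma^2}{w_i}, \qquad
  \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} w_i \left(y_i - x_i^{T}\beta\right)^2 .
\]
```

Under this reading, an observation with a large weight is treated as more precise and has more influence on the fitted coefficients.
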
The regression-model function prints a very simple summary of the fit model and returns a model object as its result. To be able to examine the model further, assign the returned model object to a variable using an expression like

(def bikes (regression-model travel-space separation :print nil))

I have given the keyword argument :print nil to suppress the printing of the summary, since we have already seen it. To find out what messages are available use the :help message:

> (send bikes :help)
REGRESSION-MODEL-PROTO
Normal Linear Regression Model
Help is available on the following:

:ADD-METHOD            :ADD-SLOT          :BASIS                  :COEF-ESTIMATES
:COEF-STANDARD-ERRORS  :COMPUTE           :DF                     :DISPLAY
:DOC-TOPICS            :DOCUMENTATION     :FIT-VALUES             :GET-METHOD
:HAS-METHOD            :HAS-SLOT          :HELP                   :INTERCEPT
:INTERNAL-DOC          :ISNEW             :LEVERAGES              :MESSAGE-SELECTORS
:METHODS               :NEW               :NUM-CASES              :NUM-COEFS
:PARENTS               :PLOT-BAYES-RESIDUALS  :PLOT-RESIDUALS     :PRECEDENCE-LIST
:PRINT                 :R-SQUARED         :RESIDUALS              :SAVE
:SHOW                  :SIGMA-HAT         :SLOT-NAMES             :SLOT-VALUE
:SLOTS                 :SUM-OF-SQUARES    :WEIGHTS                :X
:X-MATRIX              :XTXINV            :Y                      PROTO
NIL
>

Many of these messages are self-explanatory, and many have already been used by the :display message, which regression-model sends to the new model to print the summary. As examples let's try the :coef-estimates and :coef-standard-errors messages:

> (send bikes :coef-estimates)
(-2.182472 0.6603419)
> (send bikes :coef-standard-errors)
(1.056688 0.06747931)
>

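Because arithmetic functions operate elementwise on lists, the results of these two messages can be combined directly. As a small sketch, the ratios of the estimates to their standard errors could be formed as follows; the variable name t-ratios is my own choice and does not appear elsewhere in the text.

```lisp
;; Ratio of each coefficient estimate to its standard error.
;; The division is applied elementwise to the two lists.
;; t-ratios is an illustrative name, not part of the regression model object.
(def t-ratios (/ (send bikes :coef-estimates)
                 (send bikes :coef-standard-errors)))
```

The first element of the result corresponds to the constant and the second to travel-space, in the same order as the lists returned by the two messages.
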
The :plot-residuals message will produce a residual plot. To find out what residuals are plotted against let's look at the help information:

> (send bikes :help :plot-residuals)
:PLOT-RESIDUALS
Message args: (&optional x-values)
Opens a window with a plot of the residuals. If X-VALUES are not supplied
the fitted values are used. The plot can be linked to other plots with the
link-views function. Returns a plot object.
NIL
>

Using the expressions

(plot-points travel-space separation)
(send bikes :plot-residuals travel-space)

we can construct two plots of the data as shown in the accompanying figure.

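The help text mentions that the residual plot can be linked to other plots with the link-views function. A minimal sketch of how that might look is given here. It assumes that link-views takes the plot objects to be linked as its arguments, and the names scatter and resid-plot are illustrative only.

```lisp
;; Save the plot objects returned by the two plotting expressions.
;; scatter and resid-plot are hypothetical names chosen for this sketch.
(def scatter (plot-points travel-space separation))
(def resid-plot (send bikes :plot-residuals travel-space))

;; Link the plots so that points selected in one are highlighted in the other.
;; Assumption: link-views accepts the plot objects as its arguments.
(link-views scatter resid-plot)
```

With the plots linked, selecting an unusual point in the residual plot should make it easy to locate the same observation in the scatterplot.
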
The plots both suggest that there is some curvature in the data; this curvature is particularly pronounced in the residual plot if you ignore observation 6 for the moment. To allow for this curvature we might try to fit a model with a quadratic term in travel-space:

> (def bikes2 (regression-model (list travel-space (^ travel-space 2))
                                separation))

Least Squares Estimates:

Constant                 -16.41924   (7.848271)
Variable 0:               2.432667   (0.9719628)
Variable 1:            -0.05339121   (0.02922567)

R Squared:               0.9477923
Sigma hat:               0.5120859
Number of cases:                10
Degrees of freedom:              7

BIKES2
>

I have used the exponentiation function ^ to compute the square of travel-space. Since I am now forming a multiple regression model, the first argument to regression-model is a list of the x variables.

You can proceed in many directions from this point. If you want to calculate Cook's distances for the observations you can first compute internally studentized residuals as

(def studres (/ (send bikes2 :residuals)
                (* (send bikes2 :sigma-hat)
                   (sqrt (- 1 (send bikes2 :leverages))))))

Then Cook's distances are obtained as

> (* (^ studres 2)
     (/ (send bikes2 :leverages) (- 1 (send bikes2 :leverages)) 3))
(0.166673 0.00918596 0.03026801 0.01109897 0.009584418 0.1206654 0.581929 0.0460179 0.006404474 0.09400811)

The seventh entry (observation 6, counting from zero) clearly stands out.

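For reference, the two expressions above correspond to the usual formulas for internally studentized residuals and Cook's distances. The symbols are my own shorthand rather than the original text's: e_i is the i-th residual, h_i the i-th leverage, sigma hat the estimated error standard deviation, and p the number of coefficients. Here p = 3, since bikes2 has a constant and two variables, which is where the divisor 3 in the expression comes from.

```latex
\[
  r_i = \frac{e_i}{\hat\sigma \sqrt{1 - h_i}}, \qquad
  D_i = \frac{r_i^{2}}{p} \cdot \frac{h_i}{1 - h_i}, \qquad p = 3 .
\]
```

For a model with a different number of coefficients the divisor would change accordingly.
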
Another approach to examining residuals for possible outliers is to
use the Bayesian residual plot proposed by Chaloner and Brant [6],
which can be obtained using the message :plot-bayes-residuals.
The expression (send bikes2 :plot-bayes-residuals) produces the plot
in the accompanying figure.