FORGET-IT
Tukey's Forget-it Plots
Version 1.0
November 4, 1985
Gerard E. Dallal
USDA Human Nutrition Research Center on Aging
at Tufts University
711 Washington Street
Boston, MA 02111
and
Tufts University School of Nutrition
132 Curtis Street
Medford, MA 02155
NOTICE
Documentation and original code copyright 1985 by Gerard E.
Dallal. Reproduction of material for non-commercial purposes
is permitted, without charge, provided that suitable
reference is made to FORGET-IT and its author.
Neither FORGET-IT nor its documentation should be modified in
any way without permission from the author, except for those
changes that are essential to move FORGET-IT to another
computer.
DESCRIPTION
Forget-it plots (a.k.a. two-way plots) were introduced by
Tukey (1970, chapter 16) as a graphical technique for
representing the interaction structure in a two-way table.
CONSTRUCTION
(The reader should refer to the example at the end of this
document while reading this discussion. Alternate
descriptions can be found in Mosteller and Tukey (1977,
chapter 9) and Tukey (1977, chapter 11).)
Let {w(i,j): i=1,...,r; j=1,...,c} be a two-way table fitted
by the model
     E w(i,j) = mu + alpha(i) + beta(j) .
Permute the rows in order of decreasing row effects and the
columns in order of increasing column effects. Let U be an
estimate of mu, A(i) be an estimate of alpha(i) and B(j) be an
estimate of beta(j) for the ORDERED table, so that A(1) is the
largest row effect and B(1) is the smallest column effect.
(The program computes least squares estimates subject to the
usual constraints, but this is immaterial to the plotting
process.)
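For a complete table, the least squares estimates under the usual
sum-to-zero constraints reduce to the grand mean and the deviations of
the row and column means from it. The following Python sketch is not the
program's own code; the function names are illustrative. It uses the
table from the EXAMPLE section of this document.

```python
def fit_additive(table):
    """Least squares fit of w(i,j) = mu + alpha(i) + beta(j), complete table."""
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_eff = [sum(row) / c - grand for row in table]
    col_eff = [sum(table[i][j] for i in range(r)) / r - grand for j in range(c)]
    return grand, row_eff, col_eff


def order_of(effects, decreasing):
    """0-based indices that put the effects in the requested order."""
    return sorted(range(len(effects)), key=lambda k: effects[k], reverse=decreasing)


# Rows: +E+SE, -E+SE, +E-SE, -E-SE; columns: young, old, aged.
table = [
    [10.6, 12.6, 13.3],
    [2.7, 8.7, 6.0],
    [9.6, 13.8, 14.2],
    [3.5, 8.5, 2.8],
]
U, A, B = fit_additive(table)
row_order = order_of(A, decreasing=True)    # rows by decreasing effect
col_order = order_of(B, decreasing=False)   # columns by increasing effect
print(U, A, B, row_order, col_order)
```

The orderings come out 0-based, so row_order [2, 0, 1, 3] corresponds to
rows 3, 1, 2, 4 in the program's listing.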
Horizontal (X) and vertical (Y) axes are set down and the two
sets of line segments
     {Y = U + 2*A(i) - A(1) + B(1) + X,
        for (A(1)-A(i)) .LE. X .LE. (A(1)-A(i)) + (B(c)-B(1)),
        i=1,...,r}
     {Y = U + A(1) + 2*B(j) - B(1) - X,
        for (B(j)-B(1)) .LE. X .LE. (B(j)-B(1)) + (A(1)-A(r)),
        j=1,...,c}
are drawn. The ordinates of the points of intersection
correspond to the fitted values since they satisfy
Y = U + A(i) + B(j).
Observed table entries are plotted directly above or below
the corresponding fitted values. The horizontal axis is no
longer needed so "forget it".
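The claim about the points of intersection can be checked numerically.
The sketch below is in Python with made-up effects (U, A and B are
illustrative values, already ordered as described above). It intersects
every row line with every column line and confirms that the ordinate is
U + A(i) + B(j) and that the abscissa lies inside both segments.

```python
def intersection(U, A, B, i, j):
    """Crossing point of the line for row i and the line for column j (0-based)."""
    # Equate the two line equations and solve for X.
    x = (A[0] - A[i]) + (B[j] - B[0])
    y_row = U + 2 * A[i] - A[0] + B[0] + x
    y_col = U + A[0] + 2 * B[j] - B[0] - x
    return x, y_row, y_col


# Hypothetical ordered effects: A decreasing, B increasing.
U = 10.0
A = [3.0, 1.0, -4.0]
B = [-2.0, 0.5, 1.5]

checks = []
for i in range(len(A)):
    for j in range(len(B)):
        x, y_row, y_col = intersection(U, A, B, i, j)
        on_row = A[0] - A[i] <= x <= (A[0] - A[i]) + (B[-1] - B[0])
        on_col = B[j] - B[0] <= x <= (B[j] - B[0]) + (A[0] - A[-1])
        fitted = U + A[i] + B[j]
        checks.append(on_row and on_col
                      and abs(y_row - fitted) < 1e-9
                      and abs(y_col - fitted) < 1e-9)
print(all(checks))
```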
FORGET-IT G.E. Dallal
ROUND-OFF ERROR
The greatest problem in constructing Forget-it Plots on a
discrete printing device is the error introduced by rounding
when placing observed and fitted values in their proper
locations. For example, the points (0,0), (2,2.3), (0,1.4),
(2,3.7) are the vertices of a parallelogram, but this is no
longer true if the ordinates are rounded to the nearest
integer. Thus, on a discrete plotting device we cannot
simultaneously have
     i.   all lines of the plot straight,
     ii.  arbitrary plot size, and
     iii. all fitted values rounded to the appropriate
          location along the vertical axis.
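The parallelogram example above can be verified with a few lines of
Python. This is only an illustration; the helper name is made up, and a
tolerance is used because 3.7 - 1.4 is not exactly 2.3 in floating
point.

```python
import math


def is_parallelogram(p, q, r, s):
    """True if the side p->q equals the side r->s as a vector."""
    return (math.isclose(q[0] - p[0], s[0] - r[0], abs_tol=1e-9)
            and math.isclose(q[1] - p[1], s[1] - r[1], abs_tol=1e-9))


points = [(0, 0), (2, 2.3), (0, 1.4), (2, 3.7)]
# Round only the ordinates, as a discrete printing device would.
rounded = [(x, round(y)) for x, y in points]

before = is_parallelogram(*points)
after = is_parallelogram(*rounded)
print(before, after)
```

After rounding, the two sides run from (0,0) to (2,2) and from (0,1) to
(2,4), so they are no longer parallel.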
A compromise is needed. The fitted value for w(1,1) is
rounded into the appropriate location. The quantities
{A(i)-A(1): i=1,...,r} and {B(j)-B(1): j=1,...,c} are rounded
to the appropriate number of units of shift along the vertical
axis. The r lines containing the fitted values of the r rows
are drawn as southwest to northeast line segments of length
B(c)-B(1), the i-th starting at ordinate
w(1,1)(fitted)-(A(1)-A(i)) and abscissa A(1)-A(i). The c lines
containing the fitted values of the c columns are drawn as
northwest to southeast line segments of length A(1)-A(r), the
j-th starting at ordinate w(1,1)(fitted)+(B(j)-B(1)) and
abscissa B(j)-B(1).
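A minimal Python sketch of this compromise follows. It is an
illustration only, under several assumptions: the effects are made up,
"rounding" means round to nearest with halves away from zero, and the
vertical position is counted in print lines above a chosen origin. The
program's own rounding convention may differ, so this is not expected to
reproduce its printed positions.

```python
import math


def round_half_away(x):
    """Nearest integer, halves away from zero (an assumed convention)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def fitted_positions(U, A, B, unit, origin):
    """Line number of each fitted value: rounded w(1,1) plus rounded shifts."""
    base = round_half_away((U + A[0] + B[0] - origin) / unit)
    row_shift = [round_half_away((a - A[0]) / unit) for a in A]
    col_shift = [round_half_away((b - B[0]) / unit) for b in B]
    return [[base + rs + cs for cs in col_shift] for rs in row_shift]


# Hypothetical ordered effects and vertical unit.
U, A, B = 10.0, [3.0, 1.0, -4.0], [-2.0, 0.5, 1.5]
unit = 0.3
origin = U + A[-1] + B[0]          # smallest fitted value sits on line 0

pos = fitted_positions(U, A, B, unit, origin)
# Rounding one fitted value on its own can land on a different line.
direct = round_half_away((U + A[1] + B[1] - origin) / unit)
print(pos, direct)
```

Because every position is the same base plus one row shift plus one
column shift, the spacing between columns is identical on every row
line, which is what keeps the lines straight. The price is that an
individual fitted value may sit one line away from where direct rounding
would put it (here line 24 rather than 25).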
If the observed values are simply rounded and placed in the
diagram, small residuals can appear with their signs changed.
Hence, the observations are perturbed as follows: The true
residuals are calculated and rounded to the appropriate units
of shift along the vertical axis. The observed values are
then placed this many units from the fitted values as
plotted. While neither the observed nor fitted values can be
determined exactly from their positions along the axis, their
relative locations are correct. This compromise also means
that it is possible for two observed values to have their
relative order reversed in the plot. That is, when comparing
two observed values, the one appearing to be slightly larger
according to its plotting position may, in fact, be slightly
smaller. As the size of the plot increases the chance of
this happening decreases.
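The residual step can be sketched in Python as follows. The numbers are
taken from the EXAMPLE section of this document: the grand mean and
effects as printed, and a vertical unit of 0.2975, which is the spacing
of the axis labels in the example plot. The rounding convention (nearest,
halves away from zero) and the function names are assumptions.

```python
import math


def round_half_away(x):
    """Nearest integer, halves away from zero (an assumed convention)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def residual_shift(observed, fitted, unit):
    """Lines between the plotted fitted value and the plotted observation."""
    return round_half_away((observed - fitted) / unit)


U = 8.8583
unit = 0.2975
# (observed value, fitted value U + A(i) + B(j)) for three cells.
cells = {
    ("+E-SE", "old"): (13.8, U + 3.6750 + 2.0417),
    ("+E+SE", "young"): (10.6, U + 3.3083 - 2.2583),
    ("-E+SE", "aged"): (6.0, U - 3.0583 + 0.21667),
}
shifts = {key: residual_shift(obs, fit, unit) for key, (obs, fit) in cells.items()}
print(shifts)
```

A shift is never of the opposite sign to its residual; at worst a small
residual is plotted as zero, as happens for the -E+SE, aged cell.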
OPERATION
The program is essentially driven by prompts. The user has
the option of entering data from a file of the form:
Data, one row at a time. A row can span multiple
records and each row need not occupy the same number of
records, but each row must start on a new record. Then
come row labels followed by column labels (up to five
characters each), one per record starting in column one.
If labels are not present, the program will prompt for
them.
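A reader for this file layout might look like the Python sketch below.
It is not part of FORGET-IT. It assumes that the numbers are separated
by blanks and that the table dimensions r and c are known in advance;
neither point is stated above. The function name is hypothetical.

```python
def parse_forget_file(lines, r, c):
    """Read r rows of c values, then optional row and column labels."""
    records = iter(lines)
    table = []
    for i in range(r):
        row = []
        # A row may span several records but always starts on a new one.
        while len(row) < c:
            record = next(records, None)
            if record is None:
                raise ValueError(f"file ended inside row {i + 1}")
            row.extend(float(tok) for tok in record.split())
        if len(row) != c:
            raise ValueError(f"row {i + 1} has {len(row)} values, expected {c}")
        table.append(row)
    # Labels: one per record, starting in column one, at most five characters.
    labels = [rec.rstrip("\r\n")[:5].rstrip() for rec in records]
    labels = [lab for lab in labels if lab]
    if not labels:
        return table, None, None        # the caller must ask for labels
    if len(labels) != r + c:
        raise ValueError("expected one label per row and per column")
    return table, labels[:r], labels[r:]


sample = """10.6 12.6
13.3
2.7 8.7 6.0
+E+SE
-E+SE
young
old
aged
""".splitlines()
table, row_labels, col_labels = parse_forget_file(sample, r=2, c=3)
print(table, row_labels, col_labels)
```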
REFERENCES
Mosteller, Frederick and John W. Tukey (1977). Data Analysis
and Regression. Reading, MA: Addison-Wesley Publishing
Co.
Tukey, John W. (1970). Exploratory Data Analysis: Volume II.
Limited Preliminary Edition. Reading, MA: Addison-Wesley
Publishing Co.
Tukey, John W. (1977). Exploratory Data Analysis. Reading,
MA: Addison-Wesley Publishing Co.
EXAMPLE
The following example involves alpha-tocopherol levels in the
cerebellum of rats fed on one of four diets (SE = selenium; E
= Vitamin E; + = rich; - = deficient):
                young       old      aged
     +E+SE     10.600    12.600    13.300
     -E+SE     2.7000    8.7000    6.0000
     +E-SE     9.6000    13.800    14.200
     -E-SE     3.5000    8.5000    2.8000
Enter the size of the plot: 41
GRAND MEAN...   8.8583
ROW EFFECTS...
     1:    3.3083     +E+SE
     2:   -3.0583     -E+SE
     3:    3.6750     +E-SE
     4:   -3.9250     -E-SE
COLUMN EFFECTS...
     1:   -2.2583     young
     2:    2.0417     old
     3:    0.21667    aged
ORDER OF DECREASING ROW EFFECTS:
     3:    3.6750     +E-SE
     1:    3.3083     +E+SE
     2:   -3.0583     -E+SE
     4:   -3.9250     -E-SE
ORDER OF INCREASING COLUMN EFFECTS:
     1:   -2.2583     young
     3:    0.21667    aged
     2:    2.0417     old
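The vertical axis labels in the plot for this example are consistent
with a unit equal to the range of the fitted values divided by (plot
size - 1). That rule is an inference from the printed output, not
something the documentation states. A short Python check, using the
effects as printed:

```python
def axis_labels(lowest, highest, size):
    """Vertical unit and the labels from the top line down to the bottom line."""
    unit = (highest - lowest) / (size - 1)
    return unit, [highest - k * unit for k in range(size)]


U = 8.8583
highest = U + 3.6750 + 2.0417     # largest fitted value: +E-SE, old
lowest = U - 3.9250 - 2.2583      # smallest fitted value: -E-SE, young
unit, labels = axis_labels(lowest, highest, size=41)
print(unit, labels[0], labels[1], labels[-1])
```

This gives a unit of 0.2975 and labels running from 14.575 down to
2.675, matching the axis of the plot to the precision printed.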
14.575 |                 0  +E-SE
14.278 |           X    /|0  +E+SE
13.980 |           |   / ||\
13.683 |           |  / /X| \
13.385 |           |X/ /  |  \
13.088 |           || /   |   \
12.790 |           0|/    X    \
12.493 |          / 0           \
12.195 |         / / \           \
11.898 |        / /   \           \
11.600 |       / /     \           \
11.303 |      / /       \           \
11.005 |     / /         \           \
10.708 |    / /           \           \
10.410 |   X /             \           \
10.113 |  0|/               \           \
9.8150 |  |0                 \           \
9.5175 |  X \                 \           \
9.2200 |     \                 \           \
8.9225 |      \                 \           \  X
8.6250 |       \                 \           \ |  X
8.3275 |        \                 \           \|  |
8.0300 |         \                 \           0  |  -E+SE
7.7325 |          \                 \         / \ |
7.4350 |           \                 \       /   \|
7.1375 |            \                 \     /     0  -E-SE  old
6.8400 |             \                 \   /     /
6.5425 |              \                 \ /     /
6.2450 |               \                 X     /
5.9475 |                \               / \   /
5.6500 |                 \             /   \ /
5.3525 |                  \           /     0  aged
5.0550 |                   \         /     /|
4.7575 |                    \       /     / |
4.4600 |                     \     /     /  |
4.1625 |                      \   /     /   |
3.8650 |                       \ /     /    |
3.5675 |                        0  X  /     |
3.2700 |                        |\ | /      |
2.9725 |                        | \|/       X
2.6750 |                        X  0  young