- Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!malgudi.oar.net!news.ysu.edu!psuvm!auvm!UNC.BITNET!UPHILG
- Message-ID: <STAT-L%93010516070081@VM1.MCGILL.CA>
- Newsgroups: bit.listserv.stat-l
- Date: Tue, 5 Jan 1993 13:38:00 EST
- Sender: STATISTICAL CONSULTING <STAT-L@MCGILL1.BITNET>
- From: "Philip Gallagher,(919)966-1065" <UPHILG@UNC.BITNET>
- Subject: Spreadsheet Software for statistical computing
- Lines: 63
-
- Dr. Arday wrote that he doesn't believe that the formulae he
- uses from Fleiss and from Kleinbaum, et al., are "wrong", and
- that the professionals who warn against using naive implementations
- of the textbook formulae (in spreadsheet packages) need to
- substantiate their claims.
-
- I think Dr. Arday didn't get the right intelligence (in the military
- sense) from the information in Phil Miller's note. Perhaps it is
- clearer to say that
- ... the textbook formulae are right, but they give wrong
- answers in many implementations, especially implementations
- on digital computers. ...
-
- Even in hand calculations where one has (in principle!) complete
- control over the number of decimal places, naive implementation
- of the textbook formulae can lead to wrong answers in many
- situations; one (with which I am sure you are already familiar)
- is when some of the variables range, say, from 1 to 10, and others
- range from 1,000,000,000,000 to 1,000,000,000,010. Differences
- and proportions (which look great in the closed form formula) can
- cause instant "... can't tell the difference between zero and a
- very small number ..." problems. In first semester courses one is
- taught to "center and scale" such variables into common ranges
- before hauling out the cookbook formula.
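-
- A minimal sketch of that cancellation problem, using the sample
- variance as the "cookbook formula". The example, the choice of
- Python, and the function names are illustrative assumptions, not
- anything taken from Phil Miller's note. Both functions compute the
- same textbook quantity; only the order of operations differs.

```python
def naive_variance(xs):
    """Textbook one-pass form: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    # Two numbers near 1e25 are subtracted here; doubles carry only
    # about 16 significant digits, so the true difference is lost.
    return (sum_x2 - sum_x * sum_x / n) / (n - 1)


def centered_variance(xs):
    """Same quantity, but with the mean subtracted first ("centering")."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)


# Eleven equally spaced values in a small range and in a huge range.
small = [float(k) for k in range(11)]
large = [1.0e12 + k for k in range(11)]

print(naive_variance(small), centered_variance(small))
print(naive_variance(large), centered_variance(large))
```

- On the small values the two forms agree. On the large values the
- centered form still returns 11.0, while the naive form returns
- something that is not remotely close to it.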
-
- I am not sufficiently versed in the details of statistical computing
- to spell out in detail the horrendously more complex problems one
- encounters in many of the more sophisticated matrix manipulations
- (although I have seen Ron Helms write many of them out on the
- blackboard), but I am sure that someone on the list can point to
- one of the better texts on statistical computing. Just the
- choice of WHICH matrix decomposition routine to use for a particular
- task often requires a high-powered consult. (I perceive this as
- a problem quite different from the "approximate" formulae
- Dr. Arday wrote of.)
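-
- One small, standard illustration of why the choice of routine
- matters (my own assumption of a suitable example, not one from the
- blackboard sessions mentioned above): the textbook least-squares
- formula works with X'X, but merely forming X'X in floating point
- can destroy information that X itself still holds. The matrix and
- the function name below are hypothetical, chosen for illustration.

```python
from fractions import Fraction


def gram_determinant(eps):
    """det(X'X) for X = [[1, 1], [eps, 0], [0, eps]].

    Works in whatever arithmetic `eps` carries (float or Fraction).
    """
    a = 1 + eps * eps  # diagonal entries of X'X
    b = 1              # off-diagonal entries of X'X
    return a * a - b * b


eps = 1.0e-8

# In double precision, 1 + eps**2 rounds to exactly 1, so X'X looks singular.
det_float = gram_determinant(eps)

# In exact rational arithmetic the determinant is small but positive.
det_exact = gram_determinant(Fraction(eps))

print(det_float)
print(float(det_exact))
```

- The two columns of X are linearly independent, yet the computed
- X'X is exactly singular. Routines that decompose X directly (QR
- or the singular value decomposition, for instance) never form X'X
- and so avoid this particular loss.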
-
- I close with an inadequate reference to work done by two (or more,
- perhaps) folks in the Washington, D.C., area - one at WESTAT and
- his colleague at ?Census?Labor Statistics? in the early 80s.
- For at least two years in a row (I know, because I attended their
- practice presentations at the Washington Statistical Society) they
- evaluated something like 20-30 "stat packages" for presentation
- at the annual Joint Meetings of the ASA, etc. For some of the
- ill-conditioned data (see the Longley data in the SAS sample
- library, for example) they found that half the packages couldn't
- achieve even two digits of precision. That didn't stop the
- packages from reporting many more digits, as if they were correct. One
- of the packages couldn't even get the first digit right.
-
- The lesson here is that, unless one is using carefully chosen
- textbook datasets with low collinearity and "nice" values, the
- implementations of the textbook formulae are not immaterial.
- If an investigator wishes to avoid the work of learning about
- the complexities of statistical computing, then perhaps he or she
- could consider listening to those who have done their years of
- homework. Hardly anyone can be an expert in everything, after all.
- I suppose this sounds like a flame, although it isn't intended that
- way; I write only because I feel it would be unprofessional to
- permit misconceptions to go without rebuttal.
-
-
- Phil Gallagher
- uphilg@unc
-