- Path: sparky!uunet!munnari.oz.au!yoyo.aarnet.edu.au!sirius.ucs.adelaide.edu.au!sirius!wvenable
- From: wvenable@algona.stats.adelaide.edu.au (Bill Venables)
- Newsgroups: sci.math.stat
- Subject: Re: Standard Deviation.
- Message-ID: <WVENABLE.92Aug18180002@algona.stats.adelaide.edu.au>
- Date: 18 Aug 92 08:30:02 GMT
- References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <c48nbgtf@csv.warwick.ac.uk>
- Sender: news@ucs.adelaide.edu.au
- Organization: Department of Statistics, University of Adelaide
- Lines: 37
- Nntp-Posting-Host: algona.stats.adelaide.edu.au
- In-reply-to: psrdj@warwick.ac.uk's message of 17 Aug 92 09:35:38 GMT
-
- >>>>> "Glyn" == G M Collis <psrdj@warwick.ac.uk> writes:
-
- Glyn> What intrigues me is that the most elementary stats texts make a big
- Glyn> fuss about using n-1 for an unbiased estimate of the variance, but
- Glyn> ignore the fact that this gives a biased estimate for the SD. I
- Glyn> recall that n - 1.5 is nearer the target for the SD when the sample
- Glyn> is from a normally distributed population. I gather that minimising
- Glyn> the bias when estimating the SD is rather sensitive to the population
- Glyn> distribution - I'd like to know more about this. But my big puzzle
- Glyn> remains - why is the biasedness of the usual SD estimator (with N-1)
- Glyn> so rarely mentioned, in stark contrast to the case of the variance.
-
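Glyn's observation is easy to check by simulation. A minimal sketch in Python (NumPy assumed; not part of the original post): draw many normal samples, and compare the usual n-1 variance estimator with square-root estimates of the SD using divisors n-1 and n-1.5.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 10, 200_000
# samples from N(0, 1), so the true variance and true sd are both 1
x = rng.standard_normal((reps, n))

# sum of squared deviations about each sample mean
ss = ((x - x.mean(axis=1, keepdims=True)) ** 2).sum(axis=1)

var_hat = ss / (n - 1)            # the usual unbiased variance estimator
sd_n1 = np.sqrt(ss / (n - 1))     # square root of the above
sd_n15 = np.sqrt(ss / (n - 1.5))  # Glyn's n - 1.5 divisor

print(var_hat.mean())  # ~1.0: unbiased for the variance
print(sd_n1.mean())    # below 1: biased low for the sd
print(sd_n15.mean())   # much closer to 1 for normal samples
```

The square root is concave, so by Jensen's inequality an unbiased variance estimate necessarily yields a downward-biased SD estimate, which is what the simulation shows.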
- What surprises me is how this quaint little thread got going at all. The
- elementary books are wrong if they make a big issue of unbiasedness, period.
-
- In this context the *two* important quantities are (a) the sum of squares,
- since it is the squared length of the orthogonal projection of the
- observation vector onto the residual space, and (b) the degrees of freedom,
- which is the dimension of the residual space. This latter number is
- sometimes n-1, but more often n-p where p is somewhat larger than 1. These
- two quantities, *separately*, are what you need for virtually all
- inferential procedures, like testing and confidence intervals. Whether you
- divide one by the other to give an estimate of the variance is up to you.
- Incidentally, if you do, it turns out to be unbiased, but "so what?", really.
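The point that the two quantities enter separately can be illustrated with an ordinary regression fit; a sketch in Python (NumPy assumed; the model and names are hypothetical, not from the post). The sum of squares is the squared length of the residual vector, the degrees of freedom n - p is the dimension of the residual space, and dividing one by the other is merely an optional final step.

```python
import numpy as np

# a small linear model y = b0 + b1*x + noise, so p = 2 coefficients
rng = np.random.default_rng(1)
n, p = 30, 2
x = np.linspace(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])   # design matrix, n x p
y = 1.0 + 2.0 * x + 0.5 * rng.standard_normal(n)

# least squares = orthogonal projection of y onto the column space of X
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

ss = resid @ resid   # (a) squared length of the projection onto the residual space
df = n - p           # (b) dimension of the residual space
s2 = ss / df         # divide one by the other if you want a variance estimate

# the residual vector is orthogonal to the model space, as claimed
print(np.abs(X.T @ resid).max())  # ~0
```

Tests and confidence intervals use ss and df directly (through the t and F distributions); forming s2 at the end is a convenience, not a requirement.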
-
- In my opinion statistical inference is all about reliably capturing
- information from data (and elsewhere if you are a Bayesian); it's not
- really about coming up with a number from a data set that you can show will
- be "close" to an unknown parameter value, in some special sense of "close".
- The trouble with many elementary books is that they get hung up on a narrow
- definition of "estimation" and elevate unbiasedness to an importance far in
- excess of what is warranted, while at the same time not mentioning
- sufficiency, say, a far more important concept (though harder to describe,
- of course).
- --
- ___________________________________________________________________________
- Bill Venables, Dept. of Statistics, | Email: venables@stats.adelaide.edu.au
- Univ. of Adelaide, South Australia. | Tel: +61 8 228 5412 Fax: ...232 5670
-