NetNews Usenet Archive 1992 #18

Text File | 1992-08-17 | 2.6 KB | 52 lines

Newsgroups: sci.math.stat
Path: sparky!uunet!cs.utexas.edu!torn!news.ccs.queensu.ca!mast.queensu.ca!dmurdoch
From: dmurdoch@mast.queensu.ca (Duncan Murdoch)
Subject: Re: Standard Deviation.
Message-ID: <dmurdoch.43.714062116@mast.queensu.ca>
Keywords: (n) versus (n-1)
Lines: 39
Sender: news@knot.ccs.queensu.ca (Netnews control)
Organization: Queen's University
References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <c48nbgtf@csv.warwick.ac.uk>
Date: Mon, 17 Aug 1992 14:35:16 GMT

In article <c48nbgtf@csv.warwick.ac.uk> psrdj@warwick.ac.uk (G M Collis) writes:
>What intrigues me is that the most elementary stats texts make a big
>fuss about using n-1 for an unbiased estimate of the variance, but ignore
>the fact that this gives a biased estimate for the SD. I recall
>that n - 1.5 is nearer the target for the SD when the sample is
>from a normally distributed population. I gather that minimising
>the bias when estimating the SD is rather sensitive to the population
>distribution - I'd like to know more about this. But my big puzzle
>remains - why is the biasedness of the usual SD estimator (with N-1)
>so rarely mentioned, in stark contrast to the case of the variance.

I think it's just a tradition - introductory statistics texts aren't
supposed to explain things correctly, they're just supposed to present
a cookbook of methods, with handwaving and often incorrect
justifications.

As others have pointed out, getting an unbiased variance estimate is
hardly essential. I have trouble even thinking of a rigged artificial
example where bias of the variance estimate would matter.

In many cases, the standard deviation estimate is just used to give a
rough idea of the variability of a population, e.g. mean +/- s.d., or
the precision of an estimate, e.g. mean +/- s.e. In these cases it
almost never matters whether you use N or N-1; they give essentially
the same answer, unless N is very small.

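To put rough numbers on the bias being discussed, here is a small sketch in Python. It assumes a normal population, for which the expected value of the usual (N-1) standard deviation is sigma times a known constant, conventionally called c4. The function names c4 and expected_sd_ratio are illustrative choices, not anything from the quoted article. The loop prints the ratio E[estimate]/sigma for the divisors N, N-1 and N-1.5 at a few sample sizes.

```python
import math

def c4(n):
    """E[s] / sigma for a normal sample of size n, with s using the N-1 divisor."""
    # lgamma avoids overflow in the gamma-function ratio for large n
    log_ratio = math.lgamma(n / 2) - math.lgamma((n - 1) / 2)
    return math.sqrt(2.0 / (n - 1)) * math.exp(log_ratio)

def expected_sd_ratio(n, divisor):
    """E[sqrt(SS / divisor)] / sigma, where SS is the sum of squared deviations."""
    # sqrt(SS / divisor) is just s rescaled by sqrt((n - 1) / divisor)
    return math.sqrt((n - 1) / divisor) * c4(n)

for n in (2, 5, 10, 30, 100):
    print(n,
          round(expected_sd_ratio(n, n), 4),        # divisor N
          round(expected_sd_ratio(n, n - 1), 4),    # divisor N-1
          round(expected_sd_ratio(n, n - 1.5), 4))  # divisor N-1.5
```

A ratio below 1 means the estimator underestimates sigma on average. All of this depends on the normality assumption; for other populations the constants would differ.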
The other place to use a standard deviation is in the construction of
confidence intervals or tests. There the denominator doesn't matter a
bit: the formula for the CI or test statistic will compensate.

I've argued so far that it doesn't matter whether you use N or N-1.
So, which do I use? Generally N-1, because it tends to make the Normal
theory formulas simpler, e.g. an F test based on the ratio of two
variance estimates depends on the degrees of freedom, not the sample
size. Make it a general rule in doing variance estimates to divide by
degrees of freedom, not sample size, and you'll find you have simpler
formulas to remember.

Duncan Murdoch
dmurdoch@mast.queensu.ca
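The point about test statistics compensating for the denominator can be checked numerically. The sketch below, in Python, computes a one-sample t statistic two ways: from the N-1 standard deviation with the usual sqrt(N) scaling, and from the N standard deviation with sqrt(N-1) scaling. The data values, the hypothesised mean of 5.0, and the function names are made up for illustration.

```python
import math
import statistics

def t_stat_n_minus_1(xs, mu0):
    """t statistic built from the N-1 standard deviation."""
    n = len(xs)
    s = statistics.stdev(xs)      # divides the sum of squares by N-1
    return (statistics.mean(xs) - mu0) / (s / math.sqrt(n))

def t_stat_n(xs, mu0):
    """The same t statistic built from the N standard deviation."""
    n = len(xs)
    s_n = statistics.pstdev(xs)   # divides the sum of squares by N
    # the formula compensates: the standard error uses sqrt(N-1) instead
    return (statistics.mean(xs) - mu0) / (s_n / math.sqrt(n - 1))

data = [4.1, 5.3, 3.8, 6.0, 5.1, 4.7]   # made-up sample
print(t_stat_n_minus_1(data, 5.0), t_stat_n(data, 5.0))
```

Both versions reduce to the same expression in the sum of squared deviations, so the two printed values should agree up to floating-point rounding. The N-1 form is the one whose formula matches the degrees of freedom directly.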