NetNews Usenet Archive 1992 #18

Text File | 1992-08-14 | 2.5 KB | 67 lines

Newsgroups: sci.math.stat
Path: sparky!uunet!cis.ohio-state.edu!magnus.acs.ohio-state.edu!regeorge
From: regeorge@magnus.acs.ohio-state.edu (Robert E George)
Subject: Re: Fwd: Standard Deviation.
Message-ID: <1992Aug14.231916.23479@magnus.acs.ohio-state.edu>
Sender: news@magnus.acs.ohio-state.edu
Nntp-Posting-Host: bottom.magnus.acs.ohio-state.edu
Organization: The Ohio State University
References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <seX2yRq00Uh785H2EB@andre
Date: Fri, 14 Aug 1992 23:19:16 GMT
Lines: 54

In article <seX2yRq00Uh785H2EB@andrew.cmu.edu> Daniel Read <dr3u+@andrew.cmu.edu> writes:
>---------- Forwarded message begins here ----------
>
>Can someone explain why calculating the Standard Deviation (SD),
>for small samples, with (n-1) in the denominator is better than
>doing so with (n) in the denominator?  I'm sure that there's
>a perfectly good reason for doing so.  But we, lowly engineers
>aren't usually told the reason.  Thanks now, for your response later.;-)
>
>RESPONSE (and my own query):
>
>The variance increases as a function of sample size.  That is, a small
>sample will systematically underestimate the population variance (if we
>estimate the population variance with denominator N-1).  Using this
>reduced denominator therefore has the effect of increasing the variance
>estimate.
>
>My question: why does a small sample underestimate the population variance?
>
>daniel read

I'm not certain what "underestimate" means here: if you are speaking
of bias (bias being the difference between the estimand and the expected
value of the estimator), then

           ---   _          _
           \    |  X - Xbar  |2
      T =  /    |_          _|
           ---
           --------------------
                    n

will "underestimate" the population variance *for any sample size n*.
T will always have a negative bias *regardless of how large a sample is
used to compute T*.
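
The claim that T is biased low for every n can be checked exactly on a toy case. The sketch below is an illustration, not part of the original post: it assumes independent draws with replacement from a hypothetical population (the faces of a fair die), and the names biased_variance and expected_T are made up for this example. Under that assumption E[T] = ((n-1)/n) * sigma^2, so the bias is -sigma^2/n: it shrinks as n grows but stays negative.

```python
from fractions import Fraction
from itertools import product


def biased_variance(sample):
    """T from the post: squared deviations from the sample mean, summed, divided by n."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


def expected_T(population, n):
    """Exact E[T] over all equally likely samples of size n drawn with replacement."""
    samples = list(product(population, repeat=n))
    return sum(biased_variance(s) for s in samples) / len(samples)


# Hypothetical population (an assumption for illustration): a fair die.
population = [1, 2, 3, 4, 5, 6]
sigma2 = biased_variance(population)  # population variance with divisor N

for n in (2, 3, 4, 5):
    e_t = expected_T(population, n)
    # Columns: n, exact E[T], exact bias E[T] - sigma^2, and -sigma^2 / n.
    print(n, e_t, e_t - sigma2, -sigma2 / n)
```

The last two columns agree for every n, which is the "negative bias at any sample size" statement in exact arithmetic. Multiplying T by n/(n-1), that is, dividing by n-1 instead of n, removes this bias.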

More intuitively, if we take a very small sample, we are less likely to get
extreme values and so our notion of what the population variance is (note
that I am not proposing some particular estimator) will be unrealistic.

For instance, I give an exam to two students.  Their scores are 67 and 71.
I think, "Gee, there's not a lot of variability in these scores."  But then
8 more students take the exam:

     60   78   100   38   50   88   99   39

and it now is clear there *is* more variability in these scores.

But let me reiterate that T will *always* have a negative bias for the
population variance whatever the sample size is.

Robert George (speaking only for myself)
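
The exam example can be worked through numerically. This is a minimal sketch using only the ten scores quoted in the post; the helper name sum_sq_dev is hypothetical, and the ten scores are treated simply as data, not as a full population.

```python
def sum_sq_dev(scores):
    """Sum of squared deviations from the mean of the given scores."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


first_two = [67, 71]
all_ten = first_two + [60, 78, 100, 38, 50, 88, 99, 39]

for label, scores in (("first two", first_two), ("all ten", all_ten)):
    n = len(scores)
    ss = sum_sq_dev(scores)
    # Compare the divisor-n estimate (T in the post) with the divisor-(n-1) one.
    print(f"{label}: n={n}  SS/n={ss / n:.1f}  SS/(n-1)={ss / (n - 1):.1f}")
```

Both groups happen to have mean 69. The two-score sample gives 4 (divisor n) or 8 (divisor n-1), against roughly 461 and 513 for all ten scores. Switching to n-1 corrects the bias on average over repeated samples; it does not make any single tiny sample reliable.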