- Newsgroups: sci.math.stat
- Path: sparky!uunet!cis.ohio-state.edu!magnus.acs.ohio-state.edu!regeorge
- From: regeorge@magnus.acs.ohio-state.edu (Robert E George)
- Subject: Re: Fwd: Standard Deviation.
- Message-ID: <1992Aug14.231916.23479@magnus.acs.ohio-state.edu>
- Sender: news@magnus.acs.ohio-state.edu
- Nntp-Posting-Host: bottom.magnus.acs.ohio-state.edu
- Organization: The Ohio State University
- References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <seX2yRq00Uh785H2EB@andre
- Date: Fri, 14 Aug 1992 23:19:16 GMT
- Lines: 54
-
-
-
- In article <seX2yRq00Uh785H2EB@andrew.cmu.edu> Daniel Read <dr3u+@andrew.cmu.edu> writes:
- >---------- Forwarded message begins here ----------
- >
- >Can someone explain why calculating the Standard Deviation (SD),
- >for small samples, with (n-1) in the denominator is better than
- >doing so with (n) in the denominator? I'm sure that there's
- >a perfectly good reason for doing so. But we, lowly engineers
- >aren't usually told the reason. Thanks now, for your response later.;-)
- >
- >RESPONSE (and my own query):
- >
- >The variance increases as a function of sample size. That is, a small
- >sample will systematically underestimate the population variance (if we
- >estimate the population variance with denominator N-1). Using this
- >reduced denominator therefore has the effect of increasing the variance
- >estimate.
- >
- >My question: why does a small sample underestimate the population variance?
- >
- >daniel read
-
- I'm not certain what "underestimate" means here: if you are speaking of
- bias (bias being the difference between the estimand and the expected
- value of the estimator), then
-
-     T = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2
-
- will "underestimate" the population variance *for any sample size n*.
- T will always have a negative bias *regardless of how large a sample is
- used to compute T*.
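
The claim about T can be checked exactly on a toy population. The sketch below assumes independent draws with replacement from a fair six-sided die (an illustrative choice that is not in the original post), enumerates every possible sample of size n, and averages T over them. The helper names `biased_variance` and `expected_T` are hypothetical. Under these assumptions the ratio E[T] / sigma^2 should come out as (n-1)/n, which is below 1 for every n and is exactly the factor that dividing by n-1 instead of n removes.

```python
from fractions import Fraction
from itertools import product


def biased_variance(sample):
    """T: mean squared deviation from the sample mean (denominator n)."""
    n = len(sample)
    mean = Fraction(sum(sample), n)
    return sum((x - mean) ** 2 for x in sample) / n


def expected_T(population, n):
    """Exact E[T] over all equally likely samples of size n, drawn with replacement."""
    samples = list(product(population, repeat=n))
    return sum(biased_variance(s) for s in samples) / len(samples)


# Assumed toy population: one roll of a fair die.
population = [1, 2, 3, 4, 5, 6]
sigma2 = biased_variance(population)  # true population variance

for n in (2, 3, 4):
    e_t = expected_T(population, n)
    # The last column is E[T] / sigma^2, which stays below 1 for every n.
    print(n, e_t, e_t / sigma2)
```

Exact fractions are used instead of random simulation so that the result does not depend on a seed or on sampling noise.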
-
- More intuitively, if we take a very small sample, we are less likely to get
- extreme values and so our notion of what the population variance is (note
- that I am not proposing some particular estimator) will be unrealistically low.
- For instance, I give an exam to two students. Their scores are 67 and
- 71. I think, "Gee, there's not a lot of variability in these scores."
- But then 8 more students take the exam:
- 60 78 100 38 50 88 99 39
-
- and it now is clear there *is* more variability in these scores.
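
The exam example can be worked through numerically. This is a minimal sketch using Python's standard `statistics` module, where `pvariance` divides by n (the estimator T above) and `variance` divides by n-1; the variable names are illustrative. It describes one particular sample, so it illustrates the intuition rather than demonstrating bias, which is a statement about the average over many samples.

```python
import statistics

# Scores from the example: the first two students, then the eight who followed.
first_two = [67, 71]
all_ten = first_two + [60, 78, 100, 38, 50, 88, 99, 39]

for label, scores in (("first two", first_two), ("all ten", all_ten)):
    t = statistics.pvariance(scores)   # denominator n
    s2 = statistics.variance(scores)   # denominator n-1
    print(f"{label}: mean={statistics.mean(scores)}, T={t}, s^2={s2:.1f}")
```

Both groups happen to have mean 69. For the first two scores T is 4 and the n-1 version is 8; for all ten, T is 461.4 and the n-1 version is about 512.7.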
-
- But let me reiterate that T will *always* have a negative bias for the
- population variance, whatever the sample size is.
-
- Robert George
- (speaking only for myself)
-