- Path: sparky!uunet!usc!rpi!bu.edu!jade.tufts.edu!news.tufts.edu!sage.hnrc.tufts.edu!jerry
- From: jerry@ginger.hnrc.tufts.edu (Jerry Dallal)
- Newsgroups: sci.math.stat
- Subject: Re: Fwd: Standard Deviation.
- Message-ID: <1992Aug15.172908.298@ginger.hnrc.tufts.edu>
- Date: 15 Aug 92 22:29:07 GMT
- References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <seX2yRq00Uh785H2EB@andrew.cmu.edu>
- Organization: USDA HNRC at Tufts University
- Lines: 31
-
- In article <seX2yRq00Uh785H2EB@andrew.cmu.edu>, dr3u+@andrew.cmu.edu (Daniel Read) writes:
- >
- > Can someone explain why calculating the Standard Deviation (SD),
- > for small samples, with (n-1) in the denominator is better than
- > doing so with (n) in the denominator?
-
-
- I'm totally confused by some of the answers to this question.
- Using n-1 in the denominator gives an unbiased estimate of the variance.
-
-   E (sum[(xi-xbar)**2]) = E (sum[(xi-mu)**2]) - E (sum[(mu-xbar)**2])
-                         = E (sum[(xi-mu)**2]) - n E[(mu-xbar)**2]
-                         = n sigma**2 - n (sigma**2 / n)
-                         = (n-1) sigma**2
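-
- As a sanity check on the identity above, here is a minimal sketch that
- computes E(sum[(xi-xbar)**2]) exactly by enumerating every equally likely
- sample of size n. The fair-die population, n = 3, and the function names are
- my own assumptions for illustration; nothing here comes from a library
- beyond the Python standard library.

```python
from fractions import Fraction
from itertools import product


def population_variance(values):
    """Variance of a population whose values are all equally likely."""
    mu = Fraction(sum(values), len(values))
    return sum((Fraction(x) - mu) ** 2 for x in values) / len(values)


def expected_sum_of_squares(values, n):
    """Exact E[sum (x_i - xbar)^2] over all samples of size n.

    Samples are drawn independently (with replacement), so every tuple
    produced by itertools.product is equally likely.
    """
    total = Fraction(0)
    count = 0
    for sample in product(values, repeat=n):
        xbar = Fraction(sum(sample), n)
        total += sum((Fraction(x) - xbar) ** 2 for x in sample)
        count += 1
    return total / count


values = [1, 2, 3, 4, 5, 6]  # hypothetical population: one fair die
n = 3                        # hypothetical sample size

sigma2 = population_variance(values)
ess = expected_sum_of_squares(values, n)

print(ess, (n - 1) * sigma2)   # both 35/6
print(ess / (n - 1), sigma2)   # dividing by n-1 recovers sigma**2 = 35/12
print(ess / n)                 # dividing by n gives 35/18, which is too small
```

- Exact fractions are used so that the comparison is an equality rather than
- a simulation that only agrees to a few decimal places.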
-
- Of course, there's nothing magic about unbiasedness. You might prefer a
- maximum likelihood estimate (n in the denominator) or a minimum MSE estimator
- (n+1 in the denominator). The reason n-1 is used is that you have to use
- *something* and n-1 makes the cleanest formulas.
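-
- To see where the n+1 comes from, here is a rough sketch that compares the
- mean squared error of SS/k for a few divisors k, where SS is
- sum[(xi-xbar)**2]. It assumes a normal sample, for which
- E(SS) = (n-1) sigma**2 and Var(SS) = 2 (n-1) sigma**4. The function name and
- n = 10 are assumptions chosen for illustration.

```python
from fractions import Fraction


def mse_over_sigma4(n, k):
    """MSE of SS/k as an estimate of sigma**2, in units of sigma**4.

    Assumes a normal sample of size n, so that
    E[SS] = (n-1) sigma**2 and Var[SS] = 2 (n-1) sigma**4.
    """
    variance = Fraction(2 * (n - 1), k * k)  # Var[SS/k] / sigma**4
    bias = Fraction(n - 1, k) - 1            # (E[SS/k] - sigma**2) / sigma**2
    return variance + bias ** 2


n = 10  # hypothetical sample size

for k in (n - 1, n, n + 1, n + 2):
    print(k, float(mse_over_sigma4(n, k)))

# Search a range of integer divisors for the one with the smallest MSE.
best = min(range(1, 3 * n), key=lambda k: mse_over_sigma4(n, k))
print(best)  # 11, i.e. n + 1
```

- The n-1 divisor has zero bias but the largest MSE of the three usual
- choices; n+1 trades a little bias for less variance.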
-
- For example, both Lindgren and Freedman, Pisani, and Purves define their SD
- with an n in the denominator. This means that their formula for CIs involving
- the t distribution becomes
-
-     xbar +- c s / sqrt(n-1)
- instead of the usual
-     xbar +- c s / sqrt(n)
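-
- The two interval formulas give the same numbers once each is paired with
- its own definition of s. A small sketch, using made-up data and the Python
- standard library's statistics.pstdev (n in the denominator) and
- statistics.stdev (n-1 in the denominator); the function names are
- hypothetical, and c = 2.776 is the approximate two-sided 95% t critical
- value for 4 degrees of freedom:

```python
import math
import statistics


def half_width_n_divisor(data, c):
    """c * s / sqrt(n-1), with s computed using n in the denominator."""
    s = statistics.pstdev(data)
    return c * s / math.sqrt(len(data) - 1)


def half_width_n_minus_1_divisor(data, c):
    """c * s / sqrt(n), with s computed using n-1 in the denominator."""
    s = statistics.stdev(data)
    return c * s / math.sqrt(len(data))


data = [2.1, 3.4, 1.9, 5.0, 4.2]  # made-up sample, n = 5
c = 2.776                         # approximate t critical value, 4 df, 95%

xbar = statistics.mean(data)
w1 = half_width_n_divisor(data, c)
w2 = half_width_n_minus_1_divisor(data, c)

print(xbar - w1, xbar + w1)
print(xbar - w2, xbar + w2)  # same interval, up to floating-point rounding
```

- Both half-widths reduce to c * sqrt(SS / (n (n-1))), so the choice of
- denominator only moves a factor between s and the square root.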
-
- (Note: All of this assumes a whole bunch of qualifications and conditions that
- I haven't stated explicitly, beginning with the fact that the variance should
- exist!)
-