- Path: sparky!uunet!dtix!darwin.sura.net!ukma!rutgers!news.cs.indiana.edu!umn.edu!thompson
- From: thompson@atlas.socsci.umn.edu (T. Scott Thompson)
- Newsgroups: sci.math.stat
- Subject: Re: Fwd: Standard Deviation.
- Message-ID: <thompson.714071949@daphne.socsci.umn.edu>
- Date: 17 Aug 92 17:19:09 GMT
- References: <1992Aug14.172833.11844@cbfsb.cb.att.com> <seX2yRq00Uh785H2EB@andre <1992Aug14.231916.23479@magnus.acs.ohio-state.edu> <1992Aug16.212245.27577@mailhost.ocs.mq.edu.au> <1992Aug16.225926.497@massey.ac.nz>
- Sender: news@news2.cis.umn.edu (Usenet News Administration)
- Reply-To: thompson@atlas.socsci.umn.edu
- Organization: University of Minnesota
- Lines: 61
- Nntp-Posting-Host: daphne.socsci.umn.edu
-
- news@massey.ac.nz (USENET News System) writes:
-
- >In article <1992Aug16.212245.27577@mailhost.ocs.mq.edu.au>, wskelly@laurel.ocs.mq.edu.au (William Skelly) writes:
- >>
- >> This and other postings indicate that there is a relationship between
- >> sample size and estimated variance (of the population) which is
- >> positive and always an underestimate. What is the limit, or point
- >> at which an increasing sample size no longer improves the estimate
- >> of the population variance?
- >>
- >When the sample size is equal to the population size (never for an
- >infinite population).
-
- Not necessarily true. It depends on the sample design. Suppose that
- we have a finite population of size N and we draw _with_replacement_ a
- sample of size n using independent draws. Then the bias from using
- the "n" denominator is (minus)
-
- <population variance> / n
-
- and this is true for _any_ n, including n = N, or even n = 2N.
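
The -<population variance>/n bias can be checked exactly on a toy case by
enumerating every equally likely with-replacement sample. This is only a
sketch: the population [1, 2, 6] and the function names are made up for
illustration, and "population variance" is taken to mean the N-denominator
variance of the finite population.

```python
from fractions import Fraction
from itertools import product


def population_variance(pop):
    """Variance of the finite population, using the N denominator."""
    N = len(pop)
    mean = Fraction(sum(pop), N)
    return sum((Fraction(x) - mean) ** 2 for x in pop) / N


def expected_n_denominator_variance(pop, n):
    """Exact expectation of the 'n'-denominator sample variance over all
    len(pop)**n equally likely samples drawn with replacement."""
    total = Fraction(0)
    count = 0
    for sample in product(pop, repeat=n):
        mean = Fraction(sum(sample), n)
        total += sum((Fraction(x) - mean) ** 2 for x in sample) / n
        count += 1
    return total / count


population = [1, 2, 6]  # hypothetical population, N = 3
sigma2 = population_variance(population)

# n = N = 3 and n = 2N = 6 are included on purpose.
for n in (2, 3, 6):
    bias = expected_n_denominator_variance(population, n) - sigma2
    print(n, bias, -sigma2 / n)
```

Exact rational arithmetic is used so that the two printed columns can be
compared without worrying about rounding.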
-
- >> Can this be tested by taking samples of sample
-
- >Yes but we can work out the theory so it isn't necessary.
-
- >> (the latter sample
- >> being elevated to the status of population)?
- >>
-
- Perhaps we can work out the theory for the bias, but this is not so
- clear when we consider other features of the distribution of the
- estimate. See the related questions about the bias of the usual
- SD estimate in this thread for an example.
-
- The intuition in the original comment was right on!
-
- In fact, the procedure of taking samples from the original sample,
- "elevating" the original sample to the status of population, is
- _exactly_ the definition of (a particular variety of) bootstrap
- resampling. See for example the book _The_Bootstrap_and_Edgeworth_
- _Expansion_, by Peter Hall (Springer-Verlag, 1992), which provides
- many references and examples.
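
A minimal sketch of that resampling idea, assuming the plain nonparametric
bootstrap (draw n values with replacement from the observed sample, recompute
the statistic, repeat B times). The data, the function names, B and the seed
are all hypothetical choices, not anything taken from Hall's book.

```python
import random


def bootstrap_replicates(sample, statistic, B=1000, seed=0):
    """Treat `sample` as the population: draw B resamples of the same size
    with replacement and return the statistic computed on each one."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistic(rng.choices(sample, k=n)) for _ in range(B)]


def variance_n(xs):
    """Variance estimate with the 'n' denominator."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n


# Made-up, visibly non-normal data for illustration only.
sample = [0.3, 0.5, 0.9, 1.1, 1.4, 2.0, 2.2, 3.1, 4.8, 9.7]

reps = bootstrap_replicates(sample, variance_n)

# Bootstrap estimate of the bias of variance_n: the average replicate minus
# the value on the original sample (which plays the role of the truth).
boot_bias = sum(reps) / len(reps) - variance_n(sample)
print(boot_bias)
```

The same list of replicates can be used to look at other features of the
estimate's distribution (spread, skewness, quantiles), which is where the
theory is harder to work out by hand.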
-
- It is shown in the bootstrap literature that for certain purposes
- (e.g. constructing statistical tests when the sample is drawn from a
- non-normal population) the bootstrap procedure can outperform
- (sometimes significantly) the traditional procedures.
-
- So, for example, if the reason you are estimating a variance is to put
- its square root in the denominator of a t-statistic, then bootstrapping
- _may_ be the way to go (unless you really believe in the normality of
- your population). For this application, you really aren't interested
- in whether or not your estimate of the variance is biased, and so the
- n vs. n-1 debate is irrelevant. In fact, the bootstrap-based test
- procedure will lead to the same test whether you choose n _or_ n-1, an
- invariance property that I find appealing.
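
One way to see the invariance, sketched under the assumption that the test is
a symmetric bootstrap-t test of a mean (the post does not say which variant is
meant). Switching between n and n-1 multiplies the observed t-statistic and
every resampled t-statistic by the same constant, sqrt(n/(n-1)), so the
comparison against the bootstrap critical value comes out the same. The data,
names, B, alpha and seed below are hypothetical.

```python
import math
import random


def t_stat(xs, center, ddof):
    """(mean - center) / (sd / sqrt(n)), with sd using denominator n - ddof."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)
    sd = math.sqrt(ss / (n - ddof))
    return (mean - center) / (sd / math.sqrt(n))


def bootstrap_t_rejects(sample, mu0, ddof, B=999, alpha=0.05, seed=0):
    """Symmetric bootstrap-t test of H0: mean == mu0. Returns True to reject."""
    rng = random.Random(seed)
    n = len(sample)
    xbar = sum(sample) / n
    t_obs = abs(t_stat(sample, mu0, ddof))
    t_star = []
    for _ in range(B):
        res = rng.choices(sample, k=n)
        if max(res) == min(res):
            continue  # zero spread: t is undefined, skip this resample
        # Resampled statistics are centred at the original sample mean.
        t_star.append(abs(t_stat(res, xbar, ddof)))
    t_star.sort()
    crit = t_star[math.ceil((1 - alpha) * len(t_star)) - 1]
    return t_obs > crit


# Made-up skewed data for illustration only.
sample = [0.3, 0.5, 0.9, 1.1, 1.4, 2.0, 2.2, 3.1, 4.8, 9.7]

for mu0 in (0.0, 1.0, 2.5, 5.0):
    with_n = bootstrap_t_rejects(sample, mu0, ddof=0)          # "n"
    with_n_minus_1 = bootstrap_t_rejects(sample, mu0, ddof=1)  # "n-1"
    print(mu0, with_n, with_n_minus_1)
```

Both calls use the same seed, so they see identical resamples; the only
difference is the denominator inside the standard deviation.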
-
- T. Scott Thompson thompson@atlas.socsci.umn.edu
- Dept. of Economics
- University of Minnesota
-