- Newsgroups: sci.math.stat
- Path: sparky!uunet!comp.vuw.ac.nz!cc-server4.massey.ac.nz!TMoore@massey.ac.nz
- From: news@massey.ac.nz (USENET News System)
- Subject: Re: Adjusted r-squared as analogous to sample variance estimator
- Message-ID: <1992Aug21.003148.124@massey.ac.nz>
- Organization: Massey University
- References: <Aug20.212648.51218@yuma.ACNS.ColoState.EDU>
- Date: Fri, 21 Aug 92 00:31:48 GMT
- Lines: 40
-
- In article <Aug20.212648.51218@yuma.ACNS.ColoState.EDU>, mglacy@lamar.ColoState.EDU writes:
- >
- > After following the thread about the appropriate divisor for the
- > sample variance, and the comments about the misconceptions and
- > mistakes concerning it that are propagated in introductory textbooks,
- > I was struck by an inconsistency. While most intro. books make
- > considerable noise about the need to use n-1 to make the sample
- > variance an unbiased estimator, virtually none of them consider
- > the analogous correction to make the sample r-squared an unbiased
- > estimator of the population parameter.
- >
- > Does anyone have any explanation for this inconsistency, other
- > than simple error on the part of textbook writers?
- >
- It's not a case of making considerable noise, but of pointing out why
- the "obvious" divisor is not used. It is not important anyway if only a
- point estimate is required. But, for inference, the t-distribution was
- calculated on the basis of the df as the divisor, so, unless you want to
- use non-standard tables, you have to stick to that.
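As a rough illustration of the point about tables, here is a minimal Python sketch (standard library only) of the one-sample t statistic computed with each divisor. The function name `t_stat`, the `divisor` argument, and the sample values are made up for the illustration. The two statistics differ by the constant factor sqrt(n/(n-1)), so only the n-1 version can be referred to the standard t tables.

```python
import math
import statistics

def t_stat(sample, mu0, divisor="df"):
    """One-sample t statistic; divisor is "df" (n-1) or "n" (hypothetical switch)."""
    n = len(sample)
    xbar = statistics.mean(sample)
    if divisor == "df":
        var = statistics.variance(sample)    # sum of squares / (n - 1)
    else:
        var = statistics.pvariance(sample)   # sum of squares / n
    return (xbar - mu0) / math.sqrt(var / n)

# Illustrative data, not taken from any real study.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
n = len(sample)

t_df = t_stat(sample, 5.0, divisor="df")  # matches standard t tables on n-1 df
t_n = t_stat(sample, 5.0, divisor="n")    # inflated by sqrt(n / (n - 1))

print(t_df, t_n, t_n / t_df, math.sqrt(n / (n - 1)))
```

The ratio printed last shows that switching divisors only rescales the statistic, which is why a point estimate is barely affected while the tabulated critical values no longer apply.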
-
- It is also convenient to have the trouble at this point rather than when
- studying regression. Otherwise there would be an inconsistency: use
- n for single samples, n-p when fitting p location parameters.
-
- Having said that, Moore & McCabe, "Introduction to the Practice of
- Statistics", p. 199, do use n-1 as the divisor when calculating r (strictly
- speaking, when calculating the covariance); they just don't make
- "considerable noise" about it. The same is true of most books.
- The t statistic for testing rho = 0, r sqrt((n-2)/(1-r^2)), relies on
- this definition of r. Of course the n-1 cancels when the correlation is
- written in terms of sums of squares and sums of products,
-   r = sum (x-xbar)(y-ybar) / sqrt(sum (x-xbar)^2 sum (y-ybar)^2),
- so you may not have noticed that it _was_ incorporated into the
- formula.
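The cancellation can be checked numerically. The sketch below (standard library only; the names `pearson_r`, `t_statistic`, and `ddof`, and the data, are invented for the illustration) computes r as covariance over the product of standard deviations, once with divisor n-1 and once with divisor n, and then forms the t statistic quoted above.

```python
import math

def pearson_r(x, y, ddof=1):
    """r = cov(x, y) / (sd_x * sd_y), every term using the divisor n - ddof."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    d = n - ddof
    cov = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / d
    var_x = sum((a - xbar) ** 2 for a in x) / d
    var_y = sum((b - ybar) ** 2 for b in y) / d
    # The common divisor d appears once on top and once (under the root) below.
    return cov / math.sqrt(var_x * var_y)

def t_statistic(r, n):
    """Statistic for testing rho = 0, referred to t on n - 2 df."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

r_n1 = pearson_r(x, y, ddof=1)  # divisor n - 1
r_n = pearson_r(x, y, ddof=0)   # divisor n
t = t_statistic(r_n1, len(x))

print(r_n1, r_n, t)
```

Both calls give the same r up to floating-point rounding, since the divisor drops out of the ratio.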
-
- Incidentally, at least one book on sampling theory uses n as a divisor
- because it leads to simpler formulae when the finite sampling
- correction is taken into account.
-
- Terry Moore
-