NetNews Usenet Archive 1992 #18


Text File | 1992-08-20 | 2.4 KB | 51 lines

Newsgroups: sci.math.stat
Path: sparky!uunet!comp.vuw.ac.nz!cc-server4.massey.ac.nz!TMoore@massey.ac.nz
From: news@massey.ac.nz (USENET News System)
Subject: Re: Adjusted r-squared as analogous to sample variance estimator
Message-ID: <1992Aug21.003148.124@massey.ac.nz>
Organization: Massey University
References: <Aug20.212648.51218@yuma.ACNS.ColoState.EDU>
Date: Fri, 21 Aug 92 00:31:48 GMT
Lines: 40

In article <Aug20.212648.51218@yuma.ACNS.ColoState.EDU>, mglacy@lamar.ColoState.EDU writes:
>
> After following the thread about the appropriate divisor for the
> sample variance, and the comments about the misconceptions and
> mistakes concerning it that are propagated in introductory textbooks,
> I was struck by an inconsistency. While most intro. books make
> considerable noise about the need to use n-1 to make the sample
> variance an unbiased estimator, virtually none of them consider
> the analogous correction to make the sample r-squared an unbiased
> estimator of the population parameter.
>
> Does anyone have any explanation for this inconsistency, other
> than simple error on the part of textbook writers?
>

It's not a case of making considerable noise, but of pointing out
why the "obvious" divisor, n, is not used. The choice is not important
anyway if only a point estimate is required. But, for inference, the
t-distribution was calculated on the basis of the df as the divisor,
so, unless you want to use non-standard tables, you have to stick to
that. It is also convenient to deal with the trouble at this point
rather than when studying regression. Otherwise there would be an
inconsistency: use n for single samples, but n-p when fitting p
location parameters.

Having said that, Moore & McCabe, "Introduction to the Practice of
Statistics", p. 199, do use n-1 as the divisor when calculating r or,
strictly speaking, when calculating the covariance. (They just don't
make "considerable noise" about it.) (This is also true of most books.)
The t-test statistic for testing rho = 0, r sqrt((n-2)/(1-r^2)),
relies on this definition of r. Of course the n-1 cancels when the
correlation is written in terms of sums of squares and sums of
products,

    r = sum((x-xbar)(y-ybar)) / sqrt(sum((x-xbar)^2) sum((y-ybar)^2))

so you may not have noticed that it _was_ incorporated into the formula.

Incidentally, at least one book on sampling theory uses n as a divisor
because it leads to simpler formulae when the finite sampling
correction is taken into account.

Terry Moore
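
The cancellation described above can be checked numerically. The following is a minimal sketch in Python, not part of the original post. The function names `correlation` and `t_statistic`, the `divisor_offset` parameter, and the small data set are all hypothetical and chosen only for illustration. It computes r from the covariance and variances using either n-1 or n as the divisor, and then forms the t statistic r sqrt((n-2)/(1-r^2)).

```python
import math


def correlation(xs, ys, divisor_offset=1):
    """Pearson r built from covariance and variances.

    The divisor is n - divisor_offset: offset 1 gives the usual n-1
    divisor, offset 0 gives the "obvious" divisor n. (Hypothetical
    helper for illustration only.)
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    d = n - divisor_offset
    cov = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / d
    var_x = sum((x - xbar) ** 2 for x in xs) / d
    var_y = sum((y - ybar) ** 2 for y in ys) / d
    # The divisor d appears once in the numerator and as sqrt(d * d)
    # in the denominator, so it cancels algebraically.
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing rho = 0, on n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Made-up illustrative data, not taken from the post.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

r_n_minus_1 = correlation(xs, ys, divisor_offset=1)
r_n = correlation(xs, ys, divisor_offset=0)

print("r with divisor n-1:", r_n_minus_1)
print("r with divisor n:  ", r_n)
print("t statistic:       ", t_statistic(r_n_minus_1, len(xs)))
```

The two values of r agree up to floating-point rounding, so the t statistic is the same whichever divisor is used for the covariance and variances, provided the same divisor is used for all three.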