NetNews Usenet Archive 1992 #16

Text File | 1992-07-25 | 26.9 KB | 614 lines

Comments: Gated by NETNEWS@AUVM.AMERICAN.EDU
Path: sparky!uunet!paladin.american.edu!auvm!PACEVM.BITNET!ELLEN
Message-ID: <STAT-L%92072601201962@VM1.MCGILL.CA>
Newsgroups: bit.listserv.stat-l
Date: Sun, 26 Jul 1992 01:16:18 EDT
Sender: "STATISTICAL CONSULTING" <STAT-L@MCGILL1.BITNET>
From: arthur s ellen <ELLEN@PACEVM.BITNET>
Subject: Summary of One-Tail Chi Square Discussion
Lines: 603

ONE TAIL CHI SQUARE SUMMARY

I am summarizing the responses that I received regarding the following question, one that I answered but that had given me some second thoughts.

A local clinician asked me if he could treat a 2X2 chi square like a one-tailed t-test and just double the alpha to 0.10 rather than 0.05, since he had a directional hypothesis. He reasoned by analogy from a one-tailed t-test.

Regarding the tailedness of chi-square, my explanation was this: we don't get positive or negative values for chi square, only positive values, so we are only dealing with the right side of the distribution.

========================================================================
#01
Date: Mon, 15 Jun 1992 13:28:58 -0500
From: David.Howell@UVM.EDU

One reference on this issue is Howell, D.C. (1992) Statistical Methods for Psychology (3rd edition), PWS-Kent, p. 143.

The definition of one- vs. two-tailed tests with Chi-square becomes confusing for the same reason that it does with F. The test IS one-tailed in the sense that we normally only reject for the right tail of the distribution. (As opposed to a t test where we reject for large positive or large negative values of t.) In calculating Chi-square we square (Observed - expected), which effectively ignores the sign of the difference. Therefore the test is (generally) two-tailed (with 1 df) because we will reject if too few males or too many males (as opposed to females) fall in one category. With more than two categories the test becomes many-tailed.
There have been suggestions for making the test one-tailed (in the sense of "tailedness" used in this paragraph) by only running the test if the results are directionally in line with the prediction, but I have never actually seen that done.

&-&-&-&-&-&-&-&-&-&-&-&-&-&-&
&                           &
&  David C. Howell          &
&  Dept. of Psychology      &
&  University of Vermont    &
&  Burlington, VT 05405     &
&                           &
&  David.Howell@uvm.edu     &
&                           &
&-&-&-&-&-&-&-&-&-&-&-&-&-&-&

========================================================================
#02
Date: Mon, 15 Jun 1992 13:31 EST
From: "Sheryl Bass, ARPC Oswego. Tel. (315) 349-0198" <BASS%TECHDB%MISX02@KINGSTON.ARPC.ALCAN.CA>

Hi, I read your question on the STAT-L list - what are you using the chi-square test for?

Sheryl Bass
Internet: BASS%Techdb@Oswego.ARPC.Alcan.Ca

========================================================================
#03
Date: Mon, 15 Jun 1992 13:46 EDT
From: KAPLON@TOWSONVX.BITNET

For what hypothesis?? Goodness-of-Fit? Test of Independence? Test about the variance of one population?? For some we do use a one tail test. For some the test is one tail even though the alternative hypothesis is "two tail" because of the arithmetical nature of the test statistic. If you can be more specific, perhaps I can be of more help.

-----------------------------------------------------------------------
| Howard S. Kaplon                  | Mathematics Department          |
| BITNET: Kaplon-H@TOWSONVX.BITNET  | Towson State University         |
| Internet: Kaplon-H@TOE.TOWSON.EDU | Towson, Maryland 21209-7097     |
| Phone: (410) 830-3087             | FAX: (410) 830-2604             |
-----------------------------------------------------------------------

Date: Mon, 15 Jun 92 14:22:22 EDT
From: Raymond Liedka <RJOY@CORNELLC>

A non-technical answer is that it only makes sense to make a one-tailed test. Why? Think of the kinds of values you have ever seen for the Pearson X2 or the likelihood-ratio G2...I would bet my house (if I had one) that the values were always positive.
Remember, the chi-square distribution terminates on the left-hand side at an X2 value of 0....

[ASCII sketch of the chi-square density, with 0 marked at the left end of the horizontal axis]

The distribution continues on out to infinity on the right.

Now, what is it we are trying to examine when we use the chi-square test? Essentially, what we are trying to do is examine how closely the expected values under some model match the observed data. So, the comparison, or test, we are interested in is how large the discrepancy is between the model and the data. If the model and the data are exactly the same, the X2 (and G2) value=0. We are only interested in the right-hand side.

Finally, think about the normal distribution. It continues to BOTH positive and negative infinity. When using the distribution, the kinds of hypotheses we try to look at are of the type:

    mu = 0

Thus, mu could be negative or positive and be significantly different from zero. With the chi-square test, we are interested in testing for X2 > 0; it is inherently a one-tailed test because the discrepancy between the model and the observed data cannot be negative. Note the formula for the Pearson X2 statistic is:

    X^2 = \sum_i (observed_i - expected_i)^2 / expected_i

Note that the raw discrepancy is (observed - expected). While this can be less than zero, it is squared for use, making it impossible for X2 to ever be less than zero.

One-tailed test!

And remember.... Enjoy!!!!!!!!

Raymond V Liedka
Department of Sociology
Cornell University

========================================================================
#04
Date: Mon, 15 Jun 1992 16:13 EDT
From: KAPLON@TOWSONVX.BITNET

Using a Chi-Square test with a 2x2 table for the (I assume) Null Hypothesis that the proportions of successes in the two populations are equal is equivalent to using a two-sample Z test on proportions with the same null hypothesis.
Note that in both cases the Alternative Hypothesis is that the proportions are NOT EQUAL, and thus the test is two-sided. It can be shown that the Chi-Square statistic is equal to the SQUARE of the Z statistic.

Since the Chi-Square is Z-squared, highly significant results will ALWAYS be large POSITIVE. Therefore, while the two-sided Z test uses both tails of the Z distribution, the two-sided Chi-Square test uses only the large POSITIVE values in the upper or right tail. The lower or left tail of the Chi-Square distribution consists of values near zero, and these values indicate a high agreement between the observed and expected values. Since the expected values are computed under the assumption that the NULL Hypothesis is TRUE, this agreement of the observed and expected values (i.e., Chi-Square values near zero) supports the Null Hypothesis and does NOT lead to rejection of the Null Hypothesis.

However, the advantage of the two sample Z test is that one may specify a one sided alternative and thus do a one tail test. BUT since the Chi-Square statistic equals Z-squared, the Chi-Square test on the 2x2 table may NOT be specified as one-sided.

-----------------------------------------------------------------------
| Howard S. Kaplon                  | Mathematics Department          |
| BITNET: Kaplon-H@TOWSONVX.BITNET  | Towson State University         |
| Internet: Kaplon-H@TOE.TOWSON.EDU | Towson, Maryland 21209-7097     |
| Phone: (410) 830-3087             | FAX: (410) 830-2604             |
-----------------------------------------------------------------------

==============================================================================
Date: Mon, 15 Jun 92 20:01:16 CDT

> Can someone provide a succinct and easy explanation and/or a
> reference as to why we don't use a one tailed chi square test.
>

Because about the only time that a one-tail test makes any sense is for the 1 degree of freedom test. The convention is to do the test as a z test by taking the square root of the chi square and attaching the appropriate sign.

J. Philip Miller, Professor, Division of Biostatistics, Box 8067
Washington University Medical School, St. Louis MO 63110
phil@wubios.WUstl.edu - Internet
(314) 362-3617 [362-2694(FAX)]

==============================================================================
#06
Date: Mon, 15 Jun 92 22:54 EDT
From: "Dennis Roberts" <DMR@PSUVM>

In some cases, the test statistic (chi square for example) is such that we would reject the null if the chi square value is LARGER or SMALLER than some critical point(s). For example, testing the hypothesis about a population variance is a chi square test where the numerator is the df value times the sample variance and the denominator is the hypothesized population variance. If the true population variance is NOT what you hypothesize, then the chi square value can be LARGER THAN df OR SMALLER THAN df, depending on whether the true population variance is larger or smaller than you hypothesize.

However, in a case like using chi square to make a goodness-of-fit test, the closer the calculated chi square is to 0, the less discrepancy there is between expectation and observation, a condition that means RETAINING the null hypothesis. Only large chi square values GREATER THAN 0 put you in the position of wanting to reject the null.

Thus, the first example is a two tail test (using the chi square distribution) whereas the second example is a 1 tail test. IT IS NOT THE DISTRIBUTION THAT DETERMINES WHETHER IT IS ONE OR TWO TAILS, BUT THE SPECIFIC TEST STATISTIC. For chi square TESTS, they can be one or two tail. So can F tests, etc.

===============================================================================
#07
Date: Mon, 15 Jun 1992 14:49:07 CST
From: EJOHNSON@CMSUVMB.BITNET

Can someone provide a succinct and easy explanation and/or a reference as to why we don't use a one tailed chi square test.

A concise answer is that there is only one tail to consider since it is impossible to get a negative chi square score.
-- Ed Johnson, CMSU

========================================================================
#08
Date: Tue, 16 Jun 92 11:30:23 +0100
From: mff@ukc.ac.uk

Most uses of chi squared tests on contingency tables, or for testing goodness of fit, __do__ use a one tailed rejection region. Do you have some other kind of chi square(d) test context in mind?

There are dangers in doing contingency table tests this way. An unusually small chi square statistic, suggesting almost exact agreement between the theory generating the expected frequencies and the corresponding observed frequencies, could raise suspicions of fraud - the results are too good to be true. Modern writers have wondered whether the father of genetics, Mendel, may have been overzealously "helped" by his assistant in the monastery garden, as some of his cross breeding experiments with plants fall into this category. I do not have a reference at my fingertips, so hope one of your other respondents does.

Mike Fuller - statistician marooned in Canterbury Business School,
University of Kent, Canterbury, Kent, CT2 7PD, England
(email: mff@ukc.ac.uk - in UK mff@uk.ac.ukc)

========================================================================
#09
Date: 16 Jun 92 20:54:09 U
From: "dick darlington" <dick_darlington@qmrelay.mail.cornell.edu>

Reply to: One-tailed chi-square

I assume you are talking about tests with a directional prediction in a 1 x 2 or 2 x 2 chi-square, with the one-tailed p found by dividing the usual p by 2. I don't think there is any good reason for avoiding this unless one opposes all one-tailed tests (there are such people). I present it as a standard method in "Behavioral Statistics" (Free Press 1987), though as you say it's not widely used.

Dick Darlington, Psychology, Cornell

========================================================================
#10
Date: Thu, 25 Jun 1992 17:31:35 EDT
From: "Karl L. Wuensch" <PSWUENSC@ECUVM1.BITNET>

If using chi-square to test the null that a population variance is of a specified value, one does use a two-tailed test. To test the null that the variance is less than or equal to (or greater than or equal to) a specified value, one uses a one-tailed test.

In its most common application, the Pearson chi-square test for independence in a two-way contingency table, the nondirectional hypothesis is appropriately tested with a one-tailed, upper-tailed, test -- why? -- well, regardless of the direction in which your expected frequencies differ from the observed, the greater the magnitude of such differences the greater the chi-squared.

The F used to test nondirectional hypotheses in ANOVA is another example of the appropriate use of a one-tailed test for nondirectional hypotheses. What to do if your hypotheses are directional? Suppose your alternative hypothesis is mu1 > mu2 > mu3. Compute the usual F, obtain its upper-tailed p and divide by 3 factorial (the number of ways in which the means could have been ordered). I assume that the results confirm the hypothesized ordering of the means; if not, you must retain the null in any case. I suppose we could call this test a "one-sixth tailed test."

Karl L. Wuensch, Dept. of Psychology, East Carolina Univ.
Greenville, NC 27858-4353, PSWUENSC AT ECUVM1 (BITNET)

========================================================================
#11
Date: Mon, 29 Jun 1992 19:13:47 GMT
From: Jerry Dallal <jerry@NUTMEG.HNRC.TUFTS.EDU>

Anyone who wants one can have a 2-tailed chi-square test. The question is, "Do you want one?"

First, some clarification. Just what do you mean by a 2-tailed chi-square test? Let's assume it's for independence in a two-fold table. (Generalization to other situations should be clear.)

Is the test to be 2-tailed in terms of some null hypothesis, e.g., the row categories are associated with the column categories?
In this case, the usual chi-square statistic is already 2-tailed since large values of the statistic occur for either a positive or negative association between the categories.

Is the test to be 2-tailed in terms of the distribution of some test statistic? If the test statistic is the usual goodness-of-fit statistic, then rejecting the null hypothesis for small values is a statement that the data were too close to their expected values, that is, they were too good to be true! This sort of test might be appropriate if someone were suspected of cheating or fabricating data. I recall hearing, though I can't give a reference, that this sort of analysis was applied to some of Mendel's data because some of his data were in such close agreement with his theories. This 2-tailed chi-square test would be analogous to a test of a normal mean of 0 that rejected for |z| > 2.2414 or |z| < .0313.

========================================================================
#12
Date: Mon, 29 Jun 1992 16:31:00 EST
From: "Philip Gallagher,(919)966-7275" <UPHILG@UNC.BITNET>

All this talk of two-tailed Chi-sq tests has stirred a VERY dim memory, and trying to remember the rest of it is driving me daffy. Is there anyone out there who remembers/can think of a context, probably in a lower level course, where it made sense for the teacher to have talked about the upper 95% of the chi-sq distn? Possibly in talking about noncentral distributions? I can visualize the picture on the blackboard, but I cannot think of the application.

Phil Gallagher

========================================================================
#13
Date: Mon, 29 Jun 1992 14:58:48 MDT
From: vokey@HG.ULETH.CA

Jerry Dallal notes: "If the test statistic is the usual goodness-of-fit statistic, then rejecting the statistic for small values is a statement that the data were too close to their expected values, that is, they were too good to be true!
This sort of test might be appropriate if someone were suspected of cheating or fabricating data."

Aside from cheating, another commonplace use of the bounded chi-square tail (i.e., close to zero) is in the testing of pseudo-RNGs (random number generators), where the problem of too good a fit means the RNG is NFG!

John R. Vokey <vokey@hg.uleth.ca>

========================================================================
#14
Date: Fri, 26 Jun 1992 11:18:34 GMT
From: Ronan M Conroy <RCONROY@IRLEARN.UCD.IE>

The nub of the matter is that the Pearson chi-sq test, like the F ratio, tests the ability of a model to predict the observed data. The expected frequencies in the table being tested are the ones predicted by the null hypothesis model, which says that the observed frequencies are simply a product of the total number of cases in the table and the marginal proportions.

Lack of fit between expected and observed is counted regardless of whether the model over- or underestimated the number of cases in the cell, because the hypothesis being tested is non-directional: it says that you can predict cell frequencies, near as dammit, using marginal proportions.

When you think of it, since the table's total number of cases is used to make individual cell predictions, the null hypothesis model will neither over- nor under-predict the cell frequencies over the whole table (sum of expected must equal sum of observed!), so you cannot have a hypothesis that says 'The marginal proportions and total N predict a higher number/lower number of cases than are actually observed in the table.'

Blast it! Simple ideas are so haaaaaaaaaaaaard to explain. Forgive me for having a go, though; I enjoy it if no-one else does.
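
The point about marginal totals can be checked numerically. What follows is a minimal sketch in Python and is not part of the original thread; the function names and the 2x2 counts are invented purely for illustration. It builds the expected counts from the row and column totals, confirms that they add up to the same N as the observed counts, and shows that a table with the same margins but the opposite direction of association yields the same Pearson statistic.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def pearson_chi_square(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts (assumption: made up for this illustration only).
observed = [[30, 10], [20, 40]]
# Same row and column totals, but the association runs the other way.
mirrored = [[10, 30], [40, 20]]

expected = expected_counts(observed)
total_observed = sum(sum(row) for row in observed)
total_expected = sum(sum(row) for row in expected)

print("expected counts:", expected)
print("sum observed:", total_observed, "sum expected:", total_expected)
print("chi-square, observed table:", pearson_chi_square(observed))
print("chi-square, mirrored table:", pearson_chi_square(mirrored))
```

Because each discrepancy is squared, the statistic alone cannot say which way the association runs; that information has to come from looking at the cells themselves.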
===============================================================================
#15
Date: Mon, 29 Jun 1992 16:43:46 U
From: dick darlington <dick_darlington@QMRELAY.MAIL.CORNELL.EDU>

OFFICE MEMO
Subject: 1-tailed chi-square
Time: 4:28 PM
Date: 06-29-92

In a 2 x 2 chi-square test in which you correctly predicted the direction of the result, I consider it perfectly acceptable to divide the tabled p by 2 to get a one-tailed value. The Fisher 2 x 2 test tests the same null hypothesis (ignoring fine points about fixed versus random marginals), and the Yates correction for continuity is justified largely on the ground that it makes the p from chi-square closely approximate the Fisher p. But the p that does this is _half_ the p from the chi-square table.

Dick Darlington, Psychology, Cornell

========================================================================
#16
Date: Mon, 29 Jun 1992 16:51:59 U
From: dick darlington <dick_darlington@QMRELAY.MAIL.CORNELL.EDU>

OFFICE MEMO
Subject: Left tail of chi-square
Time: 4:38 PM
Date: 06-29-92

In response to Philip Gallagher's question: Let V denote the observed residual variance in a regression, and suppose you want to find an upper confidence limit on the true residual variance. You then find yourself asking about the probability that observed V could have been so small if the true variance were some specified value. In other words, you're working with the left tail of the chi-square distribution.

Dick Darlington, Psychology, Cornell

========================================================================
#17
Date: Tue, 30 Jun 1992 12:06:52 CET
From: Joop Hox <A716HOX@HASARA11.BITNET>

Well, if you are exploratively fitting LISREL or GLIM models and you end up with a chi-square for the model fit with a p-value of .99 or so, you could argue that this is proof of significantly over-fitting your model. It simply fits too well, like Mendel's pea data.
Joop Hox
University of Amsterdam

========================================================================
#18
Date: Tue, 30 Jun 1992 10:01:00 U
From: dick darlington <dick_darlington@QMRELAY.MAIL.CORNELL.EDU>

OFFICE MEMO
Subject: Left tail of chi-square
Time: 9:17 AM
Date: 06-30-92

Barrie Robinson's point is well taken: in my message on the left tail of the chi-square distribution, I didn't focus on the current topic, which is goodness-of-fit chi-square tests. Let me try another (entirely hypothetical) example. An archaeologist finds a crypt containing 240 funeral urns from an ancient civilization whose writing we can read. Each urn indicates the gender of its occupant. The urns are not in pairs, and are not segregated by gender, but there are exactly 120 of each gender. To show that this equality must have been intentional, if we used a chi-square goodness-of-fit test we would need to show that even if the gender ratio was 1.0 in that society, the probability is small that the two groups would be so equal.

For a test of association, suppose we observed that in a large company, almost exactly 12% of the employees in every division were black. A test for association between division and race, using the left tail of the chi-square distribution, could show that this equality of proportion was greater than would be expected by chance. I'm not commenting on the ethical or legal implications of that finding, I'm just saying that's how the H would be tested.

In both my examples and the "cheating scientist" example mentioned earlier, the H1 being corroborated by a significant result is some kind of unnatural (in these examples human) intervention. We might imagine examples not involving human or even animal behavior, in which a significant result demonstrates some sort of homeostatic mechanism that keeps something even.
Dick Darlington, Psychology, Cornell

========================================================================
#18
Date: Tue, 30 Jun 1992 10:41:00 EST
From: PLOCH@UTKVX.BITNET

Jerry Dallal writes: "If the test statistic is the usual goodness-of-fit statistic, then rejecting the statistic for small values is a statement that the data were too close to their expected values, that is, they were too good to be true! This sort of test might be appropriate if someone were suspected of cheating or fabricating data."

I no longer have the reference, but Finklestein, writing about judges'/lawyers' use of statistics, mentions a case in South Carolina in which the expected proportions of blacks on a series of petit juries could have been rejected because there was not enough variation for random assignment. The proportion black required 1.7 blacks per jury. Juries had either 1 or 2 blacks and the average number of blacks never strayed far from 1.7. The Supreme Court did not use statistics to declare the selection method unconstitutional; instead they found that prospective white jurors had their names placed on white slips of paper, black on yellow slips. Clearer evidence of cheating than a chi square test.

Don Ploch
PLOCH@UTKVX
PLOCH@UTKVX.UTK.EDU

========================================================================
#19
Date: Tue, 30 Jun 1992 09:46:58 +1200
From: brobinso@LEVEN.APPCOMP.UTAS.EDU.AU

In this discussion of how many tails a chi-square test can have, one fact appears to be being overlooked: the chi-square test is an approximation (in some sense) to the true distribution, which is the multinomial distribution. The null hypothesis might be

    p1 = p10, p2 = p20, ..., pk = pk0, where p10 + p20 + ... + pk0 = 1.

The alternative is any other set of pi's that add up to 1, and is hence "multi-tailed". In the case of a contingency table, H0 may be stated in the form

    p11:p12:p13: ... :p1c = p21:p22:p23: ... :p2c = p31: ... = pr1:pr2:pr3: ... :prc,

for r rows and c columns.
The alternative may be any other distribution of p's. In the case of a 2X2 table, there are really only 2 alternatives, namely, p11:p12 > p21:p22 and p11:p12 < p21:p22; hence if the true distribution (which would be quadrinomial?) is considered, both tails could be separately considered, and the power of the test be evaluated.

The nature of the "approximation" in the so-called chi-square test, like that in the log-likelihood-ratio test, which is also asymptotically distributed as chi-square, is such that all the information about different tails is lost. Or so it seems to me, anyway.

Barrie.
--
Barrie Robinson,                      |email: brobinso@leven.appcomp.utas.edu.au
University of Tasmania at Launceston. |phone: (61)(003)260211

========================================================================
#20
Date: Tue, 30 Jun 1992 09:58:31 +1200
From: brobinso@LEVEN.APPCOMP.UTAS.EDU.AU

dick darlington says:
> Subject: Left tail of chi-square
> Date: 06-29-92
>
>In response to Philip Gallagher's question: Let V denote the observed residual
>variance in a regression, and you want to find an upper confidence limit on the
>true residual variance. You then find yourself asking about the probability
>that observed V could have been so small if the true variance were some
>specified value. In other words, you're working with the left tail of the
>chi-square distribution.
>Dick Darlington, Psychology, Cornell
>

This is fine, but most of us (and probably Philip Gallagher) were talking about the goodness of fit test, which is quite different. I don't know where non-central distributions fit in, or much else about them. Don't they have something to do with the true distribution when the null hypothesis fails (according to a certain model)?

Barrie.
--
Barrie Robinson,                      |email: brobinso@leven.appcomp.utas.edu.au
University of Tasmania at Launceston. |phone: (61)(003)260211

========================================================================

Thanks to all those who responded,

------------------------------------------------------
║ ART ELLEN              ║ PSYCHOLOGY DEPARTMENT     ║
║ BITNET: ELLEN@PACEVM   ║ PACE UNIVERSITY           ║
║ VOICE: (212) 346-1506  ║ 41 PARK ROW               ║
║                        ║ NEW YORK, NY 10038-1502   ║
------------------------------------------------------
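
Two claims recur in the responses above: that the 2x2 chi-square equals the square of the pooled two-sample z, and that the one-tailed p for a correctly predicted direction is half the tabled chi-square p. The following minimal Python sketch, which is not part of the original thread, illustrates both. The function names and the counts are invented for illustration, no continuity correction is applied, and the tail areas rely on the normal approximation, so small samples would call for an exact test instead.

```python
import math


def two_proportion_z(a, b, c, d):
    """Pooled two-sample z for the table [[a, b], [c, d]], rows being groups."""
    n1, n2 = a + b, c + d
    p1, p2 = a / n1, c / n2
    pooled = (a + c) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square, without continuity correction, for [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def upper_tail_normal(z):
    """P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def upper_tail_chi2_1df(x):
    """P(chi-square with 1 df > x), which equals P(|Z| > sqrt(x))."""
    return math.erfc(math.sqrt(x / 2))


# Hypothetical counts (assumption: made up for this illustration only).
# Rows are two groups, columns are success / failure.
a, b, c, d = 15, 5, 9, 11

z = two_proportion_z(a, b, c, d)
chi2 = chi_square_2x2(a, b, c, d)
p_two_sided = upper_tail_chi2_1df(chi2)
# Only meaningful if the prediction was "group 1 has the higher proportion"
# and the data actually came out that way (z > 0).
p_one_sided = upper_tail_normal(z)

print("z =", z, " z squared =", z * z, " chi-square =", chi2)
print("two-sided p =", p_two_sided)
print("one-sided p =", p_one_sided)
```

With these made-up counts the usual chi-square p falls just above 0.05 while the one-sided p falls below it, which is exactly the situation the clinician's question was about. Note that halving the p applies only when the observed direction matches the prediction; when it does not, the null is retained regardless.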