Kruskal-Wallis


Kruskal-Wallis single factor analysis of variance by ranks

The analysis of variance (often termed ANOVA or AOV) is a technique used to test multi-sample hypotheses, in which a variable is measured in three or more samples and the sample means are compared. In this case, inter-group differences in the data are subjected to a non-parametric test, in which no population parameters or sample statistics are stated or used in the test calculations; the calculations are based on the ranks of the data instead.

The parametric equivalent to this test is the one-way ANOVA test.

As may be expected of a non-parametric test, the sample data are not required to come from normal distributions and the sample variances may be heterogeneous. As with the parametric ANOVA equivalent, rejection of the null hypothesis in such a test only indicates that there is at least one difference between the groups (i.e., the k samples); it gives no information about which groups differ from which other groups.

Script operation

This tool operates in much the same way as most of the others, and no specific departures from the usual methods are needed.

Raw sample data must be entered as multiple samples, with the data arranged in columns. Follow the example shown below and note that the statistics are computed for the five example columns, which are labelled in the output under the headings 'A', 'B', 'C', etc. The original samples were labelled in the same manner, and these titles appear in the output because the title row was included in the input data range.

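Note in the example that all sample data are written to the spreadsheet twice: first sorted by size within each sample, and then as the corresponding ranks. The ranks are assigned across all samples combined, and tied values share the average of the ranks they occupy.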

 Raw data:               Spreadsheet output:

  A   B   C   D   E      Kruskal-Wallis H Test
 68  72  60  48  64      Sorted Data
 72  53  82  61  65         A       B       C       D       E
 77  63  64  57  70        42      48      60      48      53
 42  53  75  64  68        53      53      64      50      64
 53  48  72  50  53        68      53      72      57      65
                           72      63      75      61      68
                           77      72      82      64      70

                         Table of Ranks for Sorted Data
                            1     2.5      10     2.5     6.5
                          6.5     6.5      14       4      14
                         17.5     6.5      21       9      16
                           21      12      23      11    17.5
                           24      21      25      14      19


                         H Statistic:            6.44
                         No. of Cases:             25
                         d.f.(v):                   4
                         Sample Size allows Chi-Square Test:
                         P(x²<=H):            0.83139
                         Chi-Critical(95%):    9.4877
                         Chi-Critical(99%):   13.2767
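
The ranks and the H statistic in the example output can be reproduced with a short Python sketch. This is not the script's own code, which is not shown in this document. It assumes the standard formula H = 12 / (N(N + 1)) * sum(R_i² / n_i) - 3(N + 1), where R_i is the rank sum of sample i, n_i is its size and N is the total number of cases, with no correction for ties. The function names average_ranks and kruskal_wallis_h are illustrative only.

```python
from collections import defaultdict

# Example data from the table above, one list per sample column.
samples = {
    "A": [68, 72, 77, 42, 53],
    "B": [72, 53, 63, 53, 48],
    "C": [60, 82, 64, 75, 72],
    "D": [48, 61, 57, 64, 50],
    "E": [64, 65, 70, 68, 53],
}


def average_ranks(values):
    """Map each distinct value to its rank; tied values share the mean rank."""
    positions = defaultdict(list)
    for position, value in enumerate(sorted(values), start=1):
        positions[value].append(position)
    return {value: sum(p) / len(p) for value, p in positions.items()}


def kruskal_wallis_h(groups):
    """Return (rank_sums, H). Assumption: H is NOT corrected for ties."""
    pooled = [x for data in groups.values() for x in data]
    rank_of = average_ranks(pooled)  # ranks are assigned over all samples combined
    n_total = len(pooled)
    rank_sums = {
        name: sum(rank_of[x] for x in data) for name, data in groups.items()
    }
    weighted = sum(
        rank_sums[name] ** 2 / len(data) for name, data in groups.items()
    )
    h = 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)
    return rank_sums, h


rank_sums, h_stat = kruskal_wallis_h(samples)
print(rank_sums)          # rank sum per sample
print(round(h_stat, 2))   # 6.44, matching the H Statistic in the output
```

The uncorrected formula gives 6.44, which agrees with the output above. Libraries that apply a correction for ties (scipy.stats.kruskal is one example) would report a slightly larger value for these data, about 6.49, because several values are tied.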

Interpretation

There are two common ways in which the results of this test may be interpreted. Both work by comparing the H statistic with a critical value derived from distribution tables. If there are five or fewer samples and the sample sizes are small, the H statistic is compared, at a given level of significance, with a critical value taken from a table of the H distribution. If more than five samples are being compared, or the sample sizes are fairly large, the H statistic may be considered to be approximated by the chi-square (x²) distribution with k - 1 degrees of freedom (where k is the number of samples).

The first method requires access to a table of critical values of the H distribution, as published in most statistical analysis texts.

To illustrate this process, the comparison of the H statistic with critical values of the x² distribution will be used, as this is likely to be the more useful and more widely employed method. In the example provided, the H statistic is 6.44 and the degrees of freedom for the approximation are 4 (i.e., k - 1 = 5 - 1 = 4). In the output (above) it can be seen that the critical value at the 0.05 level of significance is 9.4877. As the test statistic does not exceed the critical value, the null hypothesis that there is no significant difference between the samples may be retained. The same conclusion follows from the reported P(x²<=H) of 0.83139, which corresponds to an upper-tail probability of 1 - 0.83139 = 0.16861, greater than 0.05.
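
The comparison described above can be checked with a small Python sketch. It is an illustration only and not the script's own method. It uses the closed form of the chi-square cumulative distribution that holds for even degrees of freedom (which covers the example, where d.f. = 4) and finds the critical value by bisection. The function names chi2_cdf_even_df and chi2_critical_even_df are hypothetical; for general degrees of freedom a statistics library such as scipy.stats.chi2 would be the usual choice.

```python
import math


def chi2_cdf_even_df(x, df):
    """Chi-square CDF using the closed form that is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    if x <= 0:
        return 0.0
    half = x / 2.0
    series = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return 1.0 - math.exp(-half) * series


def chi2_critical_even_df(confidence, df, upper=200.0):
    """Find x such that CDF(x) = confidence by bisection on [0, upper]."""
    lower = 0.0
    for _ in range(100):
        mid = (lower + upper) / 2.0
        if chi2_cdf_even_df(mid, df) < confidence:
            lower = mid
        else:
            upper = mid
    return (lower + upper) / 2.0


h_stat, df = 6.44, 4   # values taken from the example output
cdf = chi2_cdf_even_df(h_stat, df)
critical_95 = chi2_critical_even_df(0.95, df)
reject_null = h_stat > critical_95   # reject only if H exceeds the critical value

print(round(cdf, 5))          # P(x² <= H)
print(round(1.0 - cdf, 5))    # upper-tail probability
print(round(critical_95, 4))  # critical value at the 0.05 level
print(reject_null)
```

With the example values, H is below the critical value, so reject_null is False and the null hypothesis is retained, in agreement with the interpretation given above.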


