Wilcoxon Test


Wilcoxon paired sample or signed-ranks test

This tool detects statistically significant differences between two paired data samples. If sample data is paired, this test should be used in preference to the Mann-Whitney U test when choosing a non-parametric test, as the latter may result in a higher probability of committing a Type II error (i.e., failure to reject a false null hypothesis).

This test does not deal with the original data measurements but rather with the differences within each pair of measurements taken from the samples. Its underlying assumption is that the population of differences represented by the samples is symmetrical around its median value.

The parametric equivalent is the matched-pair t-test.

Script operation

This tool operates in much the same way as most of the others with no specific departures from the usual methods needed.

Raw sample data is entered in two columns as shown. If the sample data columns contain titles, these can be reproduced in the output by including them in the input data range.

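Note in the example that the output lists the difference (X-Y) for each pair in its original order, followed by the absolute differences sorted in ascending order and their corresponding ranks. Tied absolute values share the mean of the ranks they occupy, a negative rank marks a negative difference, and a zero difference is not ranked (shown as '--').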

 Raw data:          Spreadsheet output:

    X      Y        Wilcoxon Matched-Pairs Signed Ranks Test
   15     19
   19     30        (X-Y)     Sorted ABS Values     Rank
   31     26
   36      8           -4                     0       --
   10     10          -11                     2      1.5
   11      6            5                     2      1.5
   19     17           28                     4       -3
   15     13            0                     5      4.5
   10     22            5                     5      4.5
   16      8            2                     8        6
                        2                    11       -7
                      -12                    12       -8
                        8                    28        9

                    Sum of Neg. Ranks:                18
                    Sum of Pos. Ranks:                27
                    Wilcoxon T:                       18
                    Ranked Obs.(n):                    9
                    z:                           -0.5331
                    Prob. (Z<=z):               0.296977
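
The rank columns and sums in this output can be reproduced by hand or with a short sketch such as the one below. It is an illustration in Python, not the script's own code: the function name `signed_rank_sums` is hypothetical, and the handling of zero differences (dropped) and ties (mean rank) is inferred from the printed output.

```python
# Sketch: reproduce the signed-rank sums from the example data.
x = [15, 19, 31, 36, 10, 11, 19, 15, 10, 16]
y = [19, 30, 26, 8, 10, 6, 17, 13, 22, 8]

def signed_rank_sums(x, y):
    """Return (sum_neg, sum_pos, T, n) for paired samples x and y."""
    # Differences within each pair; zero differences are dropped, not ranked.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    n = len(ordered)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute values starting at position i.
        j = i
        while j < n and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i..j-1 occupy ranks i+1..j; ties share their mean.
        mean_rank = (i + 1 + j) / 2
        for k in range(i, j):
            ranks[k] = mean_rank
        i = j
    sum_neg = sum(r for d, r in zip(ordered, ranks) if d < 0)
    sum_pos = sum(r for d, r in zip(ordered, ranks) if d > 0)
    # Wilcoxon T is the smaller of the two rank sums.
    return sum_neg, sum_pos, min(sum_neg, sum_pos), n

print(signed_rank_sums(x, y))  # (18.0, 27.0, 18.0, 9)
```

The four values correspond to 'Sum of Neg. Ranks', 'Sum of Pos. Ranks', 'Wilcoxon T' and 'Ranked Obs.(n)' in the output.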

Interpretation

There are two ways in which the output from the script may be interpreted for the purpose of drawing conclusions. The choice between them depends on the sample size. Either method may be used for the same purpose, but the second, the normal approximation method, is often used where raw data consists of more than 100 pairs (i.e., beyond the limits of a typical table of Wilcoxon T-distribution critical values).

When larger sample sizes are available, the Wilcoxon T-distribution approaches the normal distribution. In effect, this means that a standardized value of z may be calculated and compared with critical values of the t-distribution with infinite degrees of freedom, which is identical to the normal distribution. The reason for this, although beyond the scope of this Guide, can be found in many textbooks under the title 'central limit theorem'.
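
The Guide does not state the formula the script uses for z. A minimal sketch, assuming the usual large-sample mean n(n+1)/4 and variance n(n+1)(2n+1)/24 for T, with no correction for ties or continuity, reproduces the figures printed in the example output. The function name `wilcoxon_z` is illustrative only.

```python
import math
from statistics import NormalDist

def wilcoxon_z(t, n):
    """Standardize Wilcoxon T for n ranked pairs (no tie or continuity correction)."""
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean_t) / sd_t

# Values taken from the example output: Wilcoxon T = 18, Ranked Obs.(n) = 9.
z = wilcoxon_z(18, 9)
p_lower = NormalDist().cdf(z)  # lower-tail probability, Prob. (Z<=z)

print(round(z, 4))  # -0.5331
print(p_lower)      # should agree closely with the 0.296977 in the output
```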

In the example above, the following two-tailed null hypothesis may be proposed:

H0: The 'X' and 'Y' sample data is drawn from the same statistical population, i.e., there is no significant difference between the two samples.

The first method: using the Wilcoxon T-distribution

Access is required to a table of critical values of the Wilcoxon T-distribution, found in most textbooks. In the example case printed above, the table would be entered with n=9 (labelled as 'Ranked Obs.(n)' in the output). At the 0.05 level of significance the T-critical value may be found to be 5 for a two-tailed test. The calculated value of T can be seen to be 18 in the output. In most other tests the null hypothesis is rejected if the calculated test statistic exceeds the critical value at a given level of significance. Here the rule is reversed: the null hypothesis is rejected only if the T-statistic is less than or equal to the critical value. As 18 is greater than 5, the null hypothesis is retained.
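
Where no table is to hand, the critical value can be derived by counting how many of the 2^n equally likely sign assignments give each possible rank sum. The following is a sketch under the assumption of no tied ranks (the same assumption the printed tables make); the function name `critical_t` is hypothetical and is not part of the script.

```python
def critical_t(n, alpha=0.05, two_tailed=True):
    """Largest T whose lower-tail probability under H0 does not exceed the tail area."""
    # counts[s] = number of ways to choose ranks from 1..n that sum to s.
    max_sum = n * (n + 1) // 2
    counts = [1] + [0] * max_sum
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    tail = alpha / 2 if two_tailed else alpha
    total = 2 ** n
    cumulative = 0
    critical = None  # None means no value of T is small enough to reject H0
    for s, c in enumerate(counts):
        cumulative += c
        if cumulative / total > tail:
            break
        critical = s
    return critical

t_crit = critical_t(9)  # two-tailed, 0.05 level
t_stat = 18             # 'Wilcoxon T' from the example output
reject = t_crit is not None and t_stat <= t_crit
print(t_crit, "reject H0" if reject else "retain H0")  # 5 retain H0
```

For n=9 this gives 5, matching the tabulated value quoted above.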

Note that if one-tail tests are being analysed using this method, then the following mechanisms apply:

  1. If H0: Sample 1 measurements < or = Sample 2 measurements, then reject H0 if 'Sum of Neg. Ranks' < or = T(0.05)(1-tail)(n=9).
  2. If H0: Sample 1 measurements > or = Sample 2 measurements, then reject H0 if 'Sum of Pos. Ranks' < or = T(0.05)(1-tail)(n=9).
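
The two one-tailed rules can be expressed as a small sketch. The function name `one_tailed_decision` and its `h0` argument are illustrative, and the critical value of 8 used in the example is the commonly tabulated one-tailed figure for n=9 at the 0.05 level; it does not appear in the script output and should be checked against the table in use.

```python
def one_tailed_decision(sum_neg, sum_pos, t_critical, h0):
    """Apply the one-tailed rule.

    h0 is "<=" for H0: Sample 1 <= Sample 2,
    or ">=" for H0: Sample 1 >= Sample 2.
    """
    if h0 == "<=":
        statistic = sum_neg   # rule 1 uses 'Sum of Neg. Ranks'
    elif h0 == ">=":
        statistic = sum_pos   # rule 2 uses 'Sum of Pos. Ranks'
    else:
        raise ValueError("h0 must be '<=' or '>='")
    return "reject H0" if statistic <= t_critical else "retain H0"

# Rank sums from the example output; critical value assumed from a table.
print(one_tailed_decision(18, 27, 8, "<="))  # retain H0
print(one_tailed_decision(18, 27, 8, ">="))  # retain H0
```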

In the output the 'Sum of Neg. Ranks' and the 'Sum of Pos. Ranks' are often also referred to as 'T-' and 'T+' respectively.

The second method: using the normal approximation

As the t-distribution (with v=infinity) is identical to the normal distribution, the z-critical value is equal to the t-critical value. The computed value of z (i.e., -0.5331) can therefore be compared with the two-tailed t-critical value at a given level of significance in order to decide whether the null hypothesis should be retained or rejected.

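At the 0.05 level of significance the two-tailed t-critical value with v=infinity is 1.9600. As the absolute value of the computed z (0.5331) does not exceed this, the normal approximation also indicates that the null hypothesis should be retained. Equivalently, doubling the one-tailed probability given in the output (0.296977) gives a two-tailed probability of about 0.594, which is well above 0.05.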
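
The comparison can be sketched using Python's standard library in place of a printed table. This is an illustration only; the variable names are not taken from the script.

```python
from statistics import NormalDist

z = -0.5331   # 'z' from the example output
alpha = 0.05  # level of significance for a two-tailed test

# Two-tailed critical value of the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-tailed probability: twice the area beyond |z| in one tail.
p_two = 2 * NormalDist().cdf(-abs(z))

decision = "reject H0" if abs(z) > z_crit else "retain H0"
print(round(z_crit, 4), round(p_two, 3), decision)
```

The sign of z is ignored in a two-tailed test, which is why the comparison uses its absolute value.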


