Chi-square (χ²) goodness of fit analysis for normal distribution
Goodness of fit (GOF) analysis tests the hypothesis that sampled data are randomly drawn from a theoretical distribution or, equivalently, that a random variable has a specific theoretical probability distribution. The data for this test consist of N independent observations of a random variable, grouped into c classes. The measurement scale must be at least nominal. If the number of observations in any one cell is very low, that cell may be combined with an adjacent cell.
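As a rough illustration of the statistic and of the pooling step described above, here is a minimal sketch in Python using only the standard library. The function names `chi_square` and `pool_sparse_cells`, the threshold `min_count` and the sample counts are assumptions made for this example; they are not taken from the script.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def pool_sparse_cells(observed, expected, min_count=5):
    """Merge cells whose observed count is below min_count into a neighbour.

    Cells are scanned from left to right. A sparse cell is carried into the
    cell on its right; sparse cells left over at the right-hand end are added
    to the last pooled cell on their left.
    """
    obs, exp = [], []
    carry_o, carry_e = 0, 0.0
    for o, e in zip(observed, expected):
        carry_o += o
        carry_e += e
        if carry_o >= min_count:
            obs.append(carry_o)
            exp.append(carry_e)
            carry_o, carry_e = 0, 0.0
    if carry_o or carry_e:
        if obs:
            obs[-1] += carry_o
            exp[-1] += carry_e
        else:
            obs.append(carry_o)
            exp.append(carry_e)
    return obs, exp


# Made-up counts, for illustration only.
obs, exp = pool_sparse_cells([1, 4, 10, 12, 3], [2.0, 5.0, 9.0, 10.0, 4.0])
print(obs, exp, round(chi_square(obs, exp), 4))
```

Pooling on the observed count follows the wording above; a common alternative is to pool on the expected frequency instead, which would only change the condition inside the loop.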
Script operation.
This script allows the user to interact with the computer to group raw data for the analysis. It requires the user to respond to a series of requesters:
Otherwise, the script operates as per normal. Note that the first cell in the column can be a label which is used by the script on output.
Here is an example (note that only the "DO_value" column is used):
Dissolved oxygen values for 50 sampled lakes.

Lake  DO_value
1     5.1
2     5.6
3     5.3
4     5.7
5     5.8
6     6.4
7     4.3
8     5.9
9     5.4
10    4.7
11    5.6
12    6.8
13    6.9
14    4.8
15    5.6
16    6.4
17    5.9
18    6
19    5.5
20    5.4
21    4.4
22    5.1
23    5.6
24    5.8
25    5.7
26    4.9
27    6.6
28    5.7
29    5.4
30    5.9
31    5.6
32    6.7
33    5.4
34    4.8
35    6.4
36    5.8
37    5.3
38    5.7
39    6.3
40    4.5
41    5.6
42    6.2
43    4.2
44    5.2
45    5.8
46    6.1
47    5.1
48    5.9
49    5.5
50    4.7
Here is the output based on requesting a starting lower class boundary (l.c.b) of 4 and accepting the other defaults:
Chi-Sq Goodness of Fit of Normal Frequency Distribution
DO_value

                Mean       Standard  Proportion  Expected   Observed
l.c.b   u.c.b   Deviation  Score     Within      Frequency  Frequency
6.57                                 0.0605      3.02       4
6.14    6.57    0.99       1.5509    0.1289      6.44       5
5.71    6.14    0.56       0.8804    0.2275      11.38      10
5.29    5.71    0.13       0.2099    0.2606      13.03      18
4.86    5.29    -0.29      -0.4606   0.1935      9.68       5
4.43    4.86    -0.72      -1.131    0.0932      4.66       5
4       4.43    -1.15      -1.8015   0.0358      1.79       3

Chi-Square: 5.8042
d.f.: 6
P(CHI<=chi): 0.554527
Chi-Critical (95%): 12.5916
Chi-Critical (99%): 16.812

Interpretation
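In this example the calculated Chi-Sq (5.8042) is below the 95% critical value (12.5916), so the hypothesis that the DO values are drawn from a normal distribution is not rejected. The calculated Chi-Sq is very sensitive to the classes being generated. The objective should be to minimise the Chi-Sq statistic as much as possible, which may require further adjustment of the class boundaries.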
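The figures in the output above can be approximately reproduced outside the script. The sketch below is not the script's own code; it is a minimal Python reconstruction using only the standard library. It assumes seven equal-width classes spanning 4 to 7 (inferred from the printed boundaries), end classes that absorb the tails of the normal curve, a sample standard deviation with an n - 1 divisor, and d.f. taken as the number of classes minus 1, as printed. The variable and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

# The 50 DO_value observations from the example.
do_values = [
    5.1, 5.6, 5.3, 5.7, 5.8, 6.4, 4.3, 5.9, 5.4, 4.7,
    5.6, 6.8, 6.9, 4.8, 5.6, 6.4, 5.9, 6.0, 5.5, 5.4,
    4.4, 5.1, 5.6, 5.8, 5.7, 4.9, 6.6, 5.7, 5.4, 5.9,
    5.6, 6.7, 5.4, 4.8, 6.4, 5.8, 5.3, 5.7, 6.3, 4.5,
    5.6, 6.2, 4.2, 5.2, 5.8, 6.1, 5.1, 5.9, 5.5, 4.7,
]

# Assumed class layout: 7 equal-width classes covering 4 to 7.
LOWEST_LCB, HIGHEST_UCB, N_CLASSES = 4.0, 7.0, 7
width = (HIGHEST_UCB - LOWEST_LCB) / N_CLASSES
# Interior boundaries only: roughly 4.43, 4.86, ..., 6.57.
cuts = [LOWEST_LCB + width * k for k in range(1, N_CLASSES)]

n = len(do_values)
normal = NormalDist(mean(do_values), stdev(do_values))  # n - 1 divisor

# Cumulative probability at each cut, padded with 0 and 1 so that the
# first and last classes include the lower and upper tails.
cum = [0.0] + [normal.cdf(c) for c in cuts] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cum, cum[1:])]

# Observed counts: class index = number of cuts at or below the value.
observed = [0] * N_CLASSES
for x in do_values:
    observed[sum(x >= c for c in cuts)] += 1

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Classes minus 1, as printed in the output. Some texts subtract a further 2
# when the mean and standard deviation are estimated from the sample.
df = N_CLASSES - 1


def chi_square_cdf_even_df(x, df):
    """P(CHI <= x) using the closed form valid for even d.f. only."""
    half = x / 2.0
    terms = sum(half ** k / math.factorial(k) for k in range(df // 2))
    return 1.0 - math.exp(-half) * terms


p = chi_square_cdf_even_df(chi_sq, df)

# Classes are listed from lowest to highest, the reverse of the output table.
print("observed:", observed)
print("expected:", [round(e, 2) for e in expected])
print(f"Chi-Square: {chi_sq:.4f}  d.f.: {df}  P(CHI<=chi): {p:.4f}")
```

Under these assumptions the observed and expected frequencies should agree with the table up to rounding. Changing `LOWEST_LCB`, `HIGHEST_UCB` or `N_CLASSES` shows how strongly the statistic depends on the choice of classes.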
Look at the operation of the other Chi-Sq Goodness of Fit script.