THE SONS OF GOD

As briefly discussed elsewhere, perhaps the most fascinating and mysterious observation in the timeline data (for those of us mystically inclined) is the fact that six of seven Sons of God (Zoroaster, Siddhartha Gautama, Vardhamana, Confucius, Jesus Christ) were born within a seven-century span of time from (approximately) 700 B.C. to 0 A.D. Only Mohammed’s birth lies outside this time period. A priori, it is difficult to imagine why such births would not be spaced throughout the 5,000 years of recorded history in an essentially random pattern. To postulate otherwise assumes something ‘special’ or unusual about the period 700 B.C. to 0 A.D. that never occurred earlier in history and has never been duplicated since. In effect, such a hypothesis would place the label ‘unique’ upon this seven-century period of time.

The clustering of the Sons of God within this time frame has been noted in many history books, but it is given only passing mention. Hypotheses would seem to lurch off into the realm of the occult, and what self-respecting historian would wish to enter that territory? What circumstances, environmental or otherwise, might foster the appearance of what I shall term ‘deity beings’? My speculation upon such matters will not be presented here. Rather, I have a more modest goal for this document.

I shall subject the data to a fairly rigorous statistical analysis, if only to demonstrate in quantitative terms how improbable this clustering of deity beings within the period 700 B.C. to 0 A.D. is. The much maligned and often hated statistical tools have real tactical value. They serve to confirm (or refute) our initial intuition and observations that are based upon a quick visual scan of the pattern in the data. A quick scan of the timeline may be misleading in the sense that subconsciously I might ‘wish’ to see this clustering, particularly when my study has primed me to look for it and I ‘want it to exist’.
The serious question is whether or not the apparent clustering meets scientific criteria of extreme unusualness or improbability. My eyes, and yours as well, may be perceiving a trend that is of no significance.

I will go slowly with full explanation. In order to provide you, dear reader, with the motivation to continue, I shall tell you the conclusion in advance. From a strict statistical point of view (which is the only scientific methodology that is acceptable), the clustering of six Sons of God within the first seven centuries B.C. is extremely improbable, so much so as to deserve the label ‘unique’. It cannot be explained within the context of the usual (i.e. ‘normal’) distribution of events presumed to occur at random. This analysis is designed for the skeptics and is a preliminary exercise that clears the way for an attempt at a mechanistic explanation for this extraordinary clustering of deity being births.

These data are very simple from a mathematical point of view, but the statistical tools that will be employed to analyze them, while not advanced computationally, are complex and subtle conceptually. These data lend themselves to cross tabulation in a simple 2 x 2 table. In statistics, the organization of a body of data determines the tools that may be appropriately applied in the analysis. Cross-tabulated data are very common and a battery of statistical tools is at hand; indeed, entire books have been written on the analysis of data in this form.

Case 1 will present two closely related tables. Table 1A tabulates the presence and absence of Sons of God over the 4,000 years of recorded history documented in the timeline. Table 1B presents such observations for the 5,000 years of recorded history to date; i.e. from c. 3,000 B.C. to c.
2000 A.D.

Case 1: Presence and Absence of Sons of God

Table 1A: Sons of God - Presence/Absence Analysis

                       SGby    SGab
700 BC to 0 AD            6     694
3000 BC to 900 AD         7    3993

Table 1B: Sons of God - Presence/Absence Analysis

                       SGby    SGab
700 BC to 0 AD            6     694
3000 BC to 2000 AD        7    4993

where:
SGby = Sons of God birth years; and
SGab = non-birth years for Sons of God; i.e. years within the designated time interval in which the birth of a Son of God did not occur.

Several statistics for these tables were computed using Statgraphics 5.0 and S-Plus for Windows, both state of the art statistical software packages. All statistics supported the conclusion of extreme improbability. In this and all succeeding tables in this analysis, there is one degree of freedom, i.e. only one dimension for data independence. For Table 1A, the contingency coefficient C = 0.04619. This is a measure of association between two attributes and it is very low here, 4.6% to be precise, and of no significance in either the strict statistical sense or the everyday meaning of the word. The pattern of births for the Sons of God in the time interval 700 B.C. to 0 A.D. is completely unlike that over the entire 4,000 year span of the timeline. For Table 1B, where all of recorded history is examined, C = 0.049 and the conclusion is the same.

Strict significance levels for cross-tabulated data are usually assessed from the Chi-square statistic (X²). For Table 1A, X² = 10.05 and the significance level (p) is 1.52 x 10^-3. A variation in the computation of X² known as the Yates correction (X²_Y) takes into account extremely small sample sizes, a situation that we have here for SGby and SGab. For Table 1A, X²_Y = 7.73 with p = 5.43 x 10^-3. For Table 1B, X² = 13.88 with p = 1.95 x 10^-4 and X²_Y = 10.91 with p = 9.59 x 10^-4.

Fisher’s Exact Test assesses the probability that the data in the four cells of the table should be exactly as they are, assuming the row and column totals of the table are fixed (held constant).
Further assumptions are that the classes for categorizing are mutually exclusive (they certainly are here: either a deity being is born in a given year or is not); and that the underlying distribution of the data, while exhibiting a peak and two ‘tails’, need not be taken to be ‘normal’ (i.e. need not be the familiar bell-shaped curve). For our purposes, this is a particularly powerful statistical tool because no assumptions are made about the underlying distribution that characterizes the data. Such tests are termed ‘nonparametric’. For theoretical reasons, it is not applicable to Case 1 but will be applied to Cases 2 and 3.

McNemar’s Test for the equality of proportions (X²_M) is of particular interest here because it is also a distribution-free test, i.e. nonparametric. However, the row dichotomies must also be mutually independent. That requirement is also met here. Whether we choose a time interval over the range of time represented by the timeline, or that represented by all of recorded history, the decisions are not dependent upon one another. The Yates correction for continuity is employed in the computation of this Chi-square. For Table 1A, X²_M = 671.3 and p = 0. The null hypothesis being tested is the symmetry of probabilities associated with each row and column of the table: i.e. p_ij = p_ji for each combination of i and j. This huge X²_M tells us that this hypothesis of symmetric probabilities cannot be accepted: there is no symmetry between the numbers in column 1 when compared to column 2 or between row 1 when compared to row 2. The latter conclusion is important because it tells us that the proportion of deity beings born in the first seven centuries B.C. is strikingly dissimilar to that born over the entire time span represented by the timeline.
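
As a rough check on the McNemar figure quoted above, note that the continuity-corrected statistic for a 2 x 2 table depends only on the two off-diagonal cells. The sketch below is a minimal Python illustration, not the S-Plus routine itself: mcnemar_corrected is a hypothetical helper introduced here, and the formula (|b - c| - 1)² / (b + c) is the standard textbook form, assumed to match what the software computes.

```python
import math

def mcnemar_corrected(table):
    """Continuity-corrected McNemar chi-square for a 2 x 2 table.

    Hypothetical helper for illustration; only the off-diagonal
    cells b = table[0][1] and c = table[1][0] enter the statistic.
    """
    b = table[0][1]
    c = table[1][0]
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # With one degree of freedom, the upper-tail chi-square
    # probability equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

table_1a = [[6, 694], [7, 3993]]  # Table 1A
table_1b = [[6, 694], [7, 4993]]  # Table 1B
stat_1a, p_1a = mcnemar_corrected(table_1a)
stat_1b, p_1b = mcnemar_corrected(table_1b)
print(round(stat_1a, 3), round(stat_1b, 3))
```

Tables 1A and 1B share the same off-diagonal cells (694 and 7), so this calculation returns the same value, about 671.32, for both.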
The X²_M for Table 1B is the same and, not surprisingly, we reach the same conclusion when all of recorded history is considered.

Case 2: Sons of God - Model I for Expected Births

Routine formulas based upon theoretical statistical models allow for the easy computation of the expected value of any observation in data that are cross tabulated. The expected values calculated by this procedure are rounded off to the nearest whole number; certainly fractional people or deity beings make no sense and could not exist.

Table 2A: Model I for Expected Births - Timeline only

                       SGob    SGex
700 BC to 0 AD            6       2
3000 BC to 900 AD         7      10

Table 2B: Model I for Expected Births - All recorded history

                       SGob    SGex
700 BC to 0 AD            6       2
3000 BC to 2000 AD        7      11

where:
SGob = observed births of Sons of God (= SGby in Case 1); and
SGex = expected births of Sons of God for the time interval as computed by the standardized statistical approach.

In Case 2, we are assessing the degree to which the expected numbers differ from the observed. This is a very direct approach. Unlike Case 1, the question of interest may be asked directly. Are there more Sons of God in the time interval 700 B.C. to 0 A.D. than would be expected assuming their existence to be spread evenly throughout recorded history?

For Table 2A, which tabulates the data for the interval represented by the timeline, C = 0.30, X² = 2.49 with p = 0.114 and X²_Y = 1.32 with p = 0.250. For Table 2B, which tabulates the data for all of recorded history, C = 0.32, X² = 2.89 with p = 0.089 and X²_Y = 1.63 with p = 0.202. While the Chi-square values are not nearly as high as in Case 1 and the corresponding p values are greater, there is still a firm case for the nonidentity of the two rows of data and the extreme improbability of the observed births of Sons of God. For Table 2A, Fisher’s Exact Test gives p = 0.125 for the one-tailed test and p = 0.202 for the two-tailed test. For Table 2B, Fisher’s Exact Test gives p = 0.100 for the one-tailed test and p = 0.202 for the two-tailed test.
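
The Fisher’s Exact Test figures for Case 2 can be checked by direct enumeration. The following Python sketch is an illustration under stated assumptions rather than the S-Plus algorithm: fisher_exact_2x2 is a hypothetical helper, the row and column totals are held fixed, and the two-tailed value sums every table whose hypergeometric probability is no larger than that of the observed table (one common convention, assumed here).

```python
from math import comb

def fisher_exact_2x2(table):
    """One- and two-tailed Fisher exact p-values for a 2 x 2 table.

    Hypothetical helper for illustration. The margins are held fixed,
    so the top-left cell follows a hypergeometric distribution.
    """
    (a, b), (c, d) = table
    row1, col1, col2 = a + b, a + c, b + d
    lo = max(0, row1 - col2)
    hi = min(row1, col1)
    # Unnormalised integer weights avoid floating-point ties.
    weights = {k: comb(col1, k) * comb(col2, row1 - k)
               for k in range(lo, hi + 1)}
    total = sum(weights.values())
    observed = weights[a]
    # One-tailed: top-left cell at least as large as observed.
    one_tailed = sum(w for k, w in weights.items() if k >= a) / total
    # Two-tailed: all tables no more probable than the observed one.
    two_tailed = sum(w for w in weights.values() if w <= observed) / total
    return one_tailed, two_tailed

for name, table in [("2A", [[6, 2], [7, 10]]), ("2B", [[6, 2], [7, 11]])]:
    one, two = fisher_exact_2x2(table)
    print(name, round(one, 3), round(two, 3))
```

For Table 2A this enumeration gives roughly 0.125 (one-tailed) and 0.202 (two-tailed).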
The conclusion is that the distribution of Sons of God birth years as tabulated is very unusual; the probability that it would fall out as observed is, at most, 20%. For Table 2A, X²_M = 1.778 with p = 0.18: there is no symmetry between the probabilities associated with each row and column of the table. The same conclusion is reached for Table 2B, where X²_M is the same. There is no similarity in the proportion of Sons of God born to those expected in the first seven centuries B.C. when compared to that ratio tabulated for either all recorded history to 900 A.D. or all recorded history to 2,000 A.D.

Case 3: Sons of God - Model II for Expected Births

In the second model for expected births, the expected number of Sons of God per time interval is computed from a hypothesis of total randomness; i.e. such births are equally probable in any year.

Table 3: Model II of Sons of God - Expected Births

                       SGob    SGex
700 BC to 0 AD            6       1
3000 BC to 900 AD         7       7

The expected numbers calculated under this hypothesis are the same if the entire span of recorded history, c. 3000 B.C. to c. 2000 A.D., is considered, and therefore this one table encompasses both scenarios. For this model, C = 0.33, X² = 2.53 and p = 0.112, while X²_Y = 1.24 with p = 0.266. Clearly, there is a very significant difference between the observed number of deity being births in each time interval and the number one might expect assuming complete randomness of such events. Fisher’s Exact Test yields p = 0.133 for the one-tailed test and p = 0.174 for the two-tailed test, thereby confirming the highly improbable distribution of observations as seen in Table 3. McNemar’s X²_M = 3.125 with p = 0.0771, and the null hypothesis is once again not confirmed, although not at the striking level of the analyses above. Perhaps the ad hoc assumption of randomness, which bypasses the usual statistical procedures, is not valid.
The relatively low p value here illustrates the distorting effects of the small sample size, which even the Yates correction cannot compensate for adequately.

Each of the three tabulations of data and their respective analyses provides the same conclusion. These statistical tools confirm in a rigorous manner what our intuition suspected upon casually glancing at the timeline. The number of Sons of God clustered in the interval 700 B.C. to 0 A.D. is truly extraordinary, exceptional and improbable!

References

Bradley, J.V. 1968. Distribution Free Statistical Tests. Englewood Cliffs, N.J.: Prentice Hall.
Siegel, S. 1956. Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.
Simpson, G.G., A. Roe and R.C. Lewontin. 1960 rev. ed. Quantitative Zoology. New York: Harcourt, Brace & World.
Sokal, R.R. and F.J. Rohlf. 1980 2nd ed. Biometry. San Francisco: W.H. Freeman.
SGC. 1991. Statgraphics Version 5.0. Reference Manual. Rockville, Md: STSC, Inc.
StatSci. 1993. Reference Manual Vol. 1. Seattle: MathSoft Inc.
StatSci. 1993. User’s Manual Vol. 2. Seattle: MathSoft Inc.

Appendix A: Data Analysis from S-Plus for Windows

For those of you who are curious and technically inclined, here is the printout from S-Plus for Windows for much of the analysis discussed above. Notice that some tests were not performed because the data set did not fit required constraints.

S-PLUS : Copyright (c) 1988, 1993 Statistical Sciences, Inc.
S : Copyright AT&T.
Version 3.1 Release 3 for MS Windows 3.1 : 1993
Working data will be in _Data

Table 1A: Sons of God - Presence/Absence Analysis (10,000 B.C.-900 A.D.)

> snsgd1amat <- rbind(c(6,694),c(7,3993))
> snsgd1amat
     [,1] [,2]
[1,]    6  694
[2,]    7 3993
> chisq.test(snsgd1amat)

 Pearson's chi-square test with Yates' continuity correction

data: snsgd1amat
X-squared = 7.7291, df = 1, p-value = 0.0054

Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd1amat)
> chisq.test(snsgd1amat)$p.value
[1] 0.005433664
Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd1amat)
> fisher.test(snsgd1amat)
Error in fisher.test(snsgd1amat): Sum of counts in table > 200
Dumped
> fisher.test(snsgd1amat)$p.value
Error in fisher.test(snsgd1amat): Sum of counts in table > 200
Dumped
> mcnemar.test(snsgd1amat)

 McNemar's chi-square test with continuity correction

data: snsgd1amat
McNemar's chi-square = 671.321, df = 1, p-value = 0

> mcnemar.test(snsgd1amat)$p.value
[1] 0

Table 1B: Sons of God - Presence/Absence Analysis 10,000 B.C. - 2,000 A.D.

> snsgd1b.mat <- rbind(c(6,694),c(7,4993))
> snsgd1b.mat
     [,1] [,2]
[1,]    6  694
[2,]    7 4993
> chisq.test(snsgd1b.mat)

 Pearson's chi-square test with Yates' continuity correction

data: snsgd1b.mat
X-squared = 10.9054, df = 1, p-value = 0.001

Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd1b.mat)
> chisq.test(snsgd1b.mat)$p.value
[1] 0.0009588607
Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd1b.mat)
> fisher.test(snsgd1b.mat)
Error in fisher.test(snsgd1b.mat): Sum of counts in table > 200
Dumped
> fisher.test(snsgd1b.mat)$p.value
Error in fisher.test(snsgd1b.mat): Sum of counts in table > 200
Dumped
> mcnemar.test(snsgd1b.mat)

 McNemar's chi-square test with continuity correction

data: snsgd1b.mat
McNemar's chi-square = 671.321, df = 1, p-value = 0

> mcnemar.test(snsgd1b.mat)$p.value
[1] 0
>

Table 2A: Model I for Expected Births - 10,000 B.C. - 900 A.D.

> snsgd2a.mat <- rbind(c(6,2),c(7,10))
> snsgd2a.mat
     [,1] [,2]
[1,]    6    2
[2,]    7   10
> chisq.test(snsgd2a.mat)

 Pearson's chi-square test with Yates' continuity correction

data: snsgd2a.mat
X-squared = 1.3224, df = 1, p-value = 0.2502

Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd2a.mat)
> chisq.test(snsgd2a.mat)$p.value
[1] 0.25016
Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd2a.mat)
> fisher.test(snsgd2a.mat)

 Fisher's exact test

data: snsgd2a.mat
p-value = 0.2016
alternative hypothesis: two.sided

> fisher.test(snsgd2a.mat)$p.value
[1] 0.2015563
> mcnemar.test(snsgd2a.mat)

 McNemar's chi-square test with continuity correction

data: snsgd2a.mat
McNemar's chi-square = 1.7778, df = 1, p-value = 0.1824

> mcnemar.test(snsgd2a.mat)$p.value
[1] 0.1824224
>

Table 2B: Model I for Expected Births - 10,000 B.C. - 2,000 A.D.

> snsgd2b.mat <- rbind(c(6,2),c(7,11))
> snsgd2b.mat
     [,1] [,2]
[1,]    6    2
[2,]    7   11
> chisq.test(snsgd2b.mat)

 Pearson's chi-square test with Yates' continuity correction

data: snsgd2b.mat
X-squared = 1.625, df = 1, p-value = 0.2024

Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd2b.mat)
> chisq.test(snsgd2b.mat)$p.value
[1] 0.202396
Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd2b.mat)
> fisher.test(snsgd2b.mat)

 Fisher's exact test

data: snsgd2b.mat
p-value = 0.2016
alternative hypothesis: two.sided

> fisher.test(snsgd2b.mat)$p.value
[1] 0.2015558
> mcnemar.test(snsgd2b.mat)

 McNemar's chi-square test with continuity correction

data: snsgd2b.mat
McNemar's chi-square = 1.7778, df = 1, p-value = 0.1824

> mcnemar.test(snsgd2b.mat)$p.value
[1] 0.1824224

Table 3: Model II Sons of God - Expected Births

> snsgd3.mat <- rbind(c(6,1),c(7,7))
> snsgd3.mat
     [,1] [,2]
[1,]    6    1
[2,]    7    7
> chisq.test(snsgd3.mat)

 Pearson's chi-square test with Yates' continuity correction

data: snsgd3.mat
X-squared = 1.2368, df = 1, p-value = 0.2661

Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd3.mat)
> chisq.test(snsgd3.mat)$p.value
[1] 0.2660928
Warning messages:
 Expected counts < 5. Chi-squared approximation may not be appropriate.
 in: chisq.test(snsgd3.mat)
>
> fisher.test(snsgd3.mat)

 Fisher's exact test

data: snsgd3.mat
p-value = 0.1736
alternative hypothesis: two.sided

> fisher.test(snsgd3.mat)$p.value
[1] 0.1735811
>
> mcnemar.test(snsgd3.mat)

 McNemar's chi-square test with continuity correction

data: snsgd3.mat
McNemar's chi-square = 3.125, df = 1, p-value = 0.0771

> mcnemar.test(snsgd3.mat)$p.value
[1] 0.07709987
>

These data do not meet the requirements for computation of any of these three correlation coefficients (Pearson, Kendall, Spearman) according to S-Plus algorithms.

> sgab1a.hs <- scan()
1: 694.0 3993.0
3:
> cor.test(sgby1a.hs,sgab1a.hs)
Error in cor.test(sgby1a.hs, sgab1a.hs): x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="two.side","p")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "t..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="two.sided,"p")
Syntax error: name ("p") used illegally at this point:
cor.test(sgby1a.hs,sgab1a.hs,alt="two.sided,"p
> cor.test(sgby1a.hs,sgab1a.hs,alt="g",method="p")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "g..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="1",method="k")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "1..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="1",method="s")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "1..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="g",method="p")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "g..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="g",method="k")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "g..: x and y should effectively be longer than 2
Dumped
> cor.test(sgby1a.hs,sgab1a.hs,alt="g",method="s")
Error in cor.test(sgby1a.hs, sgab1a.hs, alt = "g..: x and y should effectively be longer than 2
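
For readers without access to S-Plus, the Yates-corrected Chi-square values in the printout above can be approximated with a few lines of Python. This is a sketch only, not the S-Plus implementation: chi_square_2x2 is a hypothetical helper, the shortcut formula N(|ad - bc| - N/2)² divided by the product of the row and column totals is the standard textbook form for a 2 x 2 table, and the p value relies on the one-degree-of-freedom identity p = erfc(sqrt(X²/2)).

```python
import math

def chi_square_2x2(table, yates=True):
    """Pearson chi-square for a 2 x 2 table, optionally Yates-corrected.

    Hypothetical helper for illustration; returns (statistic, p_value)
    for one degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction, never pushed below zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Cell counts copied from the tables in this document.
tables = {
    "1A": [[6, 694], [7, 3993]],
    "1B": [[6, 694], [7, 4993]],
    "2A": [[6, 2], [7, 10]],
    "2B": [[6, 2], [7, 11]],
    "3": [[6, 1], [7, 7]],
}
for name, table in tables.items():
    stat, p = chi_square_2x2(table)
    print(f"Table {name}: X-squared = {stat:.4f}, p-value = {p:.4f}")
```

Calling chi_square_2x2(table, yates=False) gives the uncorrected X² instead, which allows the two versions quoted in the text to be compared side by side.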