-
- USENET READERSHIP SUMMARY REPORT for Jan 95
- ***Note: there have been several months without a readership report.
- The reason for the hiatus is that a lot of sites have been submitting
- forged data to make their favorite newsgroups look widely read. Last
- month's report would have shown "alt.activism" to be the most popular
- newsgroup on the network. Nearly all of the forgeries are coming
- from Europe.
-
- I have analyzed all data submitted over the last few years and from
- this analysis I have programmed a statistical "forged data rejector".
- This report for January 1995 is the first to exclude all forged data;
- as I look back through the historical reports, some sites started
- small-scale doctoring of the data in early 1994, but the practice
- did not become rampant until summer 1994.
-
- Forgery detection is of course a cat-and-mouse game, and if these
- people are serious about disrupting the numbers, they will find a way
- to circumvent my forgery detector, and sooner or later the reports
- will degrade again.
-
- --------------------------------------------------------------------------
- This is the first article in a monthly posting series from the Network
- Measurement Project at the DEC Network Systems Laboratory in Palo Alto,
- California.
-
- This survey is based on a sample of data taken from various USENET sites.
- At the end of this message there is a short explanation of the measurement
- techniques and the meaning of the various statistics. The messages that
- follow this one show survey data sorted by various criteria.
-
- The newsgroup volume and article counts that I post are often significantly
- different from the ones posted by Rick Adams, because he includes the size of
- a crossposted article in every group to which it is posted, whereas I charge
- that size only to the first-named group.
-
- The complete set of readership data (of which this is a summary) is posted
- in news.lists. The software that will let your site participate in the
- survey is in comp.sources.d and news.admin.
-
- Brian Reid
- reid@pa.dec.com
-
-
- OVERALL SUMMARY:
-                                        This        Estimated
-                                       Sample     for entire net
- Sites:                                   453           260000
- Fraction reporting:                    0.17%             100%
- Users with accounts:                  190664         47579000
- Netreaders:                            66123         16500000
-
- Average readers per site:                146
- Percent of users who are netreaders:  34.68%
- Average traffic per day (megabytes): 242.204
- Average traffic per day (messages):    84719
- Traffic measurement interval:        last 28 days
- Readership measurement interval:     last 75 days
- Sites used to measure propagation:       453
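
The derived figures in the summary follow from the raw sample counts. The sketch below recomputes three of them; the function and variable names are illustrative and are not taken from the survey software.

```python
# Figures copied from the OVERALL SUMMARY table.
sample_sites = 453
estimated_sites = 260000
sample_users = 190664
sample_netreaders = 66123


def summary_ratios(sites, net_sites, users, readers):
    """Return (fraction reporting in %, readers per site, % of users who read news)."""
    fraction_reporting = 100.0 * sites / net_sites
    readers_per_site = readers / sites
    percent_netreaders = 100.0 * readers / users
    return fraction_reporting, readers_per_site, percent_netreaders


frac, per_site, pct = summary_ratios(
    sample_sites, estimated_sites, sample_users, sample_netreaders
)
print(f"Fraction reporting: {frac:.2f}%")                    # 0.17%
print(f"Average readers per site: {per_site:.0f}")           # 146
print(f"Percent of users who are netreaders: {pct:.2f}%")    # 34.68%
```

The "Estimated for entire net" user and reader counts are not reproduced here, because the report does not spell out how they are scaled.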
-
-
- Valid data received from these sites:
-
- 6sigma(5) actew.oz.au(811) adolfoien.vgs.no(2) aedi.insa-lyon.fr(510)
- airs.com(8) alanya.isar.muc.de(10) alchemy(371) alex(0) alfred(4)
- alsys.com(125) alsys.de(15) anakena.dcc.uchile.cl(7)
- angelo.healthchex.com(24) angus.mystery.com(35) animal.inescn.pt(247)
- anorad.com(119) apricot.co.uk(80) arakis.fdn.org(8) atfs0(174)
- awful(13) aztec(140) badlands.nodak.edu(8838)
- barnard.manawatu.planet.co.nz(5) bat710.univ-lyon1.fr(508)
- bcstec.ca.boeing.com(863) beauty(19) belvedere(7) bgsuvax(1110)
- bigwheel(200) blackice(1) blkhole(14) bohemia(85)
- bohr.phys.ksu.edu(286) boy(6) bsuvc.bsu.edu(11358)
- btoy1.rochester.ny.us(16) cabezon(201) caipfs.rutgers.edu(21)
- cam-orl.co.uk(113) caribou.msfc.nasa.gov(11) carver.wa.com(69) ccs3(1)
- cello(539) centre.univ-orleans.fr(191) cerritos.edu(987) cfctech(36)
- cgate.sait.ab.ca(580) chekov(6) chemeng.ed.ac.uk(80) cheops(231)
- cherry(33) chiark(7) chinaca(21) chinacat(20) chuck.sycraft.com(3)
- ci.org(173) cigna(708) cis(64) cleo(3)
- clpd-newsserver.clpd.kodak.com(744) clpgh.org(418) cnplss5(99)
- codewks(196) cognos(291) colossus(1233) coral(39)
- cpvax.cpses.tu.com(63) cradac(0) cronus(46)
- csdvax.csd.unsw.edu.au(2188) cspyr0(79) csustan.csustan.edu(27)
- cub.kscorp.com(13) cutler.com(10) cuugnet(832) cvedg(3) cwis(347)
- dante(21) dante.migsol.com(21) datani.dk(17) dciem(196) desc.dla.mil(7)
- devon(5) digi(1280) dimacs.rutgers.edu(587) discg2.disc.dla.mil(10)
- discg3.disc.dla.mil(1008) discg4.disc.dla.mil(77) disunms(1093)
- disuns2(972) dogface(1) dorm.rutgers.edu(249) dove(146) dplace(0)
- drager.com(275) dragon.com(42) drd(39) drum.msfc.nasa.gov(26)
- dsacg2.dsac.dla.mil(5) dsbc.icl.co.uk(135) dsinet(16) duke(550)
- dumbcat.sf.ca.us(11) dutiws.twi.tudelft.nl(424) earlgrey.exnet.com(1)
- ees1a0.engr.ccny.cuny.edu(9) egreen(733) eis.calstate.edu(6492)
- elements.rpal.rockwell.com(83) elmo(21) elsie(5) ember(3) eonwe(87)
- eos(174) eram.esi.com.au(72) ernest(26) ernie(15) esatst(19)
- esslemont.manawatu.planet.co.nz(3) europa(184) europa.com(28)
- fasterix.frmug.fr.net(8) fdmetd(7) fermat(274) filomen(0) flab(179)
- flatlin(13) franz.com(66) freedm(10) freenet-news(34438)
- gauss.rutgers.edu(216) gcc.edu(884) geac(234) geovax.ed.ac.uk(226)
- getank(52) giga(265) gistdev(57) gmdtub(251) golem(2) goofy(319)
- gordius(58) gouldnl(52) gozer(9) grafex(26) grian(20) gtisqr(17)
- gypsum.berkeley.edu(98) halcyon(7248) hammer.msfc.nasa.gov(24)
- hamnet(25) harrnl(23) hawkmoon(0) hccompare.com(726) hhcs.gov.au(5)
- hhvo.sjoe.mil.no(11) hilbert.rutgers.edu(135) hiram.edu(878) hodgson(2)
- hornet(1) hp400p(34) humming(98) iamk4515(44) iat.holonet.net(6350)
- iclnet93.iclnet.org(28) ics.uci.edu(422) iecc(10) iesd.auc.dk(405)
- ifens01.insa-lyon.fr(20) ifhamy.insa-lyon.fr(233)
- ifhpserv.insa-lyon.fr(69) ifi.uio.no(3353) iitmax(1386) imagelan.com(5)
- imladris(30) imperium(12) impreza(134) inescca.inescc.pt(37)
- infodyn(20) infopro.infopro.com(10) infotax(1) intrepid(18)
- investor(10) iris.claremont.edu(7) iris.mbvlab.wpafb.af.mil(143)
- islabs(16) isys-hh(147) ixi(4) jabba.ess.harris.com(128) james(79)
- jaws(225) jerrwood(1) jfwhome.funhouse.com(17) johnny5(2) jove(43)
- jtmiii.uucp(2) julian.uwo.ca(4150) jupiter(68) kaepk.ericsson.se(69)
- kaepk1.ericsson.se(81) kaepk3.ericsson.se(46) kaepk4.ericsson.se(91)
- kala(11) kalle.impab.se(2) kb2ear.overleaf.com(56)
- keltia.frmug.fr.net(50) khijol(43) kksys(104) kofax(69) krason(11)
- lakes(250) latour(8) ledger.co.forsyth.nc.us(142) lkbreth(50)
- loretta(32) lpi(59) m2xenix.psg.com(173) macdona(103) macdonal(242)
- magic.capsogeti.fr(130) mahavir(1) manger.modeld.no(1) mantis.co.uk(21)
- marriott.clark.net(260) mars(257) martex(10) math.berkeley.edu(5)
- math.rutgers.edu(465) mathstat(44) matrox(489) matrx(21) maya(50)
- mcmi(40) mcsiad(3) mdtvus.com(26) metasoft(34) mica.berkeley.edu(72)
- miclon(64) midas(39) missing(799) mnemosyne.cs.du.edu(210) modus(38)
- mole.hawkesbay.planet.co.nz(8) monygmc(21) monymsys(6) moonbase(32)
- mr-pibb(779) mtdiablo(19) mtroyal.ab.ca(733) mts(13) muselab(734)
- nad.com(285) nanovx(8) nasim(86) nate(2) ncoast(676) ncrlisl(134)
- neodata(1028) netagw(5) netline-fddi(9) news-server.aa.cad.com(287)
- news-server.aa.cad.slb.com(275) news.cis.ohio-state.edu(3638)
- news.ilx.com(181) news.loria.fr(527) newton.isa.de(58) nezsdc(4)
- nicmad(355) nj8j(9) nmc(1) nmrdc1(8) nocusuhs(17)
- nosun.west.sun.com(35) noweh.com(3) nri-e(67) nrlvx1.nrl.navy.mil(325)
- nrlvx2.nrl.navy.mil(321) nttta(56) numachi(16)
- nyx10.nyx10.cs.du.edu(18123) obdient(29) ocean(90) oslonett.no(4077)
- oucsace.cs.ohiou.edu(779) ovation(252) overload(2540)
- pasadena-dc.bofa.com(24) pbhya(105) pbhyb(239) pbhyc(279) pbhyd(132)
- pbhye(186) pbhyg(256) pentagon-ai(94) phage(521) pi19(103) piaggio(50)
- picasso(133) platon.transport.tih.no(3) plxsun(221) pmafire(180)
- practic.practic.com(8) presby.edu(254) primerd(102) prism1(31) pta(177)
- ptsfa(100) pute.cmhnet.org(11) pylon(9) pyramid(43) pyratl(41)
- qiclab.scn.rain.com(28) quando(214) qucdnee.ee.queensu.ca(37)
- qucdntri.ee.queensu.ca(25) quest(313) questrel(21) railnet(12)
- raindrop(6) raybed2(1172) rayleigh(103) rci(1068) rebel(4) redpoll(3)
- redshirt.cc.rochester.edu(24) residents(8) resonex(36) rhi.hi.is(4334)
- robohack(95) robtoy.manawatu.planet.co.nz(4) rochester(230) rosebud(2)
- rosedale(0) roselin(1051) rsd0(26) rtxirl.rtxirl.ie(38) ruacad(636)
- rubb.rz.ruhr-uni-bochum.de(20) rucs(18) rucs2(166) rufus(536)
- rulcde.leidenuniv.nl(14) rulcvx(0) rutcor.rutgers.edu(170)
- rutgers.rutgers.edu(60) sactoh0.sac.ca.us(107) sadtler(31) saturn(52)
- sauron.msfc.nasa.gov(20) sausage.manawatu.planet.co.nz(3)
- sausage.taranaki.planet.co.nz(6) scarboro(280)
- scfe.chinalake.navy.mil(577) scicom(53) scow(466) scrash(8) sdl(85)
- seanews(389) seer(41) sgfb(127) shiva.com(254) si.sintef.no(246)
- sis.stockell.com(20) skyking(16) slcl.lib.mo.us(69)
- sol.ctr.columbia.edu(278) sooner.palo-alto.ca.us(2) sparky(5)
- spatial.com(93) spock.retix.com(78) spunky.redbrick.com(129)
- srchtec(23) stat(39) stephsf.com(20) stephsf.stephsf.com(20)
- student(511) summit(34) sun19(37) sunburn.stanford.edu(227) suned1(731)
- sycraft.com(6) symbiosis.ahp.com(347) synercom(23) tachyon.com(11)
- tardis(145) tarzan(219) taylor.manawatu.planet.co.nz(3) tct(13)
- tellab5(1749) tembel(11) teslab(28) theseas(866) tijger.fys.ruu.nl(552)
- til(18) tintin.csl.sni.be(0) titan(414) tol-ed.com(43) torrie(13)
- totaltec.com(133) tower(1) tower.nullnet.fi(31) tram(4)
- troi.cc.rochester.edu(715) ttsi(63) tukki(2099) turtle.fisher.com(253)
- tut.msstate.edu(5415) twg(19) ubaclu.unibas.ch(1279)
- ucbeh.san.uc.edu(3666) uhura.cc.rochester.edu(4258) ukma(722)
- umd5.umd.edu(3713) uniwa(2099) unvax.union.edu(2195) ursa(923)
- urz.unibas.ch(1237) utdoe(21) utgpu(726) uunet(290)
- valinor.mythical.com(244) valnet(117) vanlib.fvrl.org(28)
- vela.acs.oakland.edu(8148) venus(162) vicuna(24) visicom(102)
- visual(34) vms.ocom.okstate.edu(197) voodoo.ca.boeing.com(94)
- warwick(14802) water.berkeley.edu(130) wb8apd(11) wcc(7)
- weaver.berkeley.edu(170) webworm.berkeley.edu(838) weitek.com(122)
- wesel(37) wetware(7) wheaton.wheaton.edu(17) whscdp.whs.edu(462)
- widow.berkeley.edu(260) wizvax(177) wofford.edu(687)
- wolf.berkeley.edu(173) wshb(43) wsrcc.com(5) wvml.jeslacs.bc.ca(25)
- wvus(0) xenitec(42) xopuk(0) xtree(6) yage(12) zorch(11)
-
- ------------------------------------------------------------------------------
- EXPLANATION OF THE MEASUREMENTS AND STATISTICS
-
- Survey data is taken by having one person at each site run a program called
- "arbitron", which looks at the news or notes files and determines the
- newsgroups that the user has read within a recent interval. To "read" a
- newsgroup means to have been presented with the opportunity to look at at
- least one message in it. Going through a newsgroup with the "n" key counts
- as reading it. For a news site, "user X reads group Y" means that user X's
- .newsrc file has marked at least one unexpired message in Y. If there is no
- traffic in a newsgroup for the measurement period, then the survey will show
- that nobody reads the group. For a notes site, "user X reads group Y" means
- that user X has been in the notesfile with the sequencer in the last 14 days.
- The "14 days" interval for notesfiles corresponds to "unexpired" for news.
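
The news-site rule above can be sketched as a check on one .newsrc line. This is a minimal illustration, not the arbitron program itself. It assumes the common .newsrc layout ("group: 1-120,125", with "!" in place of ":" for unsubscribed groups) and a hypothetical `oldest_unexpired` article number supplied by the caller.

```python
def parse_newsrc_line(line):
    """Split one .newsrc line into (group name, list of (low, high) read ranges).

    Returns None if the line has no ':' or '!' separator.
    """
    line = line.strip()
    for i, ch in enumerate(line):
        if ch in ":!":
            group, rest = line[:i], line[i + 1:]
            break
    else:
        return None
    ranges = []
    for part in rest.replace(" ", "").split(","):
        if not part:
            continue
        low, _, high = part.partition("-")
        ranges.append((int(low), int(high or low)))
    return group, ranges


def reads_group(line, oldest_unexpired):
    """True if the line marks at least one article numbered >= oldest_unexpired."""
    parsed = parse_newsrc_line(line)
    if parsed is None:
        return False
    _group, ranges = parsed
    return any(high >= oldest_unexpired for _low, high in ranges)
```

For example, `reads_group("rec.humor.funny: 1-120,125", 100)` is true, while a line whose marked articles have all expired counts as not reading the group.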
-
- The "arbitron" program is periodically posted to comp.sources.d, or is
- available from me (decwrl!reid). The notesfiles version of the program should
- be available through standard notesfiles software distribution channels as
- well.
-
- SITES SURVEYED IN THIS SAMPLE
-
- "This Sample" means the set of sites that have sent in an arbitron report
- within the past "Readership measurement interval" days. In every case the
- most recent report from each site is used. At the moment, some of the
- readership reports are several months old. In future postings those reports
- will have expired and will not be included.
-
- The number in parentheses after the site name is the number of users that the
- site reported. A value of (0) usually means that the software has been
- configured to use the wrong technique for counting users at that site; a
- report showing 0 users but 6 readers of rec.humor.funny is statistically
- meaningful.
-
- One might argue that the sample is self-selected, and therefore biased. It
- does in fact have a certain self-selection factor in it, because we only get
- data from sites at which someone participates in the survey. However, we do
- not require the participation of every user at a site, only one user. The
- survey program returns data for every user on the system on which it was run.
- Since there are an average of 30 people per site reading news, there is a
- certain amount of randomness introduced that way. Of course, the sample is
- biased in favor of large sites (they are more likely to have a user willing
- to run the survey program) and software-development-oriented sites (more
- likely to have a user *able* to run the survey program).
-
- NETWORK SIZE
-
- I determine the network size by looking at the set of sites that are
- mentioned in the Path lines of news articles arriving at decwrl. This number
- is consistently higher than the number of sites that posted a message (as
- measured and posted from uunet) because it includes passive sites that are
- on the paths between posting sites and decwrl. Each month I store the names
- of the hosts that are named that month, and for this report I used the past
- 13 months' worth of data.
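
Counting distinct sites from Path lines might look like the sketch below. It assumes the usual bang-path form ("Path: decwrl!uunet!site!user") and that the final component is the poster's user name rather than a host. Both are assumptions of this example, as are the function names.

```python
def sites_in_path(path_header):
    """Return the host names on a Path line, dropping the last component.

    The last component is assumed to be the poster's user name.
    """
    value = path_header
    if value.lower().startswith("path:"):
        value = value.split(":", 1)[1]
    parts = [p.strip() for p in value.strip().split("!") if p.strip()]
    return parts[:-1]


def count_sites(path_headers):
    """Count distinct host names seen across many Path lines."""
    seen = set()
    for header in path_headers:
        seen.update(sites_in_path(header))
    return len(seen)


headers = [
    "Path: decwrl!uunet!halcyon!alice",
    "Path: decwrl!uunet!warwick!bob",
    "Path: decwrl!nyx10!carol",
]
print(count_sites(headers))  # 5 distinct hosts
```

A 13-month figure would be obtained the same way by taking the union of the monthly host sets.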
-
- There are 257417 different sites in the Path lines of articles that
- arrived at decwrl in the last 13 months. There are 19296
- different sites in the comp.mail.maps data, but comp.mail.maps tends to
- include only one or two machines for each organization, leaving the rest
- unmentioned. Also a large number of sites participate in USENET without
- participating in UUCP.
-
- I believe that 260000 is the best estimate for the size of USENET.
- Because it is actually a measurement of the number of sites that have posted
- a message or that are on the path to a site that has posted a message, it
- will be slightly smaller than the number of sites that actually read netnews.
- Any site that believes it is not being counted can just ensure that it posts
- at least one message a year, so that it will be counted.
-
-
- NUMBER OF USERS
-
- The number of users at each site is determined in a site-specific fashion.
- Sometimes it is done by counting the number of user accounts that have
- shells and login directories. Sometimes it is done by counting the number of
- people who have logged in to the machine in some interval. Sometimes other
- techniques are used. This number is probably not very accurate--certainly
- not more accurate than to within a factor of two.
-
-
- ESTIMATED TOTAL NUMBER OF PEOPLE WHO READ THIS GROUP, WORLDWIDE
-
- There are two sources of error in this number. The number is computed by
- multiplying the number of people in the sample who actually read the group by
- the ratio of estimated network size to sample size. The estimated total can
- therefore be biased by errors in the network size estimate (see above) and
- also by errors in the determination of whether or not someone reads a group.
- Assuming that "reading a group" is roughly the same as "thumbing through a
- magazine", in that you don't necessarily have to read anything, but you have
- to browse through it and see what is there, then the measurement error will
- come primarily from inability to locate .newsrc files, which can either be
- protected or moved out of root directories. There is no way of measuring the
- effect on the measurements from unlocated .newsrc files, but it is not likely
- to be more than a few percent of the total news readers.
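
The scaling described above amounts to one multiplication. A minimal sketch, with hypothetical names and an invented sample count of 6 readers:

```python
def estimate_worldwide_readers(sample_readers, estimated_sites, sample_sites):
    """Scale a sample reader count by the ratio of network size to sample size."""
    return sample_readers * estimated_sites / sample_sites


# Hypothetical group with 6 readers in a 453-site sample, 260000 sites overall.
print(round(estimate_worldwide_readers(6, 260000, 453)))  # 3444
```

Any error in the 260000 figure passes straight through to the result, which is the first of the two error sources named above.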
-
- PROPAGATION: HOW MANY SITES RECEIVE THIS GROUP AT ALL
-
- This number is the percent of the sites that are even receiving this
- newsgroup. The information necessary to compute propagation was not generated
- by early versions of the arbitron program, so the "basis" (number of sites)
- used to generate the Propagation figure is smaller than the "Sites in this
- sample" figure. A site's data will be used to compute propagation if either
- (a) it reports zero readers for at least one group, or (b) it is using an
- arbitron with an explicit version number that is high enough.
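
The basis rule can be sketched as follows. The report layout, the `MIN_VERSION` threshold, and the idea that a site "receives" a group when the group appears in its report are all assumptions made for this example; the text above does not specify them.

```python
from dataclasses import dataclass

MIN_VERSION = 2.0  # hypothetical threshold for "high enough"


@dataclass
class SiteReport:
    name: str
    readers: dict          # group name -> reader count; 0 means received but unread
    version: float = 0.0   # 0.0 when the report carries no explicit version


def in_basis(report):
    """Rule (a): reports a zero-reader group, or rule (b): new enough arbitron."""
    has_zero = any(count == 0 for count in report.readers.values())
    return has_zero or report.version >= MIN_VERSION


def propagation(group, reports):
    """Percent of basis sites whose report lists the group; None if no basis."""
    basis = [r for r in reports if in_basis(r)]
    if not basis:
        return None
    receiving = sum(1 for r in basis if group in r.readers)
    return 100.0 * receiving / len(basis)


reports = [
    SiteReport("a", {"x": 3, "y": 0}),
    SiteReport("b", {"x": 1}, version=2.1),
    SiteReport("c", {"x": 5}),            # excluded: no zero counts, no version
    SiteReport("d", {"y": 0}),
    SiteReport("e", {"x": 0, "y": 2}),
]
print(propagation("x", reports))  # 75.0
```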
-
-
- MESSAGES PER MONTH AND KILOBYTES PER MONTH
-
- Traffic is measured at decwrl, in Palo Alto, California. If for some reason
- decwrl has not received any traffic in that newsgroup during the measurement
- period, this is indicated with dashes ("-") in the traffic columns.
-
- Any message that has arrived at decwrl within the last "Traffic measurement
- interval" days is counted, regardless of when it was posted. Monthly rates
- are computed by taking the total traffic, dividing by the number of days in
- the traffic measurement interval, and multiplying by 30.
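
As a worked example of the monthly-rate rule, the sketch below normalizes a 28-day total to a 30-day month. The function name is illustrative.

```python
def monthly_rate(total_in_interval, interval_days, days_per_month=30):
    """Convert a total measured over interval_days into a 30-day monthly rate."""
    return total_in_interval / interval_days * days_per_month


# 2800 messages seen during a 28-day interval is 100 per day, 3000 per month.
print(monthly_rate(2800, 28))  # 3000.0
```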
-
- By definition the message traffic values are correct, because they are an
- exact measurement, but they may differ from the traffic at your site because
- of differences in timing and propagation. Timing differences will be random,
- but will average out in the long run.
-
- If a message is crossposted to several groups, it is charged only to the
- first-named group in the list. Note that this differs from the statistics
- posted from uunet every 2 weeks: the uunet data charge a message equally to
- every group that it is crossposted to.
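
The two charging conventions can be compared side by side. This sketch assumes each article is given as its Newsgroups header value plus a size in bytes; the function names and the sample articles are invented for illustration.

```python
from collections import Counter


def groups_of(newsgroups_header):
    """Split a comma-separated Newsgroups value into group names."""
    return [g.strip() for g in newsgroups_header.split(",") if g.strip()]


def charge_first_group(articles):
    """Charge each article's size only to its first-named group (this survey)."""
    totals = Counter()
    for header, size in articles:
        groups = groups_of(header)
        if groups:
            totals[groups[0]] += size
    return totals


def charge_every_group(articles):
    """Charge each article's size to every group it is posted to (uunet style)."""
    totals = Counter()
    for header, size in articles:
        for group in groups_of(header):
            totals[group] += size
    return totals


articles = [
    ("alt.activism,talk.politics.misc", 1000),
    ("talk.politics.misc", 500),
]
print(dict(charge_first_group(articles)))
# {'alt.activism': 1000, 'talk.politics.misc': 500}
print(dict(charge_every_group(articles)))
# {'alt.activism': 1000, 'talk.politics.misc': 1500}
```

Under the second convention the per-group totals sum to more than the traffic actually carried, which is why the two sets of published figures differ.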
-
-
- CROSSPOSTING PERCENTAGE: WHAT FRACTION OF THE ARTICLES ARE CROSSPOSTED
-
- "Crossposting" means to post the same article simultaneously in more than one
- newsgroup. In genuine "news" systems crossposting is implemented with Unix
- links and does not increase the storage or transmission cost, though in some
- other systems crossposted articles are unbundled and must be stored and
- transmitted separately.
-
- The "crossposting percentage" is the percentage of the articles in this group
- that are crossposted to at least one other group. If every article in this
- group is crossposted, the percentage will be 100%; if none is crossposted,
- then the percentage will be 0%. The crossposting percentage figure does not
- take the size of the article into account, only the number of articles.
- Crossposting a 50,000-byte article and crossposting a 50-byte article each
- add the same amount to the tally.
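
A minimal sketch of the percentage, counting articles and ignoring their size. It assumes an article counts as "in" a group when the group appears anywhere in its Newsgroups list; that reading, and the names used, are assumptions of this example.

```python
def crossposting_percentage(group, articles):
    """Percent of the group's articles that name at least one other group.

    articles is an iterable of lists of group names, one list per article.
    Returns None when the group has no articles at all.
    """
    in_group = [groups for groups in articles if group in groups]
    if not in_group:
        return None
    crossposted = sum(1 for groups in in_group if len(groups) > 1)
    return 100.0 * crossposted / len(in_group)


articles = [["a", "b"], ["a"], ["a", "c"], ["a"], ["b"]]
print(crossposting_percentage("a", articles))  # 50.0
```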
-
-
- COST RATIO: DOLLARS PER MONTH PER READER
-
- The most controversial field in the survey report is the "$US per month per
- reader". It is the estimated number of dollars that are being spent on behalf
- of each reader, worldwide, on telephone and computer costs to transmit this
- newsgroup. The rate of $.0025 per kilobyte is the same value used in the
- UUNET statistics reported biweekly. It is based on discussions among system
- administrators about the true cost of news transmission.
-
- The cost ratio is computed as follows:
-
- $US/month/reader = ($USPerMonthPerSite * numberOfSites) / numberOfReaders
- $USPerMonthPerSite = KBytesTrafficPerMonth * $USPerKByte * PropagationFactor
- $USPerKByte = 0.0025
-
- Combining all these gives
-
- $USPerMonthPerSite =
- KBytesTrafficPerMonth * 0.0025
- = KBytesTrafficPerMonth / 400
-
- Therefore:
-
- $US/month/reader =
- (KBytesTrafficPerMonth * numberOfSites) / (400 * numberOfReaders)
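
The formula can be sketched directly. The combined form above omits the propagation factor, so this example exposes it as an optional argument defaulting to 1.0; that default and the sample figures are assumptions made for illustration.

```python
USD_PER_KBYTE = 0.0025  # rate quoted in the text, equal to 1/400


def usd_per_month_per_reader(kbytes_per_month, number_of_sites,
                             number_of_readers, propagation=1.0):
    """Estimated dollars spent per reader per month to carry one newsgroup."""
    per_site = kbytes_per_month * USD_PER_KBYTE * propagation
    return per_site * number_of_sites / number_of_readers


# Hypothetical group: 40000 KB/month, 260000 sites, 100000 readers.
cost = usd_per_month_per_reader(40000, 260000, 100000)
print(f"{cost:.2f}")  # 260.00
```

With the default propagation of 1.0 the result matches the closed form `(KBytesTrafficPerMonth * numberOfSites) / (400 * numberOfReaders)`.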
-
- The accuracy of this number is in fact better than the accuracy of the
- participation ratio, because the source of error--the network size
- estimate--is present both in the numerator and the denominator, and therefore
- cancels out. The primary source of bias in this number comes from the bias in
- the "estimated number of readers, worldwide", which is described above. Treat
- this value as being accurate to within about 25%.
-
-
- SITE PARTICIPATION
-
- I would like to receive data from every site on USENET. The arbitron programs
- (posted to comp.sources.d along with this report) work on news 2.9, 2.10.[1-3],
- 2.11, and on many versions of notesfiles.
-
-
- Brian Reid
- Network Systems Laboratory, Digital Equipment Corporation, Palo Alto CA
- reid@pa.dec.com
-
-