- Xref: sparky sci.crypt:3828 alt.security:4582
- Newsgroups: sci.crypt,alt.security
- Path: sparky!uunet!infonode!ingr!b30news!craig!craig
- From: craig@jido.b30.ingr.com (Craig Presson)
- Subject: Re: Letter Frequency
- In-Reply-To: ian@pharaoh.cyborg.bt.co.uk's message of 15 Oct 92 15:32:21 GMT
- Message-ID: <1992Oct16.175533.16731@b30.ingr.com>
- Sender: usenet@b30.ingr.com (Usenet Feed)
- Reply-To: craig@jido.b30.ingr.com
- Organization: Intergraph Corporation, Huntsville, Alabama
- References: <1big1qINNrnq@matt.ksu.ksu.edu> <6989@pharaoh.cyborg.bt.co.uk>
- Date: Fri, 16 Oct 1992 17:55:33 GMT
- Lines: 53
-
- holland@matt.ksu.ksu.edu (Rich Holland) writes:
-
- >I'm in dire need of a letter frequency chart for the English language. I
- >remember as a kid reading books on basic cryptanalysis and seeing these
- >charts of the most frequently used letter (like "E" is most often used,
- >then "S" or "R" or something, etc). I don't remember the order, but now
- >I need it. Anyone got a copy of a table of something like this online?
-
- >If not, got a source where I can go look it up quick?
-
- The frequency distribution that you get varies somewhat depending on
- your source of text. Here is the table from Appendix C of Sinkov:
-
- (occurrences per 1000 letters)
-
- E 130  T 93  N 78  R 77  I 74  O 74  A 73  S 63  D 44  H 35  L 35  C 30  F 28
- P  27  U 27  M 25  Y 19  G 16  W 16  V 13  B  9  X  5  K  3  Q  3  J  2  Z  1
-
-
- You get something a bit different if you just run a frequency count
- on the dictionary, where each word appears once no matter how common
- it is in running text (maybe I have an odd dictionary, too):
-
- $ freq < /usr/dict/words | sort +1 -nr | pg
- E 36119
- A 30747
- I 26179
- R 23702
- O 22767
- T 22545
- N 22412
- S 22094
- L 17688
- C 17458
- D 13054
- U 10767
- M 9780
- B 9231
- P 8539
- H 7922
- G 7817
- Y 5766
- F 3946
- V 3285
- K 3046
- W 2603
- Z 1041
- X 862
- Q 627
- J 626
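-
- If you don't have a freq utility handy (it isn't a standard one), a
- count like this can be sketched in a few lines of Python. This is not
- the freq program used above, just an illustration: the function names
- are made up, and it assumes only the ASCII letters A-Z count, with
- case folded. The per-1000 scaling is there so the result can be set
- beside a table like Sinkov's.

```python
import string
from collections import Counter

ASCII_LETTERS = set(string.ascii_letters)


def letter_counts(text):
    """Count occurrences of A-Z, folding case and skipping everything else."""
    return Counter(ch.upper() for ch in text if ch in ASCII_LETTERS)


def per_thousand(counts):
    """Scale raw counts to occurrences per 1000 letters."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {letter: 1000 * n / total for letter, n in counts.items()}


def frequency_table(text):
    """Return (letter, count, per-1000) rows, most frequent letter first."""
    counts = letter_counts(text)
    scaled = per_thousand(counts)
    return [(letter, n, scaled[letter]) for letter, n in counts.most_common()]
```

- Feeding it the contents of a word list (for example the text read
- from /usr/dict/words, if your system has one) gives rows you can
- print in the same letter-and-count layout as the listing above.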
-
-
- -- Craig
-