- Newsgroups: comp.compression
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!caen!hellgate.utah.edu!dog.ee.lbl.gov!overload.lbl.gov!s1.gov!lip
- From: lip@s1.gov (Loren I. Petrich)
- Subject: Re: more entropy
- Message-ID: <1992Jul24.003709.21603@s1.gov>
- Sender: usenet@s1.gov
- Nntp-Posting-Host: s1.gov
- Organization: LLNL
- References: <1992Jul23.174740.14559@usenet.ins.cwru.edu>
- Date: Fri, 24 Jul 1992 00:37:09 GMT
- Lines: 28
-
- In article <1992Jul23.174740.14559@usenet.ins.cwru.edu> daf10@po.CWRU.Edu (David A. Ferrance) writes:
-
- >If I have an unsigned int count[256][256], what is wrong with
- >calculating entropy like this:
-
- >for (i=0;i<256;i++) for (j=0;j<256;j++) {
- > freq = count[i][j] / total;
- > ent += freq * log10(1/freq) / 0.30103;
- > }
-
- I presume that the code also included:
-
- ent = 0;
- total = 0;
- for (i=0;i<256;i++) for (j=0;j<256;j++)
- total += count[i][j];
-
- >Where total and ent are doubles, total is the # of bytes total, ent
- >starts off as 0, and the values of the array are the # of occurances of
- >each 2 letter combination?
-
- >I get values > 8.
-
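- The theoretical maximum value is log2(256*256) = 16, since the sum
- runs over 256*256 two-letter combinations rather than 256 single
- bytes. Values > 8 are therefore possible.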
-
- Yes, some versions of C do have a "log2" function (logarithm
- to base two).
-
-