home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.compression
- Path: sparky!uunet!caen!sol.ctr.columbia.edu!sees.bangor.ac.uk!mather
- From: mather@sees.bangor.ac.uk (Paul Mather)
- Subject: Compression technique mentioned in PCW
- Organization: Centre for Applied Obfuscation.
- Message-ID: <1992Aug18.155518@sees.bangor.ac.uk>
- Sender: mather@sees.bangor.ac.uk (Mr P Mather)
- Keywords: snake oil, something for nothing
- Date: Tue, 18 Aug 1992 14:55:18 GMT
- X-Posted-From: sol.sees.bangor.ac.uk
- X-Posted-Through: sol.ctr.columbia.edu
- Lines: 50
-
- A friend showed me a letter in a recent issue (Sep 1992) of Personal
- Computer World magazine. The letter was a comment on the article
- PCW had written on "spectacular" data compression. It had focussed
- (somewhat sarcastically) on W.E.B.'s claimed 16:1 method, and, more
- reasonably on Iterated Systems efforts.
-
- Anyway, the thrust of the letter was not about those but was to
- outline a method which seemed to illustrate that it was possible
- to compress data outside the laws of information theory (and
- other such W.E.B.-like claims), and indeed to multiply compress
- the compressor's output.
-
- Sounds like hot air? (Yes, IMHO.:)
-
- Well, the method the person outlined was the familiar one of representing
- the bytes of a data stream as a decimal fraction N, where N is between 0
- and 1. A similar scheme was mentioned here during the WEB flame wars,
- where compression was achieved by marking this number on a rod. Decompression
- involved precisely measuring the marked length and enumerating it. :-)
-
- However, in the compression scheme mentioned in the letter, instead of
- using a stout piece of oak :-), the decimal fraction was compressed by
- determining two numbers, a and b, such that a/b = N. To transmit the
- original file, one need only transmit a and b. Decompression would
- involve calculating a/b and enumerating the result.
-
- A further modification was suggested to the algorithm whereby the
- file is partitioned into arbitrary sized chunks and each chunk treated
- as a decimal number between 0 and 1 as before. This would be done
- in order to keep down the precision of the fractions (N's) to more
- manageable limits and compression would involve finding a's and b's for
- each chunk. (The author of the letter suggested that this list of
- a's and b's could be further compressed using the same technique.)
-
- When my friend asked me what I thought of the proposed method, my gut
- reaction was to say that finding the values of a and b sounded pretty
- intractable to me. It also occurred to me that it was likely that for
- some N the enumeration of the integers a and b could be longer than N
- itself but my number theory is very rusty so I didn't speculate on that
- to my friend. :-)
-
- Anyway, I just thought I'd mention it.
-
- Cheers,
-
- Paul.
- --
- e-mail: p.mather@sees.bangor.ac.uk
-
- If your mailer can't reach me, I'm obviously not worth talking to.
-