NetNews Usenet Archive 1992 #18

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #18 / NN_1992_18.iso / spool / comp / compress / 3035 < prev next >

Wrap

Text File | 1992-08-18 | 2.8 KB | 64 lines

Newsgroups: comp.compression Path: sparky!uunet!caen!sol.ctr.columbia.edu!sees.bangor.ac.uk!mather From: mather@sees.bangor.ac.uk (Paul Mather) Subject: Compression technique mentioned in PCW Organization: Centre for Applied Obfuscation. Message-ID: <1992Aug18.155518@sees.bangor.ac.uk> Sender: mather@sees.bangor.ac.uk (Mr P Mather) Keywords: snake oil, something for nothing Date: Tue, 18 Aug 1992 14:55:18 GMT X-Posted-From: sol.sees.bangor.ac.uk X-Posted-Through: sol.ctr.columbia.edu Lines: 50 A friend showed me a letter in a recent issue (Sep 1992) of Personal Computer World magazine. The letter was a comment on the article PCW had written on "spectacular" data compression. It had focussed (somewhat sarcastically) on W.E.B.'s claimed 16:1 method, and, more reasonably on Iterated Systems efforts. Anyway, the thrust of the letter was not about those but was to outline a method which seemed to illustrate that it was possible to compress data outside the laws of information theory (and other such W.E.B.-like claims), and indeed to multiply compress the compressor's output. Sounds like hot air? (Yes, IMHO.:) Well, the method the person outlined was the familiar one of representing the bytes of a data stream as a decimal fraction N, where N is between 0 and 1. A similar scheme was mentioned here during the WEB flame wars, where compression was achieved by marking this number on a rod. Decompression involved precisely measuring the marked length and enumerating it. :-) However, in the compression scheme mentioned in the letter, instead of using a stout piece of oak :-), the decimal fraction was compressed by determining two numbers, a and b, such that a/b = N. To transmit the original file, one need only transmit a and b. Decompression would involve calculating a/b and enumerating the result. A further modification was suggested to the algorithm whereby the file is partitioned into arbitrary sized chunks and each chunk treated as a decimal number between 0 and 1 as before. This would be done in order to keep down the precision of the fractions (N's) to more manageable limits and compression would involve finding a's and b's for each chunk. (The author of the letter suggested that this list of a's and b's could be further compressed using the same technique.) When my friend asked me what I thought of the proposed method, my gut reaction was to say that finding the values of a and b sounded pretty intractable to me. It also occurred to me that it was likely that for some N the enumeration of the integers a and b could be longer than N itself but my number theory is very rusty so I didn't speculate on that to my friend. :-) Anyway, I just thought I'd mention it. Cheers, Paul. -- e-mail: p.mather@sees.bangor.ac.uk If your mailer can't reach me, I'm obviously not worth talking to.