- Path: sparky!uunet!mcsun!uknet!acorn!armltd!dseal
- From: dseal@armltd.co.uk (David Seal)
- Newsgroups: comp.compression
- Subject: Re: PERPETUAL MOTION
- Message-ID: <4675@armltd.uucp>
- Date: 27 Jul 92 14:49:16 GMT
- References: <62692@cup.portal.com>
- Sender: dseal@armltd.uucp
- Distribution: comp
- Organization: A.R.M. Ltd, Swaffham Bulbeck, Cambs, UK
- Lines: 108
-
- In article <62692@cup.portal.com> ts@cup.portal.com (Tim W Smith) writes:
-
- >Most impossible compressors claim to only work on files past a certain
- >size. Let that size be M bits. Let F(X) be the total number of files
- >of X bits or less. Let f(X) be the total number of files of X bits.
- >
- >Assume we have some ordering of all files. With each file, associate
- >a number, which is the position of the file in this ordering relative
- >to the first file of the same size.
- >
- >Now consider the following compressor. For files of length M+1, the
- >first F(M) of them are mapped to files of length M or less. The rest
- >of the files of length M+1 are mapped to the files whose number is
- >F(M) less than theirs.
- >
- >This leaves us F(M) files of length M+1 that do not have files mapped
- >onto them.
- >
- >We can now do a similar thing for files of length M+2. ...
- >
- >Repeat forever. At each step, there are F(M) files that compress, and
- >applying this to a file of size N at most f(N)/F(M) times will compress
- >it.
- >
- >I don't recall all the WEB claims, but I think that this satisfies them
- >in spirit.
- >
- >...
- >
- >However, I believe that it *DOES* provide a counterexample to the
- >usual counting argument.
-
- It does *NOT* provide such a counterexample. The counting argument *only*
- applies to lossless compression methods. Your methods are lossy: the
- counting argument simply doesn't apply to them, and so they cannot be
- counterexamples to the counting argument.
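-
- (For anyone who hasn't seen the counting argument spelled out: there are
- 2^N files of exactly N bits, but only 2^N - 2 non-empty files of fewer
- than N bits, so no lossless method can shorten every N-bit file. A tiny
- Python sketch - my own illustration, treating "files" as non-empty bit
- strings - just tabulates the two counts:
-
-     # Counting argument for small N: a lossless compressor that shortened
-     # *every* N-bit file would need an injection from the 2**N files of
-     # exactly N bits into the 2**N - 2 shorter non-empty files.
-     for N in range(1, 9):
-         n_inputs  = 2 ** N                            # files of exactly N bits
-         n_outputs = sum(2 ** k for k in range(1, N))  # files of 1..N-1 bits
-         print(N, n_inputs, n_outputs, n_inputs <= n_outputs)
-
- The last column is False for every N: the N-bit inputs always outnumber
- the available shorter outputs, so at least two of them would have to
- share an output.)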
-
- To see why your methods are lossy:
-
- A single application of the method outlined above is lossy because it has
- got to do *something* with input files of length M or less. The claims about
- always compressing only apply to files of length greater than M, so such
- input files don't need to be compressed: in fact, they may be expanded
- drastically if you want. But they do have to map to *some* output file.
-
- But the method uses all possible output files to represent the input files
- of length greater than M: whatever output file you choose for any input file
- of length M or less has already been used for a file of length greater than
- M. So the method maps two distinct input files to the same output file, and
- so is lossy.
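-
- If it helps to see this concretely, here is a small Python sketch - my
- own illustration, taking files to be non-empty bit strings ordered
- lexicographically within each length, which is one valid choice for the
- ordering Tim leaves unspecified - that builds the mapping for M = 2 and
- a few lengths, and then checks that every file of length M or less is
- already in use as the output of some longer file:
-
-     # Build the quoted mapping for a small M, then check that the outputs
-     # for inputs longer than M already cover every file of length <= M,
-     # leaving no spare output for the short inputs themselves.
-     def files_of_length(n):
-         """All bit strings of exactly n bits, in lexicographic order."""
-         return [format(i, '0%db' % n) for i in range(2 ** n)]
-
-     M, MAX_LEN = 2, 6
-     short_files = [f for k in range(1, M + 1) for f in files_of_length(k)]
-     F_M = len(short_files)                   # F(M): files of M bits or less
-
-     mapping = {}
-     leftover = short_files                   # outputs free for the next length
-     for length in range(M + 1, MAX_LEN + 1):
-         inputs = files_of_length(length)
-         # The first F(M) files of this length map to the free shorter outputs.
-         for inp, out in zip(inputs[:F_M], leftover):
-             mapping[inp] = out
-         # The rest map to the file whose number is F(M) less than theirs.
-         for i, inp in enumerate(inputs[F_M:]):
-             mapping[inp] = inputs[i]
-         # The last F(M) files of this length are not yet used as outputs;
-         # they are the free slots the next length will consume.
-         leftover = inputs[-F_M:]
-
-     used = set(mapping.values())
-     print(all(f in used for f in short_files))   # prints True
-
- It prints True: whichever output you now pick for a file of M bits or
- less, that output already belongs to some longer input, so two distinct
- inputs end up sharing it.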
-
- If you like, you can refuse to apply the method to input files of length M
- or less. But if you're then given a "compressed" file of length M or less,
- you cannot tell whether the original file was identical to the output file
- or was the file of length M+1 that compressed down to the output file.
- Again, the method is lossy.
-
- You can get around this by adding an extra bit to each output file, to say
- whether it needs to be decompressed or is a verbatim length M or less file.
- This is lossless, but unfortunately this extra bit has more than negated all
- the compression you've achieved...
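-
- If you want to see the bookkeeping, here is a rough tally in Python - my
- own arithmetic, again for M = 2 and non-empty bit-string files - for the
- flagged, genuinely lossless variant, where every output carries one
- extra bit saying "verbatim" or "decompress me":
-
-     # Net size change, summed over all files of up to MAX_LEN bits, once a
-     # one-bit flag is prepended to every output of the quoted scheme.
-     M, MAX_LEN = 2, 6
-     short_lengths = [k for k in range(1, M + 1) for _ in range(2 ** k)]
-     F_M = len(short_lengths)                 # F(M): files of M bits or less
-     net = 0
-     for n in range(1, MAX_LEN + 1):
-         if n <= M:
-             net += 2 ** n                    # verbatim + flag: +1 bit per short file
-         elif n == M + 1:
-             # First F(M) inputs map to the short files (plus flag); the
-             # remaining inputs stay at n bits (plus flag).
-             net += sum((1 + s) - n for s in short_lengths) + (2 ** n - F_M)
-         else:
-             # First F(M) inputs map to leftover (n-1)-bit files (plus flag):
-             # no change; the rest stay at n bits (plus flag): +1 bit each.
-             net += 2 ** n - F_M
-     print(net)                               # positive: a net *increase* in size
-
- The total comes out well above zero: summed over all files of up to
- MAX_LEN bits, the flag bit adds more than the mapping ever takes away.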
-
- Now to deal with the "multiple applications, until the file is length M or
- less" technique: the length M or less output file is not enough to
- reconstruct the input file, so the technique is lossy. To make it lossless,
- you also need to record the number of times that you need to apply the
- decompression algorithm to it. Unfortunately, this number is liable to be
- rather big: so big in fact that any technique for adding some representation
- of it to the file will either result in some files having been increased in
- size or in all files being restored to their original size...
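-
- To put a rough number on "rather big": by Tim's own estimate the scheme
- takes on the order of f(N)/F(M) applications to get an N-bit file down
- to M bits or less, and merely recording that count costs about log2 of
- it - which is roughly the N - M bits you were hoping to save. A quick
- back-of-the-envelope in Python (my own figures, not anything from Tim's
- posting):
-
-     # Bits needed to record the iteration count, versus bits "saved" by
-     # getting an N-bit file down to M bits or less, using Tim's bound of
-     # at most f(N)/F(M) applications.
-     from math import ceil, log2
-     M = 10
-     F_M = 2 ** (M + 1) - 2                   # non-empty files of M bits or less
-     for N in (20, 40, 80):
-         applications = 2 ** N / F_M          # roughly f(N)/F(M)
-         count_bits = ceil(log2(applications))
-         saved_bits = N - M                   # best case: N bits shrunk to M bits
-         print(N, count_bits, saved_bits)
-
- In each case the count needs essentially as many bits as the
- "compression" removed, before you have even found a way to delimit it
- from the rest of the file.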
-
- To summarise:
- * The counting argument applies only to lossless techniques.
- * A compression technique must specify what it does to all possible input
- files - even if you're only interested in what it does to files greater
- than a certain length.
- * A compression technique is lossless if its output file *always* contains
- enough information to reproduce the input file (using a fixed
- decompression algorithm and without relying on any information
- surreptitiously hidden elsewhere than in the output file). It is lossy
- otherwise: in particular, if two distinct input files produce the same
- output file, it is lossy.
- * For an "apply compression algorithm X until condition C is true"
- compression technique to be lossless, the decompression program must not
- merely be capable of applying decompression algorithm X; it must also be
- capable of establishing *how many times* to apply decompression
- algorithm X.
-
- A general suggestion: if you seriously think you've found a counterexample
- to the counting argument, may I suggest you follow it through to its logical
- conclusion? Such a counterexample would result in a proof that two distinct
- integers are equal to each other. If you've got a valid proof of this form,
- you could:
-
- (1) Make vast amounts of money by borrowing the smaller of these two
- integers, observing that what you have got is equal to the larger,
- paying back the loan out of the larger, and pocketing the difference...
-
- (2) Become famous as the mathematician who established the inconsistency of
- practically all known mathematics...
-
- Doing either of these successfully would be far more profitable to you
- personally than simply posting to Usenet :-) :-) :-)
-
- David Seal
- dseal@armltd.co.uk
-
- All opinions are mine only...
-