NetNews Usenet Archive 1993 #3

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1993 #3 / NN_1993_3.iso / spool / comp / compress / 4797 < prev next >

Wrap

Internet Message Format | 1993-01-25 | 3.3 KB

Path: sparky!uunet!europa.asd.contel.com!emory!ogicse!das-news.harvard.edu!cantaloupe.srv.cs.cmu.edu!tgl From: tgl+@cs.cmu.edu (Tom Lane) Newsgroups: comp.compression Subject: Re: JPEG compression Summary: it ain't that simple Keywords: JPEG, compression Message-ID: <C1F6I4.675.2@cs.cmu.edu> Date: 25 Jan 93 17:13:09 GMT Article-I.D.: cs.C1F6I4.675.2 References: <peterb.727926960@wambenger> Sender: news@cs.cmu.edu (Usenet News System) Organization: School of Computer Science, Carnegie Mellon Lines: 59 Nntp-Posting-Host: g.gp.cs.cmu.edu peterb@cs.uwa.oz.au (Peter Bousfield) asks: > It is my understanding that JPEG compression works in basically two parts. > Step 1: DCT (Discrete Cosine Transform) > and > Step 2: Huffman Coding ... > > I was wanting to know which of these two steps contributes more to the > compression. That is, how good is the compression without Huffman coding > (eg is it 50% worse or only 10% worse.) Well, it's not really that simple. The DCT is done to make the data more compressible, not to achieve any compression itself. (In fact, the raw volume of data that comes out of the DCT stage is more than what goes into it: 11 bits/pixel instead of 8.) So in one sense the Huffman stage achieves *all* the compression. But obviously that statement doesn't capture what's going on. This is analogous to the distinction between modeling and coding. (I believe the comp.compression FAQ talks about this; or see Mark Nelson's book, or any other recent textbook on data compression.) You express the data in a form that exposes redundancy, then you use a coding technique to remove the redundancy. I think of JPEG as containing three types of steps: 1. Transformation of the data into a more suitable form. This covers two stages: colorspace change and DCT. The data volume is not reduced here, and in fact these steps are 100% reversible except for roundoff error. 2. Data reduction. This likewise covers two stages: downsampling and DCT coefficient quantization. In these steps we deliberately throw away data, being careful to throw away that data which is least visible to the human eye (which the preceding steps have separated out for us). These steps are parameterized so that we can trade off file size against image quality, by adjusting how much data is thrown away. 3. Entropy coding. This is either Huffman or arithmetic coding, depending on the depth of your wallet :-). In this step we efficiently encode the remaining data. This step is lossless (the entropy decoder will exactly reconstruct what was given to the entropy encoder). We rely on the prior steps to have presented us with data in a highly redundant form --- for example, there are typically a LOT of zeroes. I think the initial question might really have meant "can I use a different entropy encoder?". Sure, any lossless compression technique will work for step 3. (In my group's initial experiments with JPEG, we ran the output of step 2 through good old Unix compress, and got reasonable compression.) But the particular variants of Huffman and arithmetic coding specified by the standard are tuned to the characteristics of the data stream that normally emerges from the prior stages. You won't get as good compression if you use a general-purpose compression method. tom lane organizer, Independent JPEG Group