- Path: sparky!uunet!europa.asd.contel.com!emory!ogicse!das-news.harvard.edu!cantaloupe.srv.cs.cmu.edu!tgl
- From: tgl+@cs.cmu.edu (Tom Lane)
- Newsgroups: comp.compression
- Subject: Re: JPEG compression
- Summary: it ain't that simple
- Keywords: JPEG, compression
- Message-ID: <C1F6I4.675.2@cs.cmu.edu>
- Date: 25 Jan 93 17:13:09 GMT
- Article-I.D.: cs.C1F6I4.675.2
- References: <peterb.727926960@wambenger>
- Sender: news@cs.cmu.edu (Usenet News System)
- Organization: School of Computer Science, Carnegie Mellon
- Lines: 59
- Nntp-Posting-Host: g.gp.cs.cmu.edu
-
- peterb@cs.uwa.oz.au (Peter Bousfield) asks:
- > It is my understanding that JPEG compression works in basically two parts.
- > Step 1: DCT (Discrete Cosine Transform)
- > and
- > Step 2: Huffman Coding ...
- >
- > I was wanting to know which of these two steps contributes more to the
- > compression. That is, how good is the compression without Huffman coding
- > (e.g. is it 50% worse or only 10% worse?)
-
- Well, it's not really that simple. The DCT is done to make the data more
- compressible, not to achieve any compression itself. (In fact, the raw
- volume of data that comes out of the DCT stage is more than what goes into
- it: 11 bits/pixel instead of 8.) So in one sense the Huffman stage
- achieves *all* the compression. But obviously that statement doesn't
- capture what's going on.
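-
- (If you want to check that size claim yourself, here's a quick sketch in
- Python/numpy: a hand-rolled orthonormal 2-D DCT, which has the same
- scaling as the JPEG FDCT; the all-black block is just a convenient
- extreme case, not anything taken from the standard.)
-
-   import numpy as np
-
-   def dct2(block):
-       """Orthonormal 8x8 2-D DCT-II (same scaling as the JPEG FDCT)."""
-       n = 8
-       k = np.arange(n)
-       basis = np.cos((2 * k[None, :] + 1) * k[:, None] * np.pi / (2 * n))
-       basis *= np.sqrt(2.0 / n)
-       basis[0, :] = np.sqrt(1.0 / n)
-       return basis @ block @ basis.T
-
-   # JPEG level-shifts 8-bit samples to -128..127 before the DCT.
-   block = np.full((8, 8), -128.0)     # an extreme, all-black block
-   print(dct2(block)[0, 0])            # about -1024: well past 8 bits
-   # Coefficients span roughly -1024..1016, i.e. about 11 bits per value.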
-
- This is analogous to the distinction between modeling and coding. (I believe
- the comp.compression FAQ talks about this; or see Mark Nelson's book, or any
- other recent textbook on data compression.) You express the data in a form
- that exposes redundancy, then you use a coding technique to remove the
- redundancy.
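-
- (A toy illustration of that split, in Python; this has nothing to do with
- JPEG's actual data, it's just the modeling-then-coding idea in miniature.)
-
-   # "Model": rewrite a slowly rising ramp as differences from the previous
-   # sample; this exposes the redundancy (nearly every difference is 1).
-   samples = list(range(100, 160)) * 2          # toy data
-   deltas = [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]
-
-   # "Coder": a generic technique (run-length coding here) then removes
-   # the redundancy the model exposed.
-   def rle(xs):
-       runs, prev, count = [], xs[0], 1
-       for x in xs[1:]:
-           if x == prev:
-               count += 1
-           else:
-               runs.append((prev, count))
-               prev, count = x, 1
-       runs.append((prev, count))
-       return runs
-
-   print(len(rle(samples)), "runs raw vs", len(rle(deltas)), "after modeling")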
-
- I think of JPEG as containing three types of steps (there's a small sketch
- tying all three together right after this list):
-
- 1. Transformation of the data into a more suitable form.
- This covers two stages: colorspace change and DCT.
- The data volume is not reduced here, and in fact these steps are 100%
- reversible except for roundoff error.
-
- 2. Data reduction.
- This likewise covers two stages: downsampling and DCT coefficient
- quantization.
- In these steps we deliberately throw away data, being careful to throw
- away that data which is least visible to the human eye (which the
- preceding steps have separated out for us). These steps are
- parameterized so that we can trade off file size against image quality,
- by adjusting how much data is thrown away.
-
- 3. Entropy coding.
- This is either Huffman or arithmetic coding, depending on the depth of
- your wallet :-).
- In this step we efficiently encode the remaining data. This step is
- lossless (the entropy decoder will exactly reconstruct what was given
- to the entropy encoder). We rely on the prior steps to have presented us
- with data in a highly redundant form --- for example, there are typically
- a LOT of zeroes.
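-
- (Here's a minimal sketch of all three kinds of steps on a single 8x8
- luminance block, in Python with numpy/scipy.  The flat quantization table
- is purely for illustration; real encoders use perceptually tuned, scalable
- tables, and chroma downsampling is omitted entirely.)
-
-   import numpy as np
-   from scipy.fft import dctn, idctn
-
-   # Step 1: transformation.  Level-shift the 8-bit samples and take the
-   # 2-D DCT (the orthonormal form has the same scaling as the JPEG FDCT).
-   block = 128.0 + np.add.outer(np.arange(8.0), np.arange(8.0))  # smooth ramp
-   coef = dctn(block - 128.0, norm='ortho')
-   # This step alone is reversible, up to roundoff:
-   assert np.allclose(idctn(coef, norm='ortho') + 128.0, block)
-
-   # Step 2: data reduction.  Divide by a quantization table and round;
-   # this is where information is deliberately thrown away.
-   qtable = np.full((8, 8), 16.0)
-   quant = np.round(coef / qtable).astype(int)
-
-   # Step 3 gets these integers.  On a smooth block almost every
-   # high-frequency coefficient has quantized to zero, which is exactly the
-   # redundancy the Huffman/arithmetic stage is built to exploit.
-   print(quant)
-   print("nonzero coefficients:", np.count_nonzero(quant), "of 64")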
-
- I think the initial question might really have meant "can I use a different
- entropy encoder?". Sure, any lossless compression technique will work for
- step 3. (In my group's initial experiments with JPEG, we ran the output of
- step 2 through good old Unix compress, and got reasonable compression.)
- But the particular variants of Huffman and arithmetic coding specified by
- the standard are tuned to the characteristics of the data stream that
- normally emerges from the prior stages. You won't get as good compression
- if you use a general-purpose compression method.
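-
- (In the same spirit as that experiment, you can feed quantized coefficients
- to any general-purpose coder; here zlib stands in for Unix compress, and
- the 'quant' array is just a stand-in block like the one in the sketch
- above, not real encoder output.)
-
-   import zlib
-   import numpy as np
-
-   # Stand-in quantized block: one DC value and a sea of zeroes.
-   quant = np.zeros((8, 8), dtype=np.int8)
-   quant[0, 0] = 4
-   packed = quant.tobytes()
-   print(len(packed), "bytes ->", len(zlib.compress(packed)), "bytes")
-   # A general-purpose coder does compress this, but JPEG's own Huffman and
-   # arithmetic coders know the stream's statistics (zero run lengths,
-   # typical coefficient magnitudes) up front, so they do better.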
-
- tom lane
- organizer, Independent JPEG Group
-