Path: sparky!uunet!pipex!warwick!uknet!strath-cs!bnr.co.uk!stl!crosfield!rak
From: rak@crosfield.co.uk (Richard Kirk)
Newsgroups: comp.compression
Subject: Re: JPEG compression
Keywords: JPEG, compression
Message-ID: <15979@suns6.crosfield.co.uk>
Date: 25 Jan 93 10:31:31 GMT
References: <peterb.727926960@wambenger>
Organization: Crosfield Electronics, Hemel Hempstead, United Kingdom.
Lines: 61

In article <peterb.727926960@wambenger>
peterb@cs.uwa.oz.au (Peter Bousfield) writes:
>It is my understanding that JPEG compression
>works in basically two parts.
>Step 1: DCT (Discrete Cosine Transform)
>Step 2: Huffman Coding
>I was wanting to know which of these
>two steps contributes more to the compression.

This is a bit like..
Which contributes more to a car going fast - the wheels or the engine?
..but here goes...

If compression is to work, your data probably has some underlying structure,
although it may not be obvious from a dump of the bytes. You can get a
histogram of the pixel values from an image. Typically this will have peaks
in it, as images have patches of this or that in them, but often the peaks
will be rather broad because real objects have shading. You could create
a Huffman table for the straight pixel values, and that might give you a 10%
saving.

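To put a number on that, here is a minimal sketch in Python with numpy (my
choice of language; the synthetic "image" and the figure it prints are
illustrative, not from the post). It histograms raw pixel values and uses
the Shannon entropy as a lower bound on what a Huffman code could manage:

import numpy as np

def entropy_bits(values):
    # Shannon entropy in bits/symbol: a lower bound on Huffman code length.
    counts = np.bincount(values.ravel(), minlength=256)
    p = counts[counts > 0] / values.size
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
# Fake 8-bit image: two broad, shaded patches (wide histogram peaks).
img = np.concatenate([rng.normal(80, 12, 32768),
                      rng.normal(170, 12, 32768)]).clip(0, 255).astype(np.uint8)
h = entropy_bits(img)
print(f"raw pixels: {h:.2f} bits/pixel, saving vs 8 bits: {100 * (1 - h / 8):.0f}%")

With this made-up data the saving comes out in the same modest 10-20%
ballpark: the peaks are there, but they are broad.
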
The first thing that might strike you when you see a dump of image values is
that one pixel value is very often similar to the one before it, above it,
below it, or beside it. If you coded not the pixel values but their
differences (vertically, horizontally, or both) then you would find many of
the difference values were at or near zero. Your histogram would have a
sharp peak, and Huffman coding might give you a 50% saving on a scanned image.
This is a specific case of a transformation to lower entropy. The goal is to
transform your numerical data into a form where all the inherent structure
shows up in the frequency distribution of the stream of values. The DCT is
just a rather more complicated form of differencing and averaging.

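A sketch of the differencing transform, under the same assumptions as above
(Python + numpy, synthetic noiseless data): code the horizontal differences
instead of the pixels themselves and compare the two entropies.

import numpy as np

def entropy_bits(values):
    counts = np.bincount(values - values.min())
    p = counts[counts > 0] / values.size
    return -np.sum(p * np.log2(p))

# Smoothly shaded scanline, tiled into a 256x256 "image".
x = np.arange(256)
img = np.tile((128 + 60 * np.sin(2 * np.pi * x / 256)).astype(np.int64), (256, 1))

diff = np.diff(img, axis=1)          # horizontal pixel differences
print(f"raw  : {entropy_bits(img.ravel()):.2f} bits/pixel")   # broad histogram
print(f"diff : {entropy_bits(diff.ravel()):.2f} bits/pixel")  # sharp peak at zero

Because this toy image has no scanner noise at all, the differences compress
even better than the 50% figure quoted for real scans - which is exactly the
point of the next paragraph.
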
When doing the differencing described above, the saving might be only 50%
for real scanned images because there might be +/-2 units of noise on the scan.
[ This does not mean you have a crummy scanner - it would be difficult to
get a smooth scan of a grey wedge without this noise, but that's another
story. ] This would mean that 3 of your 8 bits were unpredictable noise,
and so could not be reconstructed by any algorithm, however smart.

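A back-of-envelope check of that claim (assuming, for illustration, uniform
+/-2 noise; the post does not say what the noise distribution is): a single
noisy sample carries log2(5) ~ 2.32 bits of noise, and the difference of two
noisy samples sees the noise twice over, so its entropy lands almost exactly
on 3 bits.

import numpy as np

def entropy_bits(values):
    counts = np.bincount(values - values.min())
    p = counts[counts > 0] / values.size
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
noise = rng.integers(-2, 3, 1_000_000)       # +/-2 units, uniform
print(f"one sample : {entropy_bits(noise):.2f} bits")                    # ~2.32
print(f"difference : {entropy_bits(noise[1:] - noise[:-1]):.2f} bits")   # ~3.00
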
So - why not replace the noise with other noise? It may not be the same
noise, but that does not affect how the image looks, provided we preserve the
local average and all the broad features. It is at this stage - the quantization
after the DCT, where we truncate the real-valued DCT coefficients to integer
intervals - that we dump our 3-bits-out-of-8 ball and chain and start to get
some real compression done. In the 1-D DCT the first value gives the average
of the 8-pixel interval, the second gives the slope (sort of), the third gives
you the curvature of the slope - it's a bit like a polynomial - and so a lot
of the properties we see as being important (like the average and the slopes)
are retained.

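Here is the same idea on a 1-D 8-sample block, again a hedged sketch: the
matrix below is the standard orthonormal DCT-II, but the quantizer step
sizes are invented for illustration (real JPEG tables differ, and JPEG
works on 8x8 2-D blocks).

import numpy as np

N = 8
u = np.arange(N)[:, None]            # frequency index (rows)
n = np.arange(N)[None, :]            # sample index (columns)
C = np.sqrt(2 / N) * np.cos(np.pi * (2 * n + 1) * u / (2 * N))
C[0] /= np.sqrt(2)                   # orthonormal DC row: 1/sqrt(N)

block = np.array([52, 55, 61, 66, 70, 61, 64, 73], dtype=float)
coeffs = C @ block                   # first coeff ~ average, second ~ slope, ...
q = np.array([4, 8, 8, 16, 16, 32, 32, 32], dtype=float)   # coarser at high freq.
recon = C.T @ (np.round(coeffs / q) * q)   # quantize, dequantize, inverse DCT

print("block:", block)
print("recon:", np.round(recon, 1))
print("mean :", block.mean(), "->", round(recon.mean(), 2))  # broad features kept

The fine wiggles (which is where the scanner noise lives) get rounded away
by the coarse high-frequency steps, while the average and the slope survive
almost untouched.
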
Which contributes more to a car going fast - the wheels or the engine?
Ha! Don't forget the gearbox!

P.S. Sorry this explanation is a bit rough, but it is Monday.

--
Richard Kirk  Image Processing Group  Crosfield Electronics Ltd.  U.K.
rak@crosfield.co.uk  0442-230000 x3361/3591  Hemel Hempstead, Herts, HP2 7RH