NetNews Usenet Archive 1993 #3


Internet Message Format | 1993-01-25 | 15.8 KB

Path: sparky!uunet!zaphod.mps.ohio-state.edu!usc!randvax!edhall
From: edhall@rand.org (Ed Hall)
Newsgroups: comp.compression
Subject: Re: best way to describe jpeg
Message-ID: <4258@randvax.rand.org>
Date: 25 Jan 93 23:49:45 GMT
References: <1993Jan17.2090.5297@dosgate> <dak.727313698@hathi> <C12LuC.C8.2@cs.cmu.edu>
Sender: news@randvax.rand.org
Organization: RAND
Lines: 288
Nntp-Posting-Host: ives.rand.org

I (and the net) might regret this, but here is my attempt to give an intelligent layman's description of JPEG. My target audiences are image archivists and collectors who are sceptical of JPEG and perhaps misled by the term "lossy compression" and by misinformed articles on BBSes, on-line services, and the net.

Anyone can take this article and do what they want with it--I abandon any copyright to it, although I'd appreciate receiving any improvements or other suggestions. Keep in mind that I have tried hard to minimize the amount of technical background assumed of the reader, so don't criticize me for not explaining Floyd-Steinberg dithering or whatever.

		-Ed Hall
		edhall@rand.org

JPEG: What Is It? Why Use It?
by Ed Hall

This document is intended to be a simple description of the JPEG image compression technique, providing you with a general "feel" for the process and its strengths and failings. It is not a technical description of JPEG; there are several articles and papers, and even a book, which describe JPEG in depth. Also, I'm confining my description to the use of JPEG for "lossy" compression of color images, although the JPEG standard itself addresses other uses.

In order to understand JPEG you need to know a bit about digital images, and about some characteristics of how people perceive images, so we'll start with a few brief paragraphs on these. We need this background to understand JPEG because JPEG is /lossy/ compression, which is to say that part of the information in the original image is thrown away during compression.
JPEG can't be evaluated unless we understand just what is lost, and how that loss affects the final displayed image.

The fact that JPEG is "lossy" is seen as a bad thing by many people who are first learning about it. This impression hasn't been helped by some defective JPEG programs that appeared soon after its introduction. However, I think you will see by the end of this document that this sort of "loss" can often be a good thing--in most cases, the loss is in the size of the image file, and not the quality of the viewed image. And the reduction in size can be considerable: usually 3 to 8 times smaller than a GIF image of similar quality (and later we'll see how, with appropriate hardware, JPEG images can appear better than the best GIFs).

Color Digital Images

Computers handle sounds and images by dividing them up into small pieces and assigning an intensity value to each piece. For images, this usually means dividing the picture up into many small squares, arranged like very fine graph paper, and assigning a number to each square according to the brightness of the image within that square. (These squares are called /pixels/, which is short for "picture elements.") If it is a color picture, three numbers are used per square, one each for the intensity of red, green, and blue (often called "RGB" for short). Since almost any color can be represented by a mixture of these three "primary colors," this means that the color of a square can be represented as well as its brightness.

Digital images typically start with a TV camera or a device called a /scanner/ (or are generated by a computer program). If we are dealing with color images, the result is the grid of red, green, and blue values we just described.
However, there are actually many ways of representing a color image besides using the raw red, green, and blue values; in fact, it is often more convenient to translate each square to a value representing its overall brightness, called /luminance/, and two other values representing its color or /chroma/. The ability to do this is important for JPEG, as we shall soon see. In any case, the image is translated back into red, green, and blue values prior to display.

A Few Facts About The Human Visual System

There is much that is still unknown about how people see, but through study and experiment we have learned a number of facts which are quite useful in designing systems for storing and displaying computer images. Some of these facts are quite intuitive since we encounter them every day: for example, a field of black and white dots appears grey when viewed at a distance, "fusing" into a sort of average of their brightnesses. Other facts are just as common but probably not quite as intuitive, such as the eye's ability to see changes in brightness in much finer detail than changes in color. Color television relies upon this fact, transmitting much more luminance detail than chroma detail. In fact, JPEG does the same thing (although with a great deal more finesse than color TV).

The tendency for fine detail to fuse into a single impression is a bit more complicated than the example with black and white dots suggests. For example, if the dots are instead two different shades of grey, the dots can be quite a bit larger before they are seen as separate dots. In scientific terminology, this phenomenon is described by saying that the eye is less sensitive to higher frequencies. And in large part it is this reduced sensitivity that makes JPEG possible. Remember this concept of "high frequencies," meaning image details which are very close together, since we will use the term quite heavily later in this document.
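
To make the luminance/chroma translation mentioned above a little more concrete, here is a minimal sketch in Python. The article does not say which formula is used; the weights below are an assumption (the ones commonly used with JPEG files), and the function names are made up for illustration. The point is only that the translation is a simple weighted sum of red, green, and blue, and that it can be undone before display.

```python
def rgb_to_ycc(r, g, b):
    """Translate one pixel (0-255 per channel) to luminance and two chroma values.

    The weights are an assumption, not something taken from the article.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                # overall brightness
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b     # blue-ish chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b     # red-ish chroma
    return y, cb, cr


def ycc_to_rgb(y, cb, cr):
    """Translate luminance/chroma back to red, green, and blue for display."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b


if __name__ == "__main__":
    # A neutral grey pixel carries no color information: both chroma
    # values sit at the midpoint, and only the luminance varies.
    print(rgb_to_ycc(128, 128, 128))
    # A round trip recovers the original pixel up to rounding error.
    print(ycc_to_rgb(*rgb_to_ycc(200, 30, 90)))
```

Notice that a grey pixel puts all of its information into the luminance value; this is why chroma can be stored with less detail without disturbing the brightness detail the eye is most sensitive to.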
Note: Be careful not to confuse this concept of "frequency" with that of the electromagnetic spectrum, where blue light is said to have a higher frequency than red light. Our use of the term "frequency" has nothing to do with color, and is more properly described as "spatial frequency."

A Few Words About Color GIFs

The same visual phenomena we've just described are also very important in producing high-quality color GIFs. This is because color GIF images also involve lossy compression, a result of the GIF's restriction to 256 colors. Many color scanners and digital cameras produce 8 bits--that is, 256 possible values--for each of the three primary colors, for a total of 24 bits--that is, 16777216 possible values--for each pixel. Squeezing this much information into a GIF means picking 256 colors out of the 16777216 possible colors and assigning a color from this limited /palette/ of colors to each pixel.

Since we can rely on the human visual system to blend adjacent pixels (the "dots" we talked about in the previous section), we can, if we're smart about how we choose our palette, use what is effectively many more than 256 colors to produce our image. The common technique for doing this, called "dithering," makes precisely this compromise: the accuracy of individual pixels is adjusted to make the average of /groups/ of pixels closer to the color in the original image. The eye is relatively insensitive to the pixel-to-pixel (that is, high-frequency) noise which results from this process.

Be very wary when you hear someone claim that color GIFs use "lossless" compression. In fact, as we shall see, the lossy nature of GIFs can lead to problems when GIFs and JPEG are combined.

How JPEG works

Some magazine articles on JPEG have claimed that "JPEG throws out the high frequencies which the eye can't see."
This is misleading--it would be more accurate to say that JPEG represents high frequencies with /less accuracy/ in proportion to the eye's lower sensitivity to such frequencies. If a high-frequency detail is prominent enough for the eye to notice, a reasonable quality JPEG image will represent it.

JPEG does its magic through a series of steps. First, JPEG separates the image into luminance (brightness) and chroma (color). Since details in the latter are much less noticeable to the eye, it combines adjacent pixels in forming the chroma "image," which results in a considerable savings in space. A full-sized luminance image and two half-sized chroma images are then each compressed separately, according to the methods given in the following paragraphs. Unless we magnify the decompressed image so that individual pixels become visible, the reduction in color resolution is invisible.

The next step divides the images into blocks of 8 by 8 pixels, and compresses each block separately. Using blocks both simplifies the compression step and takes advantage of the fact that some parts of the image will be much more compressible than others. In addition, artifacts (false detail) of the compression process which result from processing high frequencies will be confined to small areas, where they will be (usually) completely obscured by the real details, a phenomenon called /masking/. (Such artifacts are usually quite a bit more subtle than those produced in GIF color reduction, by the way. You'll generally not see them unless you choose a very high compression level, or attempt to compress an inappropriate image such as a line-drawing.)

Image blocks are compressed by performing a mathematical operation on them called a "discrete cosine transform" (abbreviated "DCT"). This separates the frequencies in the image block into a set of numbers called /coefficients/.
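
For readers who would like to see the transform itself, the following is a minimal, deliberately slow sketch of a discrete cosine transform on one 8 by 8 block, written in plain Python. The function name and the scaling are my assumptions (the scaling shown is a common convention); real JPEG software uses much faster methods. It shows that a perfectly flat block produces a single nonzero coefficient, and that a block of fine vertical stripes puts its largest non-average coefficient at the highest horizontal frequency.

```python
import math

N = 8  # block size described in the text


def dct_2d(block):
    """Return the 8x8 grid of DCT coefficients for an 8x8 block of samples.

    coeffs[0][0] is proportional to the block's average value; larger
    indices correspond to higher spatial frequencies.
    """
    def scale(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):          # vertical frequency
        for v in range(N):      # horizontal frequency
            total = 0.0
            for x in range(N):      # row
                for y in range(N):  # column
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = 0.25 * scale(u) * scale(v) * total
    return coeffs


if __name__ == "__main__":
    flat = [[100] * N for _ in range(N)]
    stripes = [[255 if y % 2 == 0 else 0 for y in range(N)] for _ in range(N)]
    # Flat block: only the "average" coefficient is nonzero.
    print([round(c, 3) for c in dct_2d(flat)[0]])
    # Striped block: energy shows up in the high horizontal frequencies.
    print([round(c, 3) for c in dct_2d(stripes)[0]])
```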
One of these coefficients, which corresponds to the average brightness of the entire block, is separated out for special treatment. This is because the eye is very sensitive to inaccuracies in the overall brightness of adjacent blocks; errors here make the image look "blocky" by making the boundaries of the blocks noticeable. Also, this particular coefficient tends to change slowly from block to block; this allows an improvement in compression by saving just the difference from one block to the next.

The other coefficients are then /quantized/--that is, reduced to a limited number of possible values--with the higher frequencies quantized more coarsely (with fewer possible values) than the lower frequencies. Most of the quantized coefficients will be zero, since in most cases only a few frequencies will exist to a significant (i.e. visible) degree within any given block. Of the remaining few coefficients, most will have small values.

Together, the presence of many successive zeros and of small values in the coefficients allows for very efficient /coding/; that is, few bits are generally needed to represent all the quantized coefficients in the block. A single count of successive zeros represents several coefficients by using a technique similar to run-length compression. Common values of run lengths and of small coefficient values are assigned short sequences of bits, while less frequent values are assigned longer sequences; this is known as "Huffman coding," and is quite common in many types of data compression. These bit sequences form the compressed block.

JPEG has a compression parameter called the "quality" level which determines the lossiness of the compression. It varies from 0 to 100, and is used to determine the coarseness of the quantizations used for the coefficients (see above).
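
The quantizing and zero-counting steps described above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the step-size rule (1 + coarseness * (u + v)) is invented and is not the table any real JPEG program uses, the function names are hypothetical, and the Huffman step is left out entirely. The sketch shows only the two ideas from the text: higher frequencies get coarser steps, and runs of zeros collapse into a single count.

```python
def quantize(coeffs, coarseness):
    """Round each coefficient to a multiple of a step that grows with frequency.

    The step rule is a made-up illustration. Position (0, 0), the average
    brightness, gets a step of 1 here; the text notes it is handled separately.
    """
    n = len(coeffs)
    return [[round(coeffs[u][v] / (1 + coarseness * (u + v))) for v in range(n)]
            for u in range(n)]


def zero_runs(values):
    """Encode a sequence as (zeros_skipped, nonzero_value) pairs plus an end marker."""
    pairs = []
    run = 0
    for value in values:
        if value == 0:
            run += 1          # keep counting successive zeros
        else:
            pairs.append((run, value))
            run = 0
    pairs.append("END")       # trailing zeros need no entry of their own
    return pairs


if __name__ == "__main__":
    tiny = [[100, 40], [40, 10]]     # a 2x2 stand-in for a coefficient block
    print(quantize(tiny, 4))         # fine steps: [[100, 8], [8, 1]]
    print(quantize(tiny, 20))        # coarse steps: [[100, 2], [2, 0]]
    print(zero_runs([12, 0, 0, -3, 0, 0, 0, 1, 0, 0]))
    # [(0, 12), (2, -3), (3, 1), 'END']
```

With the coarser setting the highest-frequency value drops to zero and the others shrink, which is exactly what makes the zero-run counting pay off.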
Lower quality levels result in many zero coefficients and in small values for the nonzero coefficients, allowing for high degrees of compression at the expense of image blurring and visible artifacts. Somewhere between 75 and 90 (depending upon the image) the resulting image will be visually indistinguishable from the original under normal viewing conditions, while still allowing compression of ten to thirty times over the original RGB image--equivalent to three to eight times over a GIF of the same image. At the other end of the scale, quality factors of 20 or so result in images which are compressed several times further and are of good enough quality for previewing or indexing, even though they are obviously degraded versions of the original.

JPEG And GIF

JPEG was designed to work on full-color images where red, green, and blue values are individually kept for each pixel. GIFs only support a single value per pixel, and use a palette of up to 256 colors to approximate the colors in the image. It is easy to convert a GIF into these separate color values so processing like JPEG can be applied, but we have to remember that the GIF has already placed "noise" into the higher frequencies in order to represent colors more faithfully. The eye isn't sensitive to this false detail, but unfortunately JPEG isn't quite as able to ignore it. This means a GIF generally can't be compressed as much as the original RGB image that came from the scanner or camera could without a visible degradation in quality. Certain dithering techniques can actually result in images which cannot be compressed at all without visible artifacts (fortunately, these images seem to be getting less common).

Another factor comes into play when JPEGs are turned into GIFs. Since JPEG produces a full-color image, color reduction, as previously described, has to take place before a GIF is produced. There are several things which can result in less-than-optimal results at this point.
One is that high-quality dithering can actually take more CPU time than JPEG processing, so JPEG software sometimes cuts corners in this step. Another problem is that the artifacts of color reduction sometimes combine with the artifacts of the JPEG process in such a way as to render them visible. In addition, if the JPEG file was created from a GIF, chances are good that the palette and dithering will differ from the original, leading to a difference in appearance that can easily be interpreted as a degradation.

Nonetheless, satisfactory results in GIF-to-JPEG-to-GIF processing are often possible, and some people routinely use JPEG on many of their GIFs. It is probably a good idea to only try for compressions of two to four times when handling GIFs, though, due to the problems just discussed.

JPEG And True-color Displays

JPEG is at its best when no color reductions occur between the image source and the device (usually a computer's monitor) finally used for display. Unfortunately, most computers are unable to display 24-bit/pixel images, so for these computers color reduction must be done prior to viewing the image. Still, if care is taken at this step the result will be virtually the same as when the color reduction was done to the original 24-bit image in turning it into a GIF. And on a true-color display--rapidly becoming more common as prices for such video cards fall into the under-$200 range--the superior quality of a non-color-reduced image can be seen.

However, even if you don't have a true-color video card in your computer now, you'll probably have one at some point in the future. At that point, JPEG-compressed full-color images will look much better than GIFs. So if you're an image collector, it makes sense to get JPEG images whenever you have a choice.

A Final Word On Lossy Compression

In a world where data storage was free, and data transmission was instantaneous, lossy compression would be useless. But we don't live in a perfect world.
Disk space costs money, modem transfers take time and, perhaps, incur long-distance charges, computer networks bog down as exchanging images becomes more and more common--in short, it makes a lot of sense to use data compression wherever it can reduce such costs.

But why use "lossy" compression? Isn't that a bit like accepting damaged goods just because they are cheaper?

In a word, no. For example, which is more worthwhile, a 512x384 GIF or a 1024x768 high-quality-level JPEG of the same image, if both cost the same to acquire and store? I don't know about you, but I'd much rather have the JPEG. A quadrupling in transmission and storage costs is much too high a price to pay for a near-invisible increase in image quality--but if I have a 24-bit true-color display, the 4X-larger GIF will actually look inferior! So the real comparison is between the original 24-bit image and JPEG, perhaps an 18:1 difference in size.

In brief, not using JPEG can cause "lossy compression" of your wallet...