home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
OS/2 Shareware BBS: Graphics
/
Graphics.zip
/
JPEG2OS2.ZIP
/
JPEGFAQ.TXT
< prev
next >
Wrap
Internet Message Format
|
1992-02-07
|
29KB
From: tgl+@cs.cmu.edu (Tom Lane)
Subject: JPEG image compression: Frequently Asked Questions
Summary: trying to clear up some of the confusion
Keywords: JPEG, image compression, FAQ
Organization: School of Computer Science, Carnegie Mellon
Archive-name: jpeg-faq
Last-modified: 24 Jan 1992
This FAQ article discusses JPEG image compression. Suggestions for
additions and clarifications are welcome.
This article includes the following sections:
1) What is JPEG?
2) Why use JPEG?
3) How well does it work?
4) What are good "quality" settings for JPEG?
5) When should I use JPEG, and when should I stick with GIF?
6) Where can I get JPEG software?
7) What's all this hoopla about color quantization?
8) How does JPEG work?
9) What about lossless JPEG?
10) Why all the argument about file formats?
11) And what about arithmetic coding?
12) Does loss accumulate with repeated compression/decompression?
Sections 7 and up can be ignored unless you are interested in details.
1) What is JPEG?
JPEG (pronounced "jay-peg") is a standardized image compression mechanism.
JPEG stands for Joint Photographic Experts Group, the original name of the
committee that wrote the standard. JPEG is designed for compressing either
full-color or gray-scale digital images of "natural" (real-world) scenes.
JPEG does not handle black-and-white (1-bit-per-pixel) images, nor does it
handle motion picture compression. (Standards for compressing those types
of images are being worked on by other committees, named JBIG and MPEG
respectively.)
JPEG is "lossy", meaning that the image you get out of decompression isn't
quite identical to what you originally put in. The algorithm achieves much
of its compression by exploiting known limitations of the human eye, notably
the fact that small color details aren't perceived as well as small details
of light-and-dark. Thus, JPEG is intended for compressing images that will
be looked at by humans. If you plan to machine-analyze your images, the
small errors introduced by JPEG may be a problem for you, even if they are
invisible to the eye.
A useful property of JPEG is that the degree of lossiness can be varied by
adjusting compression parameters. This means that the image maker can trade
off file size against output image quality. You can make *extremely* small
files if you don't mind poor quality; this is useful for indexing image
archives, making thumbnail views or icons, etc. etc. Conversely, if you
aren't happy with the output quality at the default compression setting, you
can jack up the quality until you are satisfied, and accept lesser compression.
2) Why use JPEG?
There are two good reasons: to make your image files smaller, and to store
24-bit-per-pixel color data instead of 8-bit-per-pixel data.
Making image files smaller is a big win for transmitting files across
networks and for archiving libraries of images. Being able to compress a
2 Mbyte full-color file down to 100 Kbytes or so makes a big difference in
disk space and transmission time! (If you are comparing GIF and JPEG, the
size ratio is more like four to one. More details below.)
Unless your viewing software supports JPEG directly, you'll have to convert
JPEG to some other format for viewing or manipulating images. (Right now,
not many viewers include JPEG support, but this will change rapidly.) Thus,
using JPEG is essentially a time/space tradeoff: you give up some time in
order to store or transmit an image more cheaply.
It's worth noting that when network or phone transmission is involved, the
time savings from transferring a shorter file can be much greater than the
extra time to decompress the file. I'll let you do the arithmetic yourself.
The other reason that JPEG will gradually replace GIF is that it stores full
color information: 24 bits/pixel (16 million colors) instead of 8 or less
(256 or fewer colors). If you have only 8-bit display hardware then this
may not seem like much of an advantage to you. Within a couple of years,
though, GIF will look as obsolete as black-and-white MacPaint format does
today. Furthermore, for reasons detailed in section 7, JPEG is far more
useful than GIF for exchanging images among people with widely varying color
display hardware. Hence JPEG is considerably more appropriate than GIF for
use as a Usenet posting standard.
3) How well does it work?
Pretty darn well. Here are some sample file sizes for an image I have handy,
a 727x525 full-color image of a ship in a harbor. The first three files are
for comparison purposes; the rest were created with the free JPEG software
described further on.
File Size in bytes Comments
ship.ppm 1145040 Original file in PPM format (no compression;
3 bytes per pixel plus a few bytes overhead)
ship.ppm.Z 963829 PPM file passed through Unix compress
compress doesn't accomplish a lot, you'll note.
ship.gif 240438 Converted to GIF with ppmquant -fs 256 | ppmtogif
Most of the savings is the result of losing color
info: GIF saves 8 bits/pixel, not 24. (See sec. 7.)
ship.jpg95 155607 cjpeg -Q 95 (highest useful quality setting)
This is indistinguishable from the 24-bit original,
at least to my nonprofessional eyeballs.
ship.jpg75 57995 cjpeg -Q 75 (default setting)
You have to look mighty darn close to distinguish this
from the original, even with both on-screen at once.
ship.jpg50 38399 cjpeg -Q 50
This has slight defects; if you know what to look
for, you could tell it's been JPEGed without seeing
the original. Still as good image quality as many
recent postings in Usenet pictures groups.
ship.jpg25 25186 cjpeg -Q 25
JPEG's characteristic "blockiness" becomes apparent
at this setting (djpeg -b helps some). I've seen
plenty of Usenet postings that were uglier than this.
ship.jpg5o 6597 cjpeg -Q 5 -o
Blocky, but perfectly satisfactory for preview or
indexing purposes. Note that this file is TINY.
In this case JPEG can make a file that's a factor of four or five smaller
than a GIF of comparable quality (the -Q 75 file is every bit as good as the
GIF, better if you have a full-color display). This seems to be a typical
ratio for real-world scenes.
4) What are good "quality" settings for JPEG?
(Note: the -Q settings discussed in this article apply to the free JPEG
software described in section 6. Other JPEG implementations, such as Image
Alchemy, may use a completely different quality scale.)
The name of the game in using JPEG is to pick the lowest quality setting
(smallest file size) that decompresses into an image indistinguishable from
the original. This setting will vary from one image to another and from one
observer to another, but here are some rules of thumb.
The default quality setting (-Q 75) is very often the best choice. This
setting is about the lowest you can go without expecting to see defects in a
typical image. Try -Q 75 first; if you see defects, then go up. Except for
experimental purposes, never go above -Q 95; saying -Q 100 will produce a
file two or three times as large as -Q 95, but of hardly any better quality.
If the image was less than perfect quality to begin with, you might be able
to go down to -Q 50 without objectionable degradation. On the other hand,
you might need to go to a HIGHER quality setting to avoid further degradation.
This last case seems to apply most of the time when converting GIFs to JPEG
(see next section for more information).
If you are prepared to tolerate visible defects, and want a very small file
(say for preview or indexing purposes), a -Q setting in the range of 5 to 10
is about right. -Q 2 or so may be amusing as "op art".
Another recommendation: when you are making a final version of an image for
posting on Usenet or archiving, specify "-o" to cjpeg. This will make the
file a little smaller without affecting image quality; it will take longer
to do the compression, but not any longer to decompress.
5) When should I use JPEG, and when should I stick with GIF?
As a rule of thumb, JPEG is superior to GIF for storing full-color and
gray-scale images of "realistic" scenes; that means scanned photographs and
similar material. JPEG is superior even if you don't have 24-bit display
hardware, and it is a LOT superior if you do. (See section 7 for details.)
GIF does significantly better on images with only a few distinct colors,
such as cartoons and line drawings. In particular, large areas of pixels
that are all *exactly* the same color are compressed very efficiently indeed
by GIF. JPEG can't squeeze these files as much as GIF does without
introducing visible defects. This sort of image is best kept in GIF form.
(Incidentally, many people have developed an odd habit of putting a large
constant-color border around a GIF image. This was nearly free in terms of
storage cost in GIF files. It is NOT free in JPEG files. Do yourself a
favor and don't add a border; let your viewer blank out unused screen area.
If you're converting a GIF to JPEG, crop off any border first.)
JPEG also has a hard time with very sharp edges: a row of pure-black pixels
adjacent to a row of pure-white pixels, for example. Sharp edges tend to
come out blurred unless you use a very high quality setting. Again, this
sort of thing is not found in scanned photographs, but it shows up fairly
often in GIF files: borders, overlaid text, etc. The blurriness is
particularly objectionable with text that's only a few pixels high. If you
have a GIF with a lot of small-size overlaid text, don't JPEG it.
Computer-drawn images (ray-traced scenes, for instance) usually fall between
scanned images and cartoons in terms of complexity. The more complex and
subtly rendered the image, the more likely that JPEG will do well on it.
If you have an existing library of GIF images, you may wonder whether you
should convert them to JPEG. You will lose some image quality if you do,
but the disk space savings may justify converting anyway. (Section 7, which
argues that JPEG image quality is superior to GIF, only applies if both
formats start from a full-color original. If you start from a GIF, you've
already irretrievably lost a great deal of information; JPEG can only make
things worse.)
Experience to date suggests that large, high-quality GIFs are the best
candidates for conversion to JPEG. They chew up the most storage so offer
the most potential savings, and they convert to JPEG with least degradation.
Don't waste your time converting any GIF much under 100 Kbytes. Also, don't
expect JPEG files converted from GIFs to be as small as those created
directly from full-color originals. To maintain image quality you may have
to let the converted files be as much as twice as big as straight-through
JPEG files would be (i.e., shoot for 1/2 or 1/3rd the size of the GIF file,
not 1/4th as shown in the earlier comparisons). A -Q setting of 85 to 95
will minimize the additional degradation caused by converting a GIF to JPEG.
6) Where can I get JPEG software?
If you are looking for "canned" software, viewers, etc:
There is not a lot of pre-built, no-thought-required JPEG software available
yet. This short list will no doubt grow with time.
For X Windows, John Bradley's XV version 2.00 is the best JPEG viewer around.
It's available for FTP from export.lcs.mit.edu or grip.cis.upenn.edu. The
file is called 'xv-2.00.tar.Z' and is located in the 'contrib' directory on
export or the 'pub' directory on grip. XV's only real shortcoming is that
it does not fully exploit 24-bit displays (it reduces all images to 8 bits).
If you have a 24-bit display you will get better results from "xloadimage",
which is also available from export, file contrib/xloadimage.3.01.tar.Z.
This version does not read JPEG files, but it will read the PPM files put
out by the free JPEG converter described below. There is also a patched
version called "xli" (see files xli.* in same directory) that does read JPEG
directly. However, xli is a quick hack rather than an official release;
caveat user.
For MS-DOS, Handmade Software offers two (rather pricy) shareware programs:
Image Alchemy and GIF2JPG/JPG2GIF (contact hsi@netcom.com for details). The
PC versions of these programs are available for FTP from wuarchive.wustl.edu,
directory mirrors/msdos/graphics, files alchmy15.zip and gif2jpg5.zip; also
from SIMTEL20 and its other mirror sites. (Image Alchemy is also available
as an executable for Sun Unix machines, but I don't know where to find it.)
GIF2JPG/JPG2GIF only perform JPEG<=>GIF format conversion. Image Alchemy
converts files between these and many other formats, and can also display
images on some types of hardware. The display option is pretty limited,
so you'll still want a separate viewer program. (WARNING: GIF2JPG produces
a proprietary file format unless you specify -j. Be sure to use -j if you
want to exchange JPEG files with other Usenet users. For that matter, it's
not real clear that you should be posting JPEG files made from GIFs; see
section 5.)
For the Macintosh, Storm Technology has released a free program that can
decode and view JPEG images (though not create them). This is called
Picture Decompress. Make sure you get version 2.0.1 or later; earlier
versions are not compatible with JFIF file format. This program can be
FTPed from sumex-aim.stanford.edu, directory /info-mac/app, file
picture-decompress-201.hqx. You'll also need a tool for adjusting file type
codes; set the type of a downloaded image file to 'JPEG' to allow Picture
Decompress to open it.
If none of the above fits your situation, you can obtain and compile the
free JPEG converter program described below. You'll also need a viewer
program, and if your viewer only handles GIF files, you'll want a separate
color quantization program (we recommend ppmquant from the PBMPLUS package
for Unix machines; on PCs, try Piclab). This last requirement will go away
with the next release of the free code.
There are numerous commercial JPEG offerings, with more popping up every
day. I recommend that you not spend money on one of these unless you find
the available free or shareware software vastly too slow. In that case,
purchase a hardware-assisted product. Ask pointed questions about whether
the product complies with the final JPEG standard and about whether it can
handle the JFIF file format; many of the earliest commercial releases are
not and never will be compatible with anyone else's files.
If you are looking for source code to work with:
Free, portable C code for JPEG compression is available from the Independent
JPEG Group, which I lead. A package containing our source code,
documentation, and some small test files is available from several places.
The "official" archive site for this source code is ftp.uu.net (137.39.1.9
or 192.48.96.9). Look under directory /graphics/jpeg; the current release
is jpegsrc.v2.tar.Z. (This is a compressed TAR file; don't forget to
retrieve in binary mode.) You can retrieve this file by FTP or UUCP. Folks
in Europe may find it easier to FTP from nic.funet.fi (see directory
pub/graphics/programs/jpeg). The source code is also available on
CompuServe, in the GRAPHSUPPORT forum (GO PICS), library 10, as jpsrc2.zip.
The free JPEG code provides conversion between JPEG "JFIF" format and image
files in PBMPLUS PPM, Utah RLE, Truevision Targa, and GIF file formats.
(However, output to GIF format is not of high quality at present; ditto for
colormapped Targa and RLE formats.) The core compression and decompression
modules can easily be reused in other programs, such as image viewers. The
package is highly portable; we have tested it on many machines ranging from
PCs to Crays.
We have released this software for both noncommercial and commercial use.
Companies are welcome to use it as the basis for JPEG-related products.
We do not ask a royalty, although we do ask for an acknowledgement in
product literature (see the README file in the distribution for details).
We hope to make this software industrial-quality --- although, as with
anything that's free, we offer no warranty and accept no liability.
The Independent JPEG Group is a volunteer organization; if you'd like to
contribute to improving our software, you are welcome to join.
If you are not reasonably handy at configuring and installing portable C
programs, you may have some difficulty installing the free source code.
Steve Davis (strat@cis.ksu.edu) has volunteered to maintain an archive of
pre-built executable versions of the free JPEG code for various machines.
His FTP archive is at procyon.cis.ksu.edu (129.130.10.80 -- this number is
due to change soon); look under /pub/JPEG to see what he currently has.
(The administrators of this system ask that FTP traffic be limited to
non-prime hours.) This archive is not maintained by the Independent JPEG
Group, and files in it may not represent the latest source code.
7) What's all this hoopla about color quantization?
Most people don't have full-color (24 bit per pixel) display hardware.
Typical display hardware stores 8 or fewer bits per pixel, so it can display
256 or fewer distinct colors at a time. To display a full-color image, the
computer must map the image into an appropriate set of representative
colors. This process is called "color quantization".
Clearly, color quantization is a lossy process. It turns out that for most
images, the details of the color quantization algorithm have MUCH more impact
on the final image quality than do any errors introduced by JPEG (except at
the very lowest JPEG quality settings).
Since JPEG is inherently a full-color format, converting a JPEG image for
display on 8-bit-or-less hardware requires color quantization. But a GIF
image, by definition, has already been quantized to 256 or fewer colors.
For purposes of Usenet picture distribution, GIF has the advantage that the
sender precomputes the color quantization and recipients don't have to.
This is also the *disadvantage* of GIF: you're stuck with the sender's
quantization. If the sender quantized to a different number of colors than
what you can display, you have to re-quantize, resulting in much poorer
image quality than if you had quantized once from a full-color image.
Furthermore, if the sender didn't use a high-quality color quantization
algorithm, you're out of luck.
For this reason, JPEG offers the promise of *significantly better* image
quality for all users whose machines don't match the sender's display
hardware. JPEG's full color image can be quantized to precisely match the
user's display hardware. Furthermore, you will be able to take advantage of
future improvements in quantization algorithms (there is a lot of active
research in this area), or purchase better display hardware, to get a better
view of JPEG images you already have. With GIF, you're stuck forevermore
with what was sent.
It's also worth mentioning that many GIF-viewing programs include rather
shoddy quantization routines. If you view a 256-color GIF on a 16-color EGA
display, for example, you are probably getting a much worse image than you
need to. This is partly an inevitable consequence of doing two color
quantizations (one to create the GIF, one to display it), but often it's
also due to sloppiness. JPEG conversion programs will be forced to use
high quality quantizers in order to get acceptable results at all, and in
normal use they will quantize directly to the number of colors to be
displayed. Thus, JPEG is likely to provide better results than the average
GIF program for low-color-resolution displays as well as high-resolution ones!
(Incidentally, the current "V2" release of the free JPEG software does NOT
include a good color quantizer; we assume you have a separate quantizer
program, such as ppmquant from the PBMPLUS package. For this reason we
don't recommend using the free software as a JPEG->GIF converter. This
shortcoming will be fixed in the next release.)
8) How does JPEG work?
The buzz-words to know are chrominance subsampling, discrete cosine
transforms, coefficient quantization, and Huffman or arithmetic entropy
coding. This article's long enough already, so I'm not going to say more
than that. For a good technical introduction, see:
Wallace, Gregory K. "The JPEG Still Picture Compression Standard",
Communications of the ACM, April 1991 (vol. 34 no. 4), pp. 30-44.
(Adjacent articles in that issue discuss MPEG motion picture compression,
applications of JPEG, and related topics.) If you don't have the CACM issue
handy, a PostScript file containing a revised version of this article is
available at ftp.uu.net, graphics/jpeg/wallace.ps.Z. The file (actually a
preprint for an article to appear in IEEE Trans. Consum. Elect.) omits the
sample images that appeared in CACM, but it includes corrections and some
added material. Note: the Wallace article is copyright ACM and IEEE, and
it may not be used for commercial purposes.
9) What about lossless JPEG?
There's a great deal of confusion on this subject. The JPEG committee did
define a truly lossless compression algorithm, i.e., one that guarantees the
final output is bit-for-bit identical to the original input. However, this
lossless mode has almost nothing in common with the regular, lossy JPEG
algorithm. As far as I know, the lossless JPEG mode is not implemented in
any software available to the public.
Saying "-Q 100" to the free JPEG software DOES NOT get you a lossless image.
What it does get rid of is deliberate information loss in the coefficient
quantization step. There is still a good deal of information loss in the
color subsampling step. (There should be a command line switch to disable
subsampling, but as of today, there isn't one.)
Even with both quantization and subsampling turned off, the standard JPEG
algorithm is not truly lossless, because it is subject to roundoff errors in
various calculations. The maximum error is a few counts in any one pixel
value; it's highly unlikely that this could be perceived by the human eye,
but it might be a concern if you are doing machine processing of an image.
At this minimum-loss setting, standard JPEG produces files that are perhaps
half the size of an uncompressed 24-bit-per-pixel image. JPEG's true
lossless mode provides roughly the same amount of compression, but it
guarantees bit-for-bit accuracy.
If you have an application requiring lossless storage of images with less
than 6 bits per pixel (per color component), you may want to look into the
JBIG bilevel image compression standard. This performs better than JPEG
lossless on such images. JPEG lossless is superior to JBIG on images with
8 or more bits per pixel; furthermore, it is public domain, while the JBIG
techniques are heavily covered by patents.
10) Why all the argument about file formats?
Strictly speaking, JPEG refers only to a family of compression algorithms;
it does *not* refer to a specific image file format. The JPEG committee was
prevented from defining a file format by turf wars within the international
standards organizations.
Since we can't actually exchange images with anyone else unless we agree on
a common file format, this leaves us with a problem. In the absence of
official standards, a lot of JPEG program writers have just gone off to
"do their own thing", and as a result their programs aren't compatible with
anybody else's.
The closest thing we have to a de-facto standard JPEG format is some work
that's been coordinated by people at C-Cube Microsystems. They have defined
two JPEG-based file formats:
* JFIF (JPEG File Interchange Format), a "low-end" format that transports
pixels and not much else.
* TIFF/JPEG, an extension of the Aldus TIFF format. TIFF is a "high-end"
format that will let you record just about everything you ever wanted to
know about an image, and a lot more besides :-). TIFF is a lot more
complex than JFIF, and may well prove less transportable, because
different vendors have historically implemented slightly different and
incompatible subsets of TIFF. It's not likely that adding JPEG to the mix
will do anything to improve this situation.
Both of these formats were developed with input from all the major vendors
of JPEG-related products; it's reasonably likely that future commercial
products will adhere to one or both standards. (However, as of right now,
January 1992, it's too early for many such products to have appeared.)
A particular case that people may be interested in is Apple's QuickTime
software for the Macintosh. QuickTime uses a JFIF-compatible format wrapped
inside the Mac-specific PICT structure. Conversion between JFIF and
PICT/JPEG should be pretty straightforward; in fact Apple may release a
utility for the purpose.
I believe that Usenet should adopt JFIF as the replacement for GIF in
picture postings. JFIF is simpler than TIFF and is available now; the
TIFF/JPEG spec is still being hammered out. Even when TIFF/JPEG is
available, the JFIF format is likely to be a widely supported "lowest
common denominator"; TIFF/JPEG files may never be as transportable.
11) And what about arithmetic coding?
The JPEG spec defines two different "back end" modules for the final output
of compressed data: either Huffman coding or arithmetic coding is allowed.
The choice has no impact on image quality, but arithmetic coding usually
produces a smaller compressed file. On typical images, arithmetic coding
produces a file 5 or 10 percent smaller than Huffman coding. (All the
file-size numbers previously cited are for Huffman coding.)
Unfortunately, the particular variant of arithmetic coding specified by the
JPEG standard is subject to patents owned by IBM, AT&T, and Mitsubishi.
Thus *you cannot legally use arithmetic coding* unless you obtain licenses
from these companies. (The "fair use" doctrine allows people to implement
and test the algorithm, but actually storing any images with it is dubious
at best.)
At least in the short run, I recommend that people not worry about
arithmetic coding; the space savings isn't great enough to justify the
potential legal hassles. In particular, arithmetic coding *should not*
be used for any images to be exchanged on Usenet.
There is some small chance that the legal situation may change in the
future. Stay tuned for further details.
12) Does loss accumulate with repeated compression/decompression?
It would be nice if, having compressed an image with JPEG, you could
decompress it, manipulate it (crop off a border, say), and recompress it
without any further image degradation beyond what you lost initially.
Unfortunately THIS IS NOT THE CASE. In general, recompressing an altered
image loses more information, though (usually) not as much as was lost the
first time around.
The next best thing would be that if you decompress an image and recompress
it *without changing it* then there is no further loss, i.e., you get an
identical JPEG file. Even this is not true; at least, not with the current
free JPEG software. It's essentially a problem of accumulation of roundoff
error. If you repeatedly compress and decompress, the image will eventually
degrade to where you can see visible changes from the first-generation
output. (It usually takes many such cycles to get visible change.)
One of the things on our to-do list is to see if accumulation of error can
be avoided or limited, but I am not hopeful about it.
In any case, the most that could possibly be guaranteed would be that
compressing the unmodified full-color output of djpeg, at the original
quality setting, would introduce no further loss. Even such simple changes
as cropping off a border could cause further roundoff-error degradation.
(If you're wondering why, it's because the pixel-block boundaries move.
If you cropped off only multiples of 16 pixels, you might be safe, but
that's a mighty limited capability!)
The bottom line is that JPEG is a useful format for archival storage and
transmission of images, but you don't want to use it as an intermediate
format for sequences of image manipulation steps. Use a lossless format
(PPM, RLE, TIFF, etc) while working on the image, then JPEG it when you are
ready to file it away. Aside from avoiding degradation, you will save a lot
of compression/decompression time this way :-).
---------------------
For more information about JPEG in general or the free JPEG software in
particular, contact the Independent JPEG Group at jpeg-info@uunet.uu.net.
--
tom lane
organizer, Independent JPEG Group
Internet: tgl@cs.cmu.edu BITNET: tgl%cs.cmu.edu@carnegie