PNG (Portable Network Graphics) Specification, Tenth Draft

Revision date: 5 May, 1995

10. Recommendations for Decoders

This chapter gives some recommendations for decoder behavior. The only absolute requirement on a PNG decoder is that it successfully read any file conforming to the format specified in the preceding chapters. However, best results will usually be achieved by following these recommendations.

Chunk error checking

Unknown chunk types must be handled as described under Chunk naming conventions.

It is strongly recommended that decoders verify the CRC on each chunk.

For known-length chunks such as IHDR, decoders should treat an unexpected chunk length as an error. Future extensions to this specification will not add new fields to existing chunks; instead, new chunk types will be added to carry any new information.

Unexpected values in fields of known chunks (for example, an unexpected compression type in the IHDR chunk) should be checked for and treated as errors.
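For concreteness, the recommended CRC verification can be sketched as follows. PNG uses the standard 32-bit CRC with the reflected polynomial 0xEDB88320, preconditioned and postconditioned with all ones, computed over the chunk type and chunk data fields (but not the length field). The function and variable names below are illustrative, not part of this specification.

```c
#include <stddef.h>

/* CRC-32 as used by PNG (reflected polynomial 0xEDB88320),
 * computed over the chunk type and data bytes.
 * The table is built on first use. */
static unsigned long crc_table[256];
static int crc_table_ready = 0;

static void make_crc_table(void)
{
    unsigned long c;
    int n, k;
    for (n = 0; n < 256; n++) {
        c = (unsigned long) n;
        for (k = 0; k < 8; k++)
            c = (c & 1) ? 0xEDB88320UL ^ (c >> 1) : c >> 1;
        crc_table[n] = c;
    }
    crc_table_ready = 1;
}

unsigned long png_crc32(const unsigned char *buf, size_t len)
{
    unsigned long c = 0xFFFFFFFFUL;   /* precondition with all ones */
    size_t i;
    if (!crc_table_ready)
        make_crc_table();
    for (i = 0; i < len; i++)
        c = crc_table[(c ^ buf[i]) & 0xFF] ^ (c >> 8);
    return c ^ 0xFFFFFFFFUL;          /* postcondition with all ones */
}
```

A decoder would compute this over each chunk's type and data bytes and compare the result against the four CRC bytes that follow the chunk data.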

Pixel dimensions

Non-square pixels can be represented (see the pHYs chunk), but viewers are not required to account for them; a viewer may present any PNG file as though its pixels are square.

Conversely, viewers running on display hardware with non-square pixels are strongly encouraged to rescale images for proper display.
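As an illustrative sketch of such rescaling (the helper name, and the choice to keep the width fixed and stretch the height, are assumptions for illustration, not requirements):

```c
/* Given the pHYs pixels-per-unit values, compute the height to use
 * on square-pixel display hardware, keeping the width fixed.
 * A pixel is 1/ppux units wide and 1/ppuy units tall, so when
 * ppux > ppuy each file pixel is taller than it is wide and the
 * image must be stretched vertically by ppux/ppuy. */
long scaled_display_height(long height, long ppux, long ppuy)
{
    if (ppux <= 0 || ppuy <= 0)    /* no usable aspect information */
        return height;
    return (height * ppux + ppuy / 2) / ppuy;   /* rounded */
}
```

A viewer might equally well keep the height fixed and scale the width, or scale both axes to fit the available screen area.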

Truecolor image handling

To achieve PNG's goal of universal interchangeability, decoders are required to accept all types of PNG image: palette, truecolor, and grayscale. Viewers running on palette-mapped display hardware need to be able to reduce truecolor images to palette form for viewing. This process is usually called "color quantization".

A simple, fast way of doing this is to reduce the image to a fixed palette. Palettes with uniform color spacing ("color cubes") are usually used to minimize the per-pixel computation. For photograph-like images, dithering is recommended to avoid ugly contours in what should be smooth gradients; however, dithering introduces graininess which may be objectionable.

The quality of rendering can be improved substantially by using a palette chosen specifically for the image, since a color cube usually has numerous entries that are unused in any particular image. This approach requires more work, first in choosing the palette, and second in mapping individual pixels to the closest available color. PNG allows the encoder to supply a suggested palette in a PLTE chunk, but not all encoders will do so, and the suggested palette may be unsuitable in any case (it may have too many or too few colors). High-quality viewers will therefore need to have a palette selection routine at hand. A large lookup table is usually the most feasible way of mapping individual pixels to palette entries with adequate speed.
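The closest-color mapping can be sketched as a brute-force nearest-neighbor search in RGB space; as noted above, a production viewer would replace the per-pixel search with a precomputed lookup table. The function name is illustrative.

```c
/* Return the palette index whose RGB entry is nearest (in squared
 * Euclidean distance) to the requested color. Illustrative only;
 * a real viewer would precompute a lookup table rather than
 * searching for every pixel. */
int nearest_palette_index(const unsigned char palette[][3],
                          int num_entries,
                          int r, int g, int b)
{
    int i, best = 0;
    long best_dist = 0x7FFFFFFFL;
    for (i = 0; i < num_entries; i++) {
        long dr = r - palette[i][0];
        long dg = g - palette[i][1];
        long db = b - palette[i][2];
        long dist = dr * dr + dg * dg + db * db;
        if (dist < best_dist) {
            best_dist = dist;
            best = i;
        }
    }
    return best;
}
```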

Numerous implementations of color quantization are available. The PNG reference implementation will include code for the purpose.

Decoder gamma handling

To produce correct tone reproduction, a good image display program must take into account the gammas of both the image file and the display device. This can be done by calculating
  gbright := pixelval / MAXPIXVAL
  bright := gbright ^ (1.0 / file_gamma)
  gcvideo := bright ^ (1.0 / display_gamma)
  fbval := ROUND(gcvideo * MAXFBVAL)
where MAXPIXVAL is the maximum pixel value in the file (255 for 8-bit, 65535 for 16-bit, etc), MAXFBVAL is the maximum value of a frame buffer pixel (255 for 8-bit, 31 for 5-bit, etc), pixelval is the value of the pixel in the PNG file, and fbval is the value to write into the frame buffer. The first line converts from pixel code into a normalized 0 to 1 floating point value, the second undoes the encoding of the image file to produce a linear brightness value, the third line pre-corrects for the monitor's gamma response, and the fourth converts to an integer frame buffer pixel. In practice the second and third lines can be merged into
  gcvideo := gbright ^ (1.0 / (file_gamma * display_gamma))
so as to perform only one power calculation. For color images, the entire calculation is performed separately for R, G, and B values.

It is not necessary to perform transcendental math for every pixel! Instead, compute a lookup table that gives the correct output value for every pixel value. This requires only 256 calculations per image (for 8-bit accuracy), not one calculation per pixel. For palette-based images, a one-time correction of the palette is sufficient.

In some cases even computing a gamma lookup table may be a concern. In these cases, viewers are encouraged to have precomputed gamma correction tables for file_gamma values of 1.0 and 0.45 and some reasonable single display_gamma value, and to use the table closest to the gamma indicated in the file. This will produce acceptable results for the majority of real files.

When the incoming image has unknown gamma (no gAMA chunk), choose a likely default file_gamma value, but allow the user to select a new one if the result proves too dark or too light.

In practice, it is often difficult to determine what value of display_gamma to use. On systems with no built-in gamma correction, display_gamma is determined entirely by the CRT; assuming a value of 2.2 is recommended unless you have detailed calibration measurements of the particular CRT available.

However, many modern frame buffers have lookup tables that are used to perform gamma correction, and on these systems the display_gamma value should be the gamma of the lookup table and CRT combined. You may not be able to find out what the lookup table contains from within an image viewer application, so you may have to ask the user what the system's gamma value is. Unfortunately, different manufacturers use different ways of specifying what should go into the lookup table, so interpretation of the system gamma value is system-dependent.

Here are examples of how to deal with some known systems:

We should point out that there is a fudge factor built into the use of the magic value "2.2" as the assumed CRT gamma in the calculations above. Real CRTs usually have a higher gamma than this, around 2.8 in fact. By doing the display gamma correction for a CRT gamma of only 2.2, we get an image on screen that is slightly higher in contrast than the original scene. This is normal TV and film practice, and we continue it here. Generally, writers of display programs are best advised to assume a CRT gamma of 2.2 rather than use actual measurements.

If you have carefully measured the gamma of your CRT, you might want to set display_gamma to your_CRT_gamma/1.25, in order to preserve this intentional contrast boost.

Finally, note that the response of real displays is actually more complex than can be described by a single number (display_gamma). If actual measurements of the monitor's light output as a function of voltage input are available, the third and fourth lines of the computation above may be replaced by a lookup in these measurements, to find the actual frame buffer value that most nearly gives the desired brightness.

Background color

Viewers which have a specific background against which to present the image will ignore the bKGD chunk, but viewers with no preset background color may choose to honor it. The background color will typically be used to fill unused screen space around the image, as well as any transparent pixels within the image. (Thus, bKGD is valid and useful even when the image does not use transparency.) If no bKGD chunk is present, the viewer must make its own decision about a suitable background color.

Alpha channel processing

In the most general case, the alpha channel can be used to composite a foreground image against a background image; the PNG file defines the foreground image and the transparency mask, but not the background image. Decoders are not required to support this most general case. It is expected that most will be able to support compositing against a single background color, however.

The equation for computing a composited pixel value is

  output := alpha * foreground + (1-alpha) * background
where alpha and the input and output sample values are expressed as fractions in the range 0 to 1. This computation should be performed with linear (non-gamma-corrected) sample values. For color images, the computation is done separately for R, G, and B samples.

The following code illustrates the general case of compositing a foreground image over a background image. It assumes that you have the original pixel data available for the background image, and that output is to a frame buffer for display. Other variants are possible; see the comments below the code. The code allows the bit depths and gamma values of foreground image, background image, and frame buffer/CRT to all be different. Don't assume they are the same without checking!

The code lines are numbered so that the comments below can refer to them; apart from those line numbers, this is standard C.

01  int foreground[4];      /* file pixel: R, G, B, A */
02  int background[3];      /* file background color: R, G, B */
03  int fbpix[3];           /* frame buffer pixel */
04  int fg_maxpixval;       /* foreground max pixel */
05  int bg_maxpixval;       /* background max pixel */
06  int fb_maxpixval;       /* frame buffer max pixel */
07  int i, ialpha;
08  float alpha, compalpha;
09  float gamfg, linfg, gambg, linbg, comppix, gcvideo;

    /* Get max pixel value in files and frame buffer */
10  fg_maxpixval = (1 << fg_bit_depth) - 1;
11  bg_maxpixval = (1 << bg_bit_depth) - 1;
12  fb_maxpixval = (1 << frame_buffer_bit_depth) - 1;
    /*
     * Get integer version of alpha.
     * Check for opaque and transparent special cases;
     * no compositing needed if so.
     *
     * We show the whole gamma decode/correct process in
     * floating point, but it would more likely be done
     * with lookup tables.
     */
13  ialpha = foreground[3];
14  if (ialpha == 0) {
        /*
         * Foreground image is transparent here.
         * If the background image is already in the frame
         * buffer, there is nothing to do.
         */
15      ;
16  } else if (ialpha == fg_maxpixval) {
17      for (i = 0; i < 3; i++) {
18          gamfg = (float) foreground[i] / fg_maxpixval;
19          linfg = pow(gamfg, 1.0/fg_gamma);
20          comppix = linfg;
21          gcvideo = pow(comppix, 1.0/display_gamma);
22          fbpix[i] = (int) (gcvideo * fb_maxpixval + 0.5);
23      }
24  } else {
        /*
         * Compositing is necessary.
         * Get floating-point alpha and its complement.
         * Note: alpha is always linear; gamma does not
         * affect it.
         */
25      alpha = (float) ialpha / fg_maxpixval;
26      compalpha = 1.0 - alpha;

27      for (i = 0; i < 3; i++) {
            /*
             * Convert foreground and background to floating point,
             * then linearize (undo gamma encoding).
             */
28          gamfg = (float) foreground[i] / fg_maxpixval;
29          linfg = pow(gamfg, 1.0/fg_gamma);
30          gambg = (float) background[i] / bg_maxpixval;
31          linbg = pow(gambg, 1.0/bg_gamma);
            /*
             * Composite.
             */
32          comppix = linfg * alpha + linbg * compalpha;
            /*
             * Gamma correct for display.
             * Convert to integer frame buffer pixel.
             */
33          gcvideo = pow(comppix, 1.0/display_gamma);
34          fbpix[i] = (int) (gcvideo * fb_maxpixval + 0.5);
35      }
36  }
Variations:
  1. If output is to another PNG image file instead of a frame buffer, lines 21, 22, 33, and 34 should be changed to something like:
            /*
             * Gamma encode for storage in output file.
             * Convert to integer pixel value.
             */
            gamout = pow(comppix, outfile_gamma);
            outpix[i] = (int) (gamout * out_maxpixval + 0.5);
    
    Also, it becomes necessary to process background pixels when alpha is zero, rather than just skipping pixels. Thus, line 15 must be replaced by copies of lines 18-22, but processing background instead of foreground pixel values.
  2. If the bit depth of the output file, foreground file, and background file are all the same, and the three gamma values also match, then the no-compositing code in lines 14-23 reduces to nothing more than copying pixel values from the input file to the output file if alpha is one, or copying pixel values from background to output file if alpha is zero. Since alpha is typically either zero or one for the vast majority of pixels in an image, this is a great savings. No gamma computations are needed for most pixels.
  3. When the bit depths and gamma values all match, it may appear attractive to skip the gamma decoding and display correction (lines 28-31, 33-34) and just perform line 32 using gamma-encoded sample values. Although this doesn't hurt image quality too badly, the time savings are small if alpha values of zero and one are special-cased as recommended here.
  4. If the original pixel values of the background image are no longer available, only processed frame buffer pixels left by display of the background image, then lines 30 and 31 must extract intensity from the frame buffer pixel values using code like:
            /*
             * Decode frame buffer value back into linear space.
             */
            gcvideo = (float) fbpix[i] / fb_maxpixval;
            linbg = pow(gcvideo, display_gamma);
    
    However, some roundoff error can result, so it is better to have the original background pixels available if at all possible.
  5. Note that lines 18-22 are performing exactly the same gamma computation that is done when no alpha channel is present. So, if you handle the no-alpha case with a lookup table, you can use the same lookup table here. Lines 28-31 and 33-34 can also be done with lookup tables.
  6. Of course, everything here can be done in integer arithmetic. Just be careful to maintain sufficient precision all the way through.

Note: in floating point, no overflow or underflow checks are needed, because the input pixel values are guaranteed to be between 0 and 1, and compositing always yields a result that is in between the input values (inclusive). With integer arithmetic, some roundoff-error analysis might be needed to guarantee no overflow or underflow.
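As a minimal sketch of the integer-arithmetic approach mentioned in variation 6, here is one way to composite a single 8-bit sample with correct rounding. It assumes the samples are already linear (gamma-decoded); the function name is illustrative, and the added constant 127 implements round-to-nearest for the division by 255.

```c
/* Composite one 8-bit sample in linear space using integer
 * arithmetic: output = alpha*fg + (1-alpha)*bg, with alpha scaled
 * to 0..255 and the result rounded to the nearest integer.
 * Samples must already be linear (gamma-decoded); see the text. */
int composite8(int fg, int bg, int alpha)
{
    /* fg, bg, alpha all in 0..255 */
    return (fg * alpha + bg * (255 - alpha) + 127) / 255;
}
```

The worst-case intermediate value is 255*255 + 127, well within a 32-bit int, so no overflow analysis beyond that is needed here.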

When displaying a PNG image with full alpha channel, it is important to be able to composite the image against some background, even if it's only black. Ignoring the alpha channel will cause PNG images that have been converted from an associated-alpha representation to look wrong. (Of course, if the alpha channel is a separate transparency mask, then ignoring alpha is a useful option: it allows the hidden parts of the image to be recovered.)

When dealing with PNG images that have a tRNS chunk, it is reasonable to assume that the transparency information is a mask rather than associated-alpha coverage data. In this case, it is an acceptable shortcut to interpret all nonzero alpha values as fully opaque (no background). This approach is simple to implement: transparent pixels are replaced by the background color, others are unchanged. A viewer with no particular background color preference may even choose to ignore the tRNS chunk; but if a bKGD chunk is provided, it is better to use the specified background color.
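This shortcut might be sketched as follows for a grayscale image with a tRNS chunk (the function and parameter names are illustrative):

```c
/* Apply the tRNS shortcut to one row of grayscale pixels: pixels
 * equal to the chunk's transparent value are replaced by the
 * background color; all other pixels are treated as fully opaque
 * and left unchanged. */
void apply_trns_gray_row(int *row, int width,
                         int trns_value, int background)
{
    int i;
    for (i = 0; i < width; i++)
        if (row[i] == trns_value)
            row[i] = background;
}
```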

Progressive display

When receiving images over slow transmission links, decoders can improve perceived performance by displaying interlaced images progressively. This means that as each pass is received, an approximation to the complete image is displayed based on the data received so far. One simple yet pleasing effect can be obtained by expanding each received pixel to fill a rectangle covering the yet-to-be-transmitted pixel positions below and to the right of the received pixel. This process can be described by the following pseudocode:
Starting_Row [1..7] =  { 0, 0, 4, 0, 2, 0, 1 }
Starting_Col [1..7] =  { 0, 4, 0, 2, 0, 1, 0 }
Row_Increment [1..7] = { 8, 8, 8, 4, 4, 2, 2 }
Col_Increment [1..7] = { 8, 8, 4, 4, 2, 2, 1 }
Block_Height [1..7] =  { 8, 8, 4, 4, 2, 2, 1 }
Block_Width [1..7] =   { 8, 4, 4, 2, 2, 1, 1 }

pass := 1
while pass <= 7
begin
    row := Starting_Row[pass]

    while row < height
    begin
        col := Starting_Col[pass]

        while col < width
        begin
            visit (row, col,
                   min (Block_Height[pass], height - row),
                   min (Block_Width[pass], width - col))
            col := col + Col_Increment[pass]
        end
        row := row + Row_Increment[pass]
    end

    pass := pass + 1
end
Here, the function "visit(row,column,height,width)" obtains the next transmitted pixel and paints a rectangle of the specified height and width, whose upper-left corner is at the specified row and column, using the color indicated by the pixel. Note that row and column are measured from 0,0 at the upper left corner.
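The pseudocode above can be transcribed directly into C. In this sketch the visit operation is replaced by painting the pass number into a row-major byte buffer, which makes the progressive expansion easy to inspect; a real viewer would paint the received pixel's color instead. The buffer-based interface is an illustration, not a requirement.

```c
/* Pass parameters from the pseudocode, zero-based pass index. */
static const int starting_row[7]  = { 0, 0, 4, 0, 2, 0, 1 };
static const int starting_col[7]  = { 0, 4, 0, 2, 0, 1, 0 };
static const int row_increment[7] = { 8, 8, 8, 4, 4, 2, 2 };
static const int col_increment[7] = { 8, 8, 4, 4, 2, 2, 1 };
static const int block_height[7]  = { 8, 8, 4, 4, 2, 2, 1 };
static const int block_width[7]   = { 8, 4, 4, 2, 2, 1, 1 };

/* Paint pass number 1..7 into each visited rectangle (clipped to
 * the image bounds) of a height*width row-major buffer. */
void interlace_paint(unsigned char *buf, long height, long width)
{
    int pass;
    long row, col, r, c, h, w;
    for (pass = 0; pass < 7; pass++) {
        for (row = starting_row[pass]; row < height;
             row += row_increment[pass]) {
            for (col = starting_col[pass]; col < width;
                 col += col_increment[pass]) {
                h = block_height[pass];
                if (h > height - row) h = height - row;
                w = block_width[pass];
                if (w > width - col) w = width - col;
                for (r = 0; r < h; r++)
                    for (c = 0; c < w; c++)
                        buf[(row + r) * width + col + c] =
                            (unsigned char) (pass + 1);
            }
        }
    }
}
```

After all seven passes, every buffer position has been painted at least once, and each position's final value is the pass in which its own pixel was transmitted.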

If the decoder is merging the received image with a background image, it may be more convenient just to paint the received pixel positions; that is, the "visit()" function sets only the pixel at the specified row and column, not the whole rectangle. This produces a "fade-in" effect as the new image gradually replaces the old. An advantage of this approach is that proper alpha or transparency processing can be done as each pixel is replaced. Painting a rectangle as described above will overwrite background-image pixels that may be needed later, if the pixels eventually received for those positions turn out to be wholly or partially transparent. Of course, this is only a problem if the background image is not stored anywhere offscreen.

Palette histogram usage

If the viewer is only short a few colors, it is usually adequate to drop the least-used colors from the palette. To reduce the number of colors substantially, it's best to choose entirely new representative colors, rather than trying to use a subset of the existing palette. This amounts to performing a new color quantization step; however, the existing palette and histogram can be used as the input data, thus avoiding a scan of the image data.
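Dropping the least-used colors amounts to a simple selection over the histogram counts. A sketch (the function name and keep-flag interface are illustrative):

```c
/* Mark the keep[] flag for the num_keep palette entries with the
 * highest histogram counts; the unmarked entries are the
 * least-used colors to drop. Simple repeated selection, adequate
 * for palettes of at most 256 entries. */
void keep_most_used(const unsigned long hist[], int num_entries,
                    int num_keep, int keep[])
{
    int i, n, best;
    for (i = 0; i < num_entries; i++)
        keep[i] = 0;
    for (n = 0; n < num_keep; n++) {
        best = -1;
        for (i = 0; i < num_entries; i++)
            if (!keep[i] && (best < 0 || hist[i] > hist[best]))
                best = i;
        if (best < 0)
            break;
        keep[best] = 1;
    }
}
```

Pixels that used a dropped entry would then be remapped to the nearest remaining color.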

If no histogram chunk is provided, a decoder can of course develop its own, at the cost of an extra pass over the image data.

Text chunk processing

If practical, decoders should have a way to display to the user all tEXt and zTXt chunks found in the file. Even if the decoder does not recognize a particular text keyword, the user may well be able to understand it.

Decoders should be prepared to display text chunks which contain any number of printing characters between newline characters, even though encoders are encouraged to avoid creating lines in excess of 79 characters.
