Tuning the Imaging Pipeline

This section briefly lists some ways in which you can improve pixel processing. Example 13-1 provides a code fragment that shows how to set the OpenGL state so that subsequent calls to glDrawPixels() or glCopyPixels() will be fast.

To improve performance in the imaging pipeline, follow these guidelines:

Byte-sized components, particularly unsigned byte components, are fast. Use pixel formats where each of the components (red, green, blue, alpha, luminance, or intensity) is 8 bits long.
Use fewer components; for example, use GL_LUMINANCE_ALPHA or GL_LUMINANCE.
Use the color matrix and color mask to store four luminance values in the RGBA framebuffer, or to work with one component at a time. Convolution is much more efficient when only one component is being processed.
The following code fragment uses the green component as the data source and writes the result of the operation into some (possibly all) of the other components:

/* Matrix is in column major order */
GLfloat smearGreenMat[16] = {
    0, 0, 0, 0,
    1, 1, 1, 1,
    0, 0, 0, 0,
    0, 0, 0, 0,
};
/* The variables updateR, updateG, updateB, and updateA indicate
 * whether the corresponding component is updated.
 */
GLboolean updateR, updateG, updateB, updateA;

...

/* Check for availability of the color matrix extension */

/* Set proper color matrix and mask */
glMatrixMode(GL_COLOR);
glLoadMatrixf(smearGreenMat);
glColorMask(updateR, updateG, updateB, updateA);

/* Perform the imaging operation */
glEnable(GL_SEPARABLE_2D_EXT);
glCopyTexSubImage2DEXT(...);
/* Restore an identity color matrix. Not needed when the same
 * smear operation is to be used over and over.
 */
glLoadIdentity();

/* Restore previous matrix mode (assuming it is modelview) */
glMatrixMode(GL_MODELVIEW);
...
Load the identity matrix into the color matrix to turn it off.
When using the color matrix to broadcast one component into all others, avoid manipulating the color matrix with transformation calls such as glRotate(). Instead, load the matrix explicitly using glLoadMatrix().
Know where the bottleneck is.
As with polygon drawing, pixel drawing can be limited by host bandwidth, processing, or rasterizing. When all modes are off, the path is most likely limited by host bandwidth, and a wise choice of host pixel format and type pays off tremendously. This is also why byte components are sometimes faster. For example, use packed pixel format GL_RGB5_A1_EXT to load a texture with a GL_RGB5_A1_EXT internal format.

When either many processing modes or several expensive modes such as convolution are on, the processing stage is the bottleneck. Such cases benefit from one-component processing, which is much faster than multicomponent processing.

Zooming up pixels may create a raster bottleneck.
For simple loading, turn all pixel modes off. Turn off depth testing and texturing when possible.
A big pixel rectangle has a higher throughput (that is, pixels per second) than a small rectangle. The imaging pipeline is tuned to accept a relatively large setup time in exchange for a high pixel transfer rate, so a large rectangle amortizes the setup cost over many pixels, resulting in higher throughput.
Having no mode changes between pixel operations results in higher throughput. New high-end hardware detects pixel mode changes between pixel operations: when there is no mode change between pixel operations, the setup cost is drastically reduced. This optimizes image tiling, where an image is painted on the screen by drawing many small tiles.