Multimedia Technical Note

Graphics Design and Optimization

Matt Saettler
Multimedia Technical Support
One Microsoft Way
Redmond, WA 98052-6399 USA

03/92
Revision 1.00

Information in this document is subject to change without notice and does not represent a commitment on the part of Microsoft Corporation. The software described in this document is furnished under license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement. It is against the law to copy the software on any medium except as specifically allowed in the license or nondisclosure agreement. No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying and recording, for any purpose without the express written permission of Microsoft Corporation.

This technical note is for informational purposes only. Microsoft makes no warranties, expressed or implied, in this technical note.

Microsoft, MS, MS-DOS, XENIX and the Microsoft logo are registered trademarks and Windows is a trademark of Microsoft Corporation. Other trade names mentioned herein are trademarks of their respective manufacturers.

Copyright 1992, Microsoft Corporation. All Rights Reserved.

Table of Contents

Overview
Uses for Graphics
Life Cycle of a Graphic
Pictures
Vector Graphics
User Interface Graphics
Cast Graphics
Tile Graphics
Video
Simulation Graphics
Special Effects
Graphics Creation, Archival and Conversion
Graphics Loading and Storage
Graphics Display
Glossary

Overview

This technical note presents a very important but complicated multimedia issue: graphics. GUIs are only just starting to make use of graphics, while multimedia titles use all the capabilities of a graphics system. This has created a new set of problems as application designers and developers try to deal with placing graphics into their applications.
This technical note discusses the design and optimization issues under the Windows environment. Chapter one describes the uses for graphics. This allows the definition of different graphic types based on their intended usage. Chapter two examines the life cycle of a graphic and presents some special issues that only occur at certain phases of this cycle.

Several chapters are devoted to the different types of graphics. Each of these chapters describes the graphic type in detail and discusses optimization techniques that are specific to that type. These are followed by chapters detailing the different phases of the life cycle and the design and optimization issues that are common to the different types.

Finally, a glossary provides definitions of the many terms used in this technical note and common to multimedia with Windows design and development.

Obtaining the Current Version

Graphics design and optimization are very complicated subjects, and this document will be expanded in the future to cover these topics in more detail. If you have specific issues that you would like to see covered, or if you would like sample code to cover any of the topics discussed in this document, please let us know by contacting Multimedia Technical Support at the address listed on the cover.

The current version of this technical note can be obtained from the Microsoft Multimedia Systems BBS (MM BBS) at (206) 936-4082 in the files library in the technotes section in the graphx.zip file. BBS modem settings are 9600 baud, no parity, 8 data bits, 1 stop bit. This document is also available from the WINSDK forum on CompuServe.

Obtaining the Sample Code

Sample code discussed in this technical note can be obtained from the Microsoft Multimedia Systems BBS (MM BBS) at (206) 936-4082 in the files library in the samples section. BBS modem settings are 9600 baud, no parity, 8 data bits, 1 stop bit.
The samples are also available from the WINSDK forum on CompuServe, but newer versions will be available first on the MM BBS because of CompuServe file verification delays.

Intended Audience

This document should be read by Multimedia Producers (as defined in the Multimedia Authoring Guide) as well as programmers using all types of tools. You should read this document after reading the Multimedia Authoring Guide book of the MDK documentation. This book is also available directly from Microsoft Press and other bookstores. Microsoft Press can be reached at 800-MSPRESS.

Printed Images

This document contains many printed images. The printer dithers the color images to monochrome when it prints, so a printed image may not show solid colors. Because the issue of solid color is important in this document, please be aware of this situation.

Questions?

If you have questions, requests, or problems with this technical note, send them by one of the following methods:

1. Via email on the MM BBS, addressed to the sysop.
2. Via FAX or mail to:

   Matt Saettler
   Multimedia Technical Support
   Microsoft Multimedia Group
   One Microsoft Way
   Redmond, WA 98052-6399
   FAX: (206) 93MS-FAX

We are very interested in your feedback on the information contained in this technical note.

Uses for Graphics

This section discusses the different uses for graphics objects and, from those uses, the different types or categories of graphics. These areas are discussed in greater detail later in this technical note.

Pictures

This is the simplest form of graphics. Pictures can be thought of as single-frame video. In large quantities, pictures can take up a lot of disc and memory space, so compression might be important. Compression might also be needed for optimal file I/O. This is discussed later in this technical note in the section entitled Graphics Storage and Loading Optimization.

Vector Graphics

Vector graphics are a special type of picture.
A vector image consists of instructions about how to create the image rather than the actual pixels that represent the image. To display a vector graphic, an application (or system) must interpret the instructions to create the image. Vector graphics are generally limited to computer images.

User Interface Graphics

User interface graphics are graphics that are used to interact with the user. Windows provides a standard graphical interface, and many applications use only the standard interface items. However, many more options are available to you.

Buttons can have pictures instead of just text; examples are the Toolbar in Microsoft products and the "Exit" button in the Windows setup program. These buttons greatly increase the usability of an application.

Pictures can also be used as part of the interface. For example, a picture of a house floor plan might be used for control of security, lights, etc. Another example is the use of SHED graphics in Microsoft Viewer, which is included with the Multimedia Development Kit (MDK).

Another option is to use simple animations. For example, you can create animated icons that show a particular operation such as double spacing, drawing, etc.

Simulation Graphics

Simulation graphics are graphics calculated as they are displayed. This is the most CPU-intensive of all the uses for graphics. The raw blit speed of the display is very important because the whole display is usually updated each frame. On all current video cards, writing to regular (CPU) memory is much faster than writing to video memory. Some MS-DOS applications optimize by double-buffering the display. However, this technique is not available under Windows because most of the video modes do not support this operation.

Frame rate is usually the metric used to determine quality, but image quality is also important (quality usually means complexity, but not always).
Microsoft Flight Simulator is an example of an application that uses primarily simulation graphics.

Cast Graphics

Cast graphics are images that are constructed from many individual images. Cast graphics are also called sprite graphics. This is the most commonly used form of graphics in animations and is also very useful for still images. The Living Books series from Brøderbund uses primarily cast graphics.

Tile Graphics

Tile graphics are pictures drawn on a square grid. This allows easy emulation of board games and other game systems. SimCity from Maxis is an example of an application that uses primarily tile graphics. A tile graphic can be thought of as a sprite that is constrained to a grid.

Video

Video is inherently an animated form of graphics. The image is, by definition, precalculated. The playback is usually non-interactive, meaning that the user does not interact with the video while it is being shown. Video graphics take up a lot of room and are almost always distributed on a CD-ROM. To be effective when played from CD-ROM, video needs compression optimization.

Animator Player for Windows from Autodesk is a Windows tool that allows display of video graphics. Audio Video Interleaved (AVI) from Microsoft and Digital Video Interactive (DVI) from Intel also provide video graphics. The new Digital Video MCI Command Set provides MCI commands for this type of graphical data.

Combinations

In a good multimedia application, the different types of media, including the different types of graphics, are combined to produce a high-quality presentation to the user. In fact, it is hard to find an application that uses only one of the graphic types described in this technical note.

There are many ways that these types of graphics can be combined. For example, a background video can be used for displaying moving sprites, or an application might show a TV-like video as part of its user interface.
Life Cycle of a Graphic

This section discusses the different uses of graphics in an application and defines some common terms used throughout this technical note.

You should read the Multimedia Authoring Guide book of the MDK documentation to gain an understanding of a Multimedia Project. This book is also available directly from Microsoft Press and other bookstores. Microsoft Press can be reached at 800-MSPRESS.

Creation

Most computer graphics are created by artists. Drawn images are usually scanned using scanner hardware. The image is then saved (archived) in a file. Video is digitized from tape or directly from a camera using a capture board, and then it is also saved in a file.

Some programs create the graphics they use just before they use them. These are usually programs that use simulation graphics, commonly called simulation programs. A common example is a flight simulator. Note that even in a simulation program some of the graphics, for instance the cockpit of a plane, are actually precalculated pictures that are created for distribution with the application.

How an image is created digitally in the computer greatly affects its quality. For example, an image scanned at 8 bits/pixel looks very poor compared to an image scanned at 24 bits/pixel and then reduced.

Application Conversion

Before an image can be used in a specific application, it must be converted and modified to fit the needs of the application. For instance, most applications do not require 24-bit graphic data. Some applications might require the image to be converted to 8-bit color depth.

Distribution

At some point, the image is hooked into an application. This might mean creating hypermedia links using an authoring tool, or encoding the name of the image file into the source code. Images can also be compressed or combined into single, large files to provide easier access and to prevent the user from illegally copying individual images.
For multimedia, the medium used for distribution is generally a CD-ROM because of the size of the data. If floppy disks are used, then installation on the user's hard drive is usually required. This is inconvenient for the user because time must be spent installing the data, and the user's system must have enough space available.

Loading and Storage

Once an application determines that it needs a particular image, it must first load it. This step can also involve decompression or other modifications to the image for the display system that the application is using. An application might also delay decompression or conversion until it actually needs to display the image.

Display

Now the application draws the image. How the image is stored in memory (and on disc) can greatly affect the time the system takes to display the image. Images in a native format (for example, DIB for Windows) display much faster than images in a non-native format, because the application must convert a non-native image before displaying it.

There are two measures of time for the user:

- How long it takes the system to draw an image
- How long the user thinks it actually took to draw an image

An application can be slower when measured by a watch, but appear to be faster and more responsive to the user because of the way the application reacts to the user. For example, if your screen displays text and graphics, you should display the text as quickly as possible and then work on displaying the graphic. This gives the user time to start reading the text, drawing attention away from the delay caused by loading and displaying the graphic.

Archival Storage

In this stage of an image's life it is simply saved for future use. The same image can be used several times in different projects. For example, it might be used as a background in one application, and a picture snapshot in another.
It is important to have a good system to keep track of available images for use in future projects because it is cheaper to re-use an image than it is to create a new one.

Pictures

The simplest form of graphic is a picture. Pictures can be thought of as videos that don't move; they are just single frames of a video.

Images can be divided into two types: natural images and computer images.

The following is an example of a natural image. There are many color changes in the image. This is a fairly complex picture because there are very few areas without color changes.

Below are examples of computer images. There are large areas of solid colors (even though they print dithered). There are also very few color changes in these images. These are at the low end of computer images (a computer image can be more complex than these examples).

How Picture Graphics Work

Pictures, like all items in a GUI, are displayed in response to a user action. The image is read from media into memory, optionally decompressed, and then copied to the screen. The image can be kept in memory to optimize response time for display. Special effects can also be used when copying the image to the screen; see the section entitled Special Effects, later in this technical note.

Pictures are sometimes called raster graphics because they consist of a sequence of pixels meant to be displayed only on a raster display.

Advantages

Compared to other graphic types, pictures are much smaller because there is no animation information to store.

Usually, images are displayed as part of a UI. These same images are rarely displayed as the main graphical item, except in presentations and applications where the graphical image is the content (a collection of famous paintings, for example).

Single pictures are much smaller in storage and memory size than a video or other animation of the same display size. A single 640x480x256 image takes a little over 300K.
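
The size of a single picture follows from simple arithmetic. The following is a minimal sketch, assuming an uncompressed image with no header, palette, or row padding; the helper name image_bytes is hypothetical and is not part of any Windows API:

```python
def image_bytes(width, height, colors):
    """Uncompressed size in bytes of an image with the given number of colors.

    Assumes packed pixels with no header, palette, or row padding.
    """
    bits_per_pixel = (colors - 1).bit_length()  # 256 colors -> 8 bits
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes


size = image_bytes(640, 480, 256)
print(size, "bytes =", size // 1024, "K")
```

A real file is slightly larger than this figure because it also carries a header and a color table.
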
A 15-second, 15fps video animation of the same size would take almost 70MB uncompressed (225 frames at about 300K each)! This means that you can provide many more still images on a CD.

Pictures are usually displayed either as the basis of a user interface or as the result of a user interface action.

Optimizations

Pictures are a relatively simple type of graphic. Most of the optimization that applies to other graphic types applies to pictures also. It is important to know the different types of images, natural and computer, and how they will affect optimization.

Vector Graphics

Vector graphics can be thought of as a specialized compression technique for pictures. Instead of storing the actual pixels of the image, a vector graphic stores instructions about how to draw the image. For simple images, this format is much smaller than the result of any other compression method. However, if the image is complex, the vector form can actually be much larger. For this reason, vector graphics only work well for computer images. In addition, vector graphics are much slower to draw because all the instructions must be followed before the image can be displayed.

The following is an example of a vector graphic, though a moderately complex one. This graphic can easily be scaled with little loss in image quality. Note that the background is a solid color and the words are part of the image. In a real application, you would want to leave the words out of the image so that you can make your application available in different languages.

How Vector Graphics Work

The image format encodes instructions about how to draw the image. For example, a white rectangle could have the following instructions:

   Set Brush to White
   Draw Filled Rectangle from 120,100 to 150,140

The application must read and interpret the instructions to create the image to be displayed. For complex images (with many instructions), this can be a very long process.
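
The interpretation step can be illustrated with a small sketch. It is not the WMF format or any real playback routine; the instruction names set_brush and fill_rect, the function play_vector, and the choice to treat the right and bottom coordinates as exclusive are all assumptions made for this example:

```python
def play_vector(instructions, width, height, background=0):
    """Interpret a tiny, made-up instruction list into a grid of pixels."""
    canvas = [[background] * width for _ in range(height)]
    brush = background
    for op, *args in instructions:
        if op == "set_brush":
            (brush,) = args
        elif op == "fill_rect":
            left, top, right, bottom = args
            # Fill from (left, top) up to but not including (right, bottom),
            # clipped to the canvas.
            for y in range(max(top, 0), min(bottom, height)):
                for x in range(max(left, 0), min(right, width)):
                    canvas[y][x] = brush
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return canvas


WHITE = 255
white_rectangle = [
    ("set_brush", WHITE),
    ("fill_rect", 120, 100, 150, 140),
]
image = play_vector(white_rectangle, 320, 200)
```

Here two short instructions stand in for 1,200 stored pixels, which is where the size advantage comes from. The nested loops also show the cost: every instruction has to be executed before the image is complete.
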
Windows defines a format for vector graphics called Windows Meta File Format (WMF). This file format is defined in the Windows SDK documentation in the File Formats chapter of the Programmer's Reference. The instructions in a WMF are GDI instructions, so creation and playback of the instructions is very easy under Windows.

PostScript is another example of a vector graphic format. Encapsulated PostScript (EPS) is a file format for storing PostScript commands. A number of tools work with this format.

Advantages

The main advantage of vector graphics is their size. A simple image can be stored in a very small size. In addition, vector images can easily be scaled without losing clarity because the instructions are easily modified for the new size.

Optimizations

Vector graphics are a relatively simple graphic type. They become very inefficient, however, when the image becomes complex. If you keep the images simple, the speed should remain acceptable.

Vector graphics are applicable only to computer images, not to natural images, because natural images are too complex. Creation of the images can also be an issue because vector graphics must be created by a computer artist using such tools as CorelDraw from Corel or AutoCAD from Autodesk. This means that a lot of time (and expense) is needed to create these images; they cannot just be digitized like a natural image.

User Interface Graphics

Every Windows application has a user interface. The user interface is the main way the user interacts with the application. Presenting a pleasing and easy-to-use interface should be the goal of every application.

Multimedia allows the enhancement of the standard user interface in the graphics area. A user interface can also be greatly improved by combining it with sound so that, for example, a sound is made when the user selects an option. This gives the user additional feedback about the action.
This section discusses how graphics can be used to enhance an application's user interface.

How it Works

A picture is displayed that contains defined areas, or hot spots, for the user to choose. When the user selects one of these areas, the application detects the selection and performs the desired operation.

Buttons and Hot Spots

When a user presses a mouse button over a button or hot spot in Windows, the button or hot spot appears depressed and the following rules apply:

- If the cursor is moved off of the button or hot spot, the image is "un-depressed".
- If the user releases the mouse button (WM_xBUTTONUP) while the button or hot spot is depressed, the related action takes place.
- If the user releases the mouse button (WM_xBUTTONUP) while the button or hot spot is un-depressed, no action takes place.

These rules apply even if the user moves the mouse out of the current window. Therefore, you should use SetCapture when the user clicks the button. To determine whether the mouse pointer is over the button or hot spot, use PtInRect.

The following is an image from the sample Viewer application in the MDK. It shows a picture of the United States and has some defined buttons (hot spots) for certain states. Note that the buttons are identified not only by position but also by the state abbreviation, for easy identification if the state location is not known.

A button does not need to be rectangular. For non-rectangular buttons or hot spots, you should use multiple rectangles to represent the area of coverage. For complex shapes, several rectangles may be needed to closely represent the shape of the area. You will need to trade off the accuracy of the shape against the number of rectangles required. The fewer the rectangles, the faster the response to the user's selection.

User Interface Creation Issues

Buttons should be created from one image. The program should calculate the depressed and un-depressed bitmaps as they are needed.
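
The press, drag, and release rules for buttons and hot spots described earlier in this section can be sketched as a small state machine. This is plain Python rather than Windows code: pt_in_rect and HotSpot are hypothetical names, pt_in_rect only stands in for a point-in-rectangle test such as PtInRect, and mouse capture is indicated by a comment instead of a real SetCapture call. The hot spot is approximated by a list of rectangles, as suggested for non-rectangular shapes.

```python
def pt_in_rect(rect, x, y):
    """True if (x, y) is inside rect = (left, top, right, bottom).

    In this sketch the right and bottom edges count as outside.
    """
    left, top, right, bottom = rect
    return left <= x < right and top <= y < bottom


class HotSpot:
    """A clickable area approximated by one or more rectangles."""

    def __init__(self, rects, action):
        self.rects = rects
        self.action = action
        self.tracking = False   # True between button-down and button-up
        self.depressed = False  # True while the hot spot is drawn pushed in

    def hit(self, x, y):
        return any(pt_in_rect(r, x, y) for r in self.rects)

    def button_down(self, x, y):
        if self.hit(x, y):
            # A Windows application would capture the mouse at this point.
            self.tracking = True
            self.depressed = True

    def mouse_move(self, x, y):
        if self.tracking:
            self.depressed = self.hit(x, y)

    def button_up(self, x, y):
        if not self.tracking:
            return
        self.mouse_move(x, y)  # make sure the state matches the release point
        fire = self.depressed
        self.tracking = False
        self.depressed = False
        if fire:
            self.action()


clicks = []
# An L-shaped hot spot built from two rectangles.
spot = HotSpot([(0, 0, 10, 30), (10, 20, 30, 30)], lambda: clicks.append("go"))
spot.button_down(5, 5)
spot.mouse_move(50, 50)   # dragged off: un-depressed
spot.button_up(50, 50)    # released outside: no action
spot.button_down(15, 25)
spot.button_up(15, 25)    # released inside: action runs
```

Each hit test walks the whole rectangle list, which is why using fewer rectangles gives a faster response.
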
The program should also draw the 3D borders on the button.

When using a background image as the UI, you should keep the background image separate from the buttons that are placed on it. This allows greater flexibility in design. The non-optimal alternative to this method is to encode the button images directly onto the background image. However, this means that the buttons cannot be rearranged or modified without re-authoring the background image.

A background image can be very large (640x480x256 is 300K). You should keep images compressed (DIB RLE works very well for most images). See the section Graphics Loading and Storage Issues in this technical note.

References

A good source of UI information is the Common User Access (CUA) Guide. This guide is included with the Windows SDK and is also available from IBM. It provides UI details on common interface items and provides a good framework for designing interface items not specifically described. The new Windows 3.1 SDK will also include a new Style Guide that will contain more information than is available in the CUA guide.

General Guidelines

Consistency is an essential part of good user interface design. Be consistent in your layout, design, sizing, etc. You should also strive for consistency across applications (not just your own application) and with the rest of the system that the user sees.

You should also make buttons obvious to the user (unless hiding them is part of the game). Changing the cursor is one good way to show the user where a hot spot is located.

Because Windows provides device independence, you should be sure your UI code and design can handle the different devices that the user might have. You should watch the following areas:

1. Color type: monochrome, color
2. Color depth (4-bit, 8-bit, 24-bit, etc.)
3. Screen resolution (320x200, 640x480, 800x600, 1024x768, 1280x1024)
4. Palette or no palette
5. Transparent blit options

Cast Graphics

Cast graphics are a set of images that can be placed onto a display in specified locations. The location of an image can vary over time (thus the image appears to move). A more complex system allows the graphic itself to change over time so that it is also animated. For instance, a ball can spin or a bird can flap its wings; combined with movement of the graphic, this creates an excellent animation.

A member of a cast can be thought of as a character. The word character describes what the graphic usually represents: a person, a plane, or another graphic object displayed on screen. Cast members are also called sprites. The technique of cast-based animation is also called sprite animation.

Many graphics programs use cast animation for creating animations. Action! from Macromind and Animation Works from Gold Disk are both good examples of tools that use cast-based graphics.

How it Works

Cast animation depends on an application's ability to combine sprites onto a single image (called compositing). Creating this composite image also depends on transparent blit operations. Usually the final image (the frame) is first created as an off-screen bitmap and then copied to the display. However, the composite image can also be created directly on screen.

Sprites are drawn in a particular order called Z-order. The "Z" refers to the third dimension (along with X and Y) and defines the position of the sprite front-to-back. For example, suppose a tree in an image is in front of the clouds, and a flying bird is to be drawn between the clouds and the tree. The clouds would be drawn first, then the bird, then the tree. The resulting image is then displayed on the screen and the process starts again for the next frame.

Sprites can consist of a sequence of images that define motion for the sprite. For example, consider an image of a dog walking, or a bird flying.
On each successive frame, the next image in the sequence is used for the sprite so that the sprite appears to be in motion. When the last image is reached, the first image is re-used, resulting in a looping animation.

The images usually have a transparent area to allow compositing onto a larger image. The transparent area can be defined by a particular color in the image, or by a monochrome mask. The mask uses extra memory, but can be useful. 'Losing' a color to serve as the transparency indicator is usually not a problem for graphics with a color depth of 256 colors.

Some video modes on some cards support a technique called page-flipping. This involves drawing to one video area while the other is being displayed and then, with a simple command to the video card, switching to the new video area so that it is displayed. This technique allows much cleaner animations because the final image is not drawn to the video area that is currently being displayed, which avoids blit tear (see glossary). Windows does not support page-flipping because most of the video cards and modes it supports do not allow this technique.

The following is a sequence of six frames that show an animation of a dog walking. Note that this is a perfect case for a looping animation because walking is a cyclical action. However, note that the dog in these pictures moves within the image; for a real sprite, the dog would stay in the same place in each image so that motion can be simulated by changing the placement of the graphic itself.

This sequence of images is from the RLEAPP.ZIP sample code. See the Overview section for information about obtaining this sample code.

The images are set up to use a color transparency; the background is a solid color (even though it is dithered when it prints) so that it can easily be made transparent when the image is displayed. This allows the image to be placed on top of another image with the image below showing through.
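
The pieces described so far (an off-screen frame, Z-order, a transparent color, and a looping image sequence) can be put together in a short sketch. It uses lists of color indexes in place of real bitmaps, and the names blit_transparent, compose_frame, and TRANSPARENT, as well as the sprite dictionary layout, are assumptions made for illustration rather than anything taken from the sample code:

```python
TRANSPARENT = 0  # color index reserved to mean "do not draw" (an assumption)


def blit_transparent(frame, image, left, top, key=TRANSPARENT):
    """Copy image onto frame at (left, top), skipping key-colored pixels."""
    for row, line in enumerate(image):
        for col, color in enumerate(line):
            y, x = top + row, left + col
            if color != key and 0 <= y < len(frame) and 0 <= x < len(frame[0]):
                frame[y][x] = color


def compose_frame(background, sprites, tick):
    """Build one off-screen frame from a background and a list of sprites.

    Each sprite is a dict with "z", "x", "y", and "images" (a looping
    sequence). A lower z is farther back, so it is drawn first.
    """
    frame = [row[:] for row in background]  # off-screen copy
    for sprite in sorted(sprites, key=lambda s: s["z"]):
        images = sprite["images"]
        image = images[tick % len(images)]  # wrap around to loop
        blit_transparent(frame, image, sprite["x"], sprite["y"])
    return frame


sky = [[1] * 4 for _ in range(3)]
bird = {"z": 1, "x": 0, "y": 0,
        "images": [[[0, 7], [7, 0]], [[7, 0], [0, 7]]]}
tree = {"z": 2, "x": 1, "y": 0, "images": [[[9], [9], [9]]]}
frame0 = compose_frame(sky, [tree, bird], tick=0)
frame1 = compose_frame(sky, [tree, bird], tick=1)
```

In frame0 the tree covers the part of the bird that it overlaps because the tree has the higher Z value, while the sky shows through wherever the bird image contains the transparent color. The finished frame would then be copied to the display in a single blit.
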
With a color transparency, the background shows through between the dog's legs, instead of the sprite appearing as a complete rectangle (as shown below) with a dog in it, surrounded by the background image.

Advantages

The main advantage of cast-based animation is the ability to be interactive. How and when a sprite moves can be in direct response to a user action (a mouse click, etc.). The Living Books series from Brøderbund is an excellent example of this technique in an application. Contrast this flexibility with what is needed for a frame-based animation, discussed in the video chapter later in this technical note.

This technique allows moderately complex animations with limited disc space. A long animation can be created just by moving a sprite back and forth across the screen, allowing greater realism without major CPU usage. Video animations can give more complex images, but are not as suitable for interactivity.

Types of Sprites

The main requirement for cast-based animation is that the image be transparent so that it can be composited on the frame. Usually, bitmaps are used for the images, although fonts can also be used.

In Windows, fonts have a defined transparent region because they are monochrome images. Only the part of the image that is black is drawn, and the part that is white is considered to be transparent. This results in very fast display, but fonts are limited to one color (which doesn't have to appear as black). The image of the sprite can be changed by choosing which character to use from the specified font. Font data must be included with the application for the animation to work correctly. This is a very complicated method that is only useful when memory is a major consideration.

Flat Worlds

For a flat world, the view is usually top-down, or a bird's eye view. Examples of this type of application include most arcade games, strategy games, and presentation-oriented applications.
Most arcade games are sprite based, with the game elements (from bullets to the enemy) being sprites. However, arcade games also have dedicated hardware that allows sprites to be composited quickly, directly on the display.

In a flat world, sprites move in a plane that is parallel to the screen; that is, left, right, up and down. The sprites don't move in the Z direction.

Moving the background image allows for the simulation of movement in a larger area than the screen allows. In addition, some games have multiple background images to further enhance the illusion of movement. For example, a background image of hills and a foreground image of trees, both scrolling by (but at different speeds), give a much better illusion of motion. Many cartoons on TV use this technique in its analog form.

How the image moves varies between applications. In some games, the background image only moves horizontally across the screen. The player's sprite usually moves from left to right (actually, the background image moves from right to left) and encounters other creatures and objects as it travels.

In some games, the background travels vertically across the display. The player's sprite is usually at the bottom of the display and the enemy usually comes into range at the top of the display.

In more complex applications, the player's sprite is allowed free movement while the background moves to allow the player's sprite to recenter itself in the display.

In very complex applications, the background image can actually be a video (a sequence of precalculated images) with the interactive sprites drawn on top of a very complicated background.

3D Worlds

3D worlds differ from flat worlds mostly in the perspective in which the images are displayed. The images are displayed with a "near" and a "far" perspective, with "near" at the bottom of the display and "far" toward the horizon, which is usually in about the middle or top third of the display.
In 3D worlds, sprites can move in a plane that is not parallel to the screen. The character can also move up and down in the plane. These sprites can move closer to or farther from the user and generally change size as they move.

This type of application takes advantage of multiple image planes. For example, trees in the foreground might be in front of (in Z-order) the player's character. In addition, motion or position information is often used to determine where the player's sprite is allowed to move.

Sierra Online's games are a very good example of this type of application.

Optimizations

Creation of the image is the most time-consuming operation. There are two areas that take time: loading the sprite image into memory, and the actual blit itself.

You should keep all the sprite images in memory, if possible. It is more efficient to use a single bitmap with different versions of the sprite than it is to use a separate bitmap for each. The different versions should be arranged on a grid in the single bitmap so that it is easy to change and modify images when necessary.

For blitting, you should use transparent operations. If necessary, use a DIB bitmap and write your own optimized drawing routines to draw onto the bitmap. Drawing directly into DIBs with custom routines is demonstrated in the TRIQ.ZIP sample code. See the Overview section, earlier in this technical note, for information about obtaining this sample code.

In addition, compression can help reduce the cost of some operations (loading, storage, blitting, etc.). See the Graphics Loading and Storage Issues section, later in this technical note, for a discussion on compression.

Dirty area optimizations are much easier for cast-based animations than they are for any other form of animation. Because every movement is known to the application, it is easy for the application to redraw only the areas that contained movement.
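
A dirty area calculation can be sketched as follows. The function names are hypothetical, rectangles are (left, top, right, bottom) tuples, and the sketch assumes that a sprite that has not moved has not changed its image either. For each sprite that moved, both the area it left and the area it entered are marked for redrawing, and the two are merged when they overlap:

```python
def union_rect(a, b):
    """Smallest rectangle (left, top, right, bottom) covering both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def intersects(a, b):
    """True if rectangles a and b overlap."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def dirty_rects(moves):
    """Return the screen areas to redraw for a list of sprite moves.

    Each move is (old_rect, new_rect). Both the old and the new location
    must be repainted. When they overlap, one combined rectangle is used.
    """
    dirty = []
    for old, new in moves:
        if old == new:
            continue  # sprite did not move: nothing to redraw
        if intersects(old, new):
            dirty.append(union_rect(old, new))
        else:
            dirty.extend([old, new])
    return dirty


moves = [
    ((10, 10, 42, 42), (14, 10, 46, 42)),        # small step: areas overlap
    ((100, 50, 132, 82), (200, 50, 232, 82)),    # long jump: two areas
    ((300, 0, 332, 32), (300, 0, 332, 32)),      # stationary sprite
]
areas = dirty_rects(moves)
```

Only the rectangles in areas need to be recomposed and copied to the screen; the rest of the previous frame can be left as it is.
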
Remember that the old location of the sprite as well as the new area must be updated on the screen.

Tile Graphics

This type of application allows for a very complex display using very few graphics. SimCity from Maxis is a good example of tile graphics. Civilization from MicroProse also uses this technique. Many role-playing games also use this technique to represent cities, terrain, etc.

This type of graphic allows for easy representation of a series of virtual "places" or "tiles". For example, in SimCity, each tile on the screen represents a city block. Each tile can change based on the changes that take place in that virtual place. For example, when a road is built in a tile in Civilization, the image of the tile changes to include an image of a road.

Another powerful use of tiles is derived from board games where armies and other items are represented by squares of cardboard printed with images (called tiles, which is where the name comes from). This allows the representation of a movable object that is currently "in" the virtual place represented by the tile. For example, in Civilization, a chariot is represented by a picture of a chariot that is "in" (covering in Z-order) the terrain image for the tile. When the chariot is moved, the original tile is redrawn and the new tile is drawn with the chariot on top.

Tiles are generally represented as squares, but are sometimes organized in a hex (six-sided) shape. This shape allows uniform placement on the display and adds better distance representation (but it is more complex to display). Hexes are usually only used for strategy games. Another implementation uses diamond-shaped tiles with 3D perspective. This technique is used in A-Train from Maxis.

Images can also be animated using the same technique as sprites: multiple images. SimCity uses this technique to make factories belch smoke, cars move, etc.

How Tile Graphics Work

Usually the application works on a grid with each entry (row, column) represented by a tile.
Each tile has specific attributes that affect how it is to be displayed. For instance, the terrain of a tile could consist of water, plains, mountains, etc. In addition, the tile can have roads, railroads, cities, factories, and houses. Each attribute has an associated image so that it can be represented to the user. For example, in Civilization when a city is built on a tile, an image of a city is drawn. The images are drawn in a specific order (Z-order). For example, base terrain (hills, plain, etc.), then roads, then units that are "in" the tile.

One major trick you can use for optimization is combining attribute images into one big bitmap. This avoids creating many small bitmaps, which wastes memory. Instead, the images are arranged on a grid (since they are all of similar size) and can be easily retrieved from the source bitmap image.

Another trick that is used often is replaceable images. SimCity uses this technique to change the representation of the tiles from modern times to medieval to futuristic times. It is the same code for each display; only the images used have changed.

Tile images can be thought of as limited-motion (but still animated) sprites. While transparent capabilities are important for sprites so that they can have non-rectangular shapes, transparent images are also very important for tiles so that they can be stacked.

Advantages

The main advantage of tile-based animations is that they can easily represent a virtual world to the user. For example, blocks in SimCity appear as blocks on the screen, and the cities in Civilization are dependent on the squares around them for food and trade.

Flat Worlds

This is a top-down display of the tiles. Z-order between tiles is not a factor because they are all on the same "plane". However, Z-order within a tile is still important. Often the individual images are drawn in fake-3D to give the illusion of perspective. SimCity uses this technique in its images.
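
For a flat, top-down display, finding a tile's image in the combined source bitmap and finding its position on the screen are both simple grid arithmetic. The following is a minimal sketch in portable C, assuming square tiles of one fixed size packed row by row in the source bitmap. The names TILE_SIZE, ATLAS_COLS, tile_source_rect and tile_dest_rect are hypothetical and are not taken from any sample code.

```c
#include <assert.h>

/* Hypothetical layout: every attribute image is TILE_SIZE x TILE_SIZE
   pixels, and the big source bitmap holds ATLAS_COLS images per row. */
#define TILE_SIZE   32
#define ATLAS_COLS  8

typedef struct { int x, y, w, h; } Rect;

/* Return the rectangle inside the source bitmap that holds image
   number `index` (0 is the top-left image, counting row by row). */
Rect tile_source_rect(int index)
{
    Rect r;
    assert(index >= 0);
    r.x = (index % ATLAS_COLS) * TILE_SIZE;
    r.y = (index / ATLAS_COLS) * TILE_SIZE;
    r.w = TILE_SIZE;
    r.h = TILE_SIZE;
    return r;
}

/* Return where grid entry (row, col) lands on a flat, top-down display. */
Rect tile_dest_rect(int row, int col)
{
    Rect r;
    r.x = col * TILE_SIZE;
    r.y = row * TILE_SIZE;
    r.w = TILE_SIZE;
    r.h = TILE_SIZE;
    return r;
}
```

To draw one tile, an application would blit tile_source_rect() for each attribute image of that tile, in Z-order (terrain, then roads, then units), to the same tile_dest_rect(). Swapping in a different source bitmap with the same layout changes the look of the display without changing this code.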
Dirty area optimizations are very useful for this type of graphic.

The following is an example of a normal grid for a tile-based graphic. Images would be drawn into each square and then the entire image displayed on the screen.

3D Worlds

This is a view where the tiles themselves have a Z-order (usually front to back). Distant tiles are at the top, and the near tiles are at the bottom to maintain the illusion of perspective. Note that the images now have an actual appearance of height and can obscure tiles that are farther away (above). Because this technique allows large images to be "in" a tile, the compositing time can be very long. Note how long it takes Civilization to draw the city view. However, the resulting images can be very good.

The following is an example of one way to arrange the squares to provide a perspective image. The vanishing point is to the upper right.

For example, the city view in Civilization displays the "attributes" (the population, buildings, etc.) of the city on a tile basis that is drawn back-to-front. This technique allows the images in the near tiles to cover the images in far tiles because they are drawn later. For example, in Civilization in the city view, the city walls are drawn almost last, allowing them to cover the previous tile row of buildings in the city.

Optimizations

Since tiles are just a subset of sprites, all the optimizations discussed in that section also apply. If perspective is used, the height of the tile must be taken into account when it needs to be modified. In addition, the height of the neighboring tiles will need to be checked to make sure that they don't need to be redrawn to recreate the updated image.

Video

Video is defined in this technical note as the continuous playback of a precalculated animation. This is usually referred to as frame-based animation. This method is usually contrasted with cast-based animation.
Within video animations, there are two major types of graphics: natural images and computer images. Natural images are normally considered "video": images of real events stored in digital form. Computer graphics are defined as simple, cartoon-like images, but can also be used as video.

The main difference between these two types of graphics is the complexity of the image. Natural images are very complex in both number of colors and changes in colors. On the other hand, computer graphics usually use few colors and don't change color very often. Some computer-generated images are classified as natural images because of their use of color. For example, most images generated by 3D Studio from Autodesk are natural images even though they are generated solely by computer.

Video is usually used in conjunction with audio, either for a sound track or for the actual audio for the recorded images. For example, in SimCity a video animation of a newscaster could tell the user how the mayor is doing.

Videos differ from other graphics types, such as animated sprites, because they are so large that they cannot be loaded (and therefore played) from memory. They must be streamed from a hard disk or CD-ROM disc. Data that is streamed is read at a given rate, usually as fast as the media transfer rate will allow. In addition, sprite animations are usually a small number of frames that are repeated to provide continuous movement. Of course, that is why they are small in size.

The following factors affect the quality of a video animation:

Pixel depth
Compression method
Image (frame) size on screen
Image (frame) size on disc
Frame rate (frequency)

The following is an example of a sequence of natural images showing the movement of an elephant. Note that these could also be used as a sprite because the images are small and they loop easily.

How Video Graphics Work

Video animation is accomplished by displaying a sequence of images on the screen.
Each image is called a frame, and the number of frames per second (Hertz or Hz) is called the frame rate. A frame rate of 15Hz is considered the minimum to show motion. 8-bit color depth is considered the minimum to show natural images. Image size on the screen should be large enough for the user to understand the video (but this needs to be balanced with the transfer rate of the media and the frame rate and image size desired). An 8-bit image at 160x120 pixels (somewhat of a standard because it is exactly 1/4 of 640x480 in each dimension) is about 20K per image (160 x 120 = 19,200 bytes). At 15 frames/second, this is about 300K per second for just the images, not including audio.

Because video is, by definition, predetermined data, it is well suited for playing from a CD-ROM directly. The main limitation for playback is the 150K/second transfer rate of the CD-ROM (if it meets MPC standards). The main advantage is the cost of a CD-ROM disc compared to the equivalent number of floppy disks.

The MPC specification specifically limits the time the CD-ROM can take to transfer the 150K/second data to 40% of the CPU bandwidth. The rest of the bandwidth is then available for the application to display the images and play the waveform data. Most CD-ROMs that are not MPC compatible don't meet the 40% requirement. In fact, most take 100% of the CPU, leaving no time to actually do anything with the data!

The process is really very simple in theory; just take a sequence of images and blit them to the screen at the appropriate time. In practice, however, many optimizations are needed to make videos actually work. Microsoft's Audio Video Interleaved (AVI) system provides tools and playback for software-only videos. This system will be in Beta by early Q2, 1992.

Providing the user with limited interaction with a video can create very good effects. For example, the application Guest from Trilobyte takes the user through a haunted house.
The user interacts with the picture and the result is a video showing movement within the house with images generated by 3D Studio from Autodesk. The user clicks on an area in the image (such as a door, staircase, etc.) and a video is shown that visually presents to the user the results of the action. This is a good example of combining multiple types of graphics (video, pictures and UI) into a multimedia application.

Advantages

The main advantage of video animations is that they can be much more complex because the images are calculated when the file is created. For example, you can use 3D Studio from Autodesk to create a very complex 3D animation that takes hours to calculate, but it can be played back very quickly. Contrast this capability to what is possible using simulations or cast-based animations.

However, video does use up a lot of disc space. Delta-frame compression can help you reduce this, but the total space can still be large. At 150K/second of data, a 30-second video is over 4MB of data. This means that CD-ROM is vital for the economical distribution of digital video data.

Video playback from CD-ROM is limited by the following:

1. 150K/second from CD
2. Time for decompression algorithm
3. Blit speed of video
4. Time for interface operation (mouse movement, etc.)
5. Time for other operations (waveform, MIDI, etc.)

Optimizations

Optimizations for video can be broken into two areas: compression and interleaving.

Compression

Because of the size of the data, and the limitations of transfer rates, the images must be compressed to get an acceptable video. Since the video data is precalculated, a lot of time can be spent choosing the optimal compression algorithm for the specific image data being used.

Interleaving Data

Because both the image and audio data must be played together, it makes sense to place them both in the same file. This process is called interleaving.
Interleaving's main advantage is that it avoids the slow seek times present in many CD-ROM drives (the MPC specification only requires an average of 1 second). The data is placed in a file in blocks (RIFF chunks work very well) so that the data is read just before it is needed.

Simulation Graphics

Simulation graphic images are calculated just before they are drawn. This is the most CPU-intensive form of graphics. Examples of applications that primarily use simulation graphics are Microsoft Flight Simulator and WingCommander from Origin.

A simulation graphics program represents (or simulates) a situation to the user. In a flight simulator, the program must draw all the objects that the user would see from the cockpit. To provide a realistic display (usually the goal of simulation applications), the display must be updated at least 7 times per second. This means that every operation must be carefully optimized.

How Simulation Graphics Work

This section explains only the general effects of simulations. There are many good books available on calculating images based on simulations, but they are all very technical in scope. In general, application realism and complexity are driven by:

1. Object complexity
2. Light source
3. Object representation

Object complexity directly affects the appearance of objects in the simulated world. Implementing a light source, and shading the objects accordingly, also increases the complexity of the calculations necessary to draw the frame.

Objects can be represented by anything from a simple wire frame (a series of lines), to filled polygons, to an actual perspective representation. Old flight simulators used wire frame images. The current version of Microsoft Flight Simulator uses filled polygons. WingCommander from Origin uses precalculated, computer-generated images to represent the objects in the world.
Because it is very difficult for the program to remember the different positions of objects between frames, simulation graphics cannot take advantage of the dirty area optimization. This means that the application must redraw the complete frame each time.

Because of the dependence on blit speed, most simulations don't yet run under Windows. This is because the screen pixel resolution affects blit speed. A 640x480x256 display has more than four times as many pixels as a 320x200x256 display. This means that, everything else being equal, a full-screen blit should take more than four times as long on the higher resolution display. Most current simulators don't want to give up that extra time and prefer to provide better quality simulations.

Advantages

The main advantage of simulations is that they can provide the most 'realism' to the user; an airplane moves like an airplane, the car drives like a car, etc.

Because of the competition among simulation applications, ISVs generally push the capabilities of personal computers to their limit. The result is that they are not concerned with device independence and other enhancements provided by Windows; they are generally happy in the MS-DOS world. This will change as the hardware capabilities of machines running Windows improve and as the memory requirements and complexity of applications increase, making the protected mode environment provided by Windows attractive.

Optimizations

If the video blit speed is slow and the delta-frame algorithm is fast, it might be faster to generate a delta-frame instead of generating the whole frame. However, this will break down when the time to calculate and draw the delta-frame is greater than the time to draw the entire screen.

Since simulations depend on blit speed, you should use DIB bitmaps with DIB_PAL_COLORS. You should take advantage of the memory Windows gives you (it's protected mode!).
Since MPCs are 386 minimum, you can write 32-bit code, which is much faster than regular 16-bit code. Examples of drawing directly into DIBs with custom routines and using 32-bit code are demonstrated in the TRIQ.ZIP sample code. See the Overview section for information about obtaining this sample code.

Special Effects

Special effects can be used when displaying or moving images to provide variety to the user. You can use a couple of techniques to create special effects with graphics.

Palette Effects

Palette effects provide a fast and efficient way to modify the image on the display without having to redraw the image.

Palette Fade

This technique is the ubiquitous "fade to black" and can be used for transitions between images. This is a very effective and simple special effect that can add a lot to a presentation.

Cross Fade

A palette transition is used to change the displayed image. This is accomplished by merging two images into one image that has a specially constructed palette. This technique is demonstrated in the MERGEDIB.ZIP sample code. See the Overview section, earlier in this technical note, for information on how to obtain this sample code.

Blit Effects

Instead of displaying the entire image at once, you can display it over time. For example, you can create a "wipe" effect by copying just a portion of the image at a time, moving from left to right. The image will appear to come onto the screen a little at a time.

Good authoring tools provide many effects that you can use. You should experiment with different types of effects. Some can be quite interesting if they don't take a lot of time to perform.

ROP Effects

Raster operations (ROPs), combined with changing brushes, can be used to create a special class of blit effects. By using an increasingly white brush, you can make an image appear to "fade" in as more and more pixels of the image appear on the display.
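
As an illustration of the palette fade described above, the following sketch scales every palette entry toward black in a fixed number of steps. It is written in portable C rather than against the Windows API. The RGB structure and the fade_palette function are hypothetical, and handing each faded palette to the display is left out.

```c
#include <assert.h>

typedef struct { unsigned char r, g, b; } RGB;

/* Compute one step of a fade to black.
   src   : the original (full-brightness) palette
   dst   : receives the faded palette
   count : number of palette entries
   step  : current step, from 0 (unchanged) to steps (fully black)
   steps : total number of steps in the fade */
void fade_palette(const RGB *src, RGB *dst, int count, int step, int steps)
{
    int i;
    int remaining;

    assert(steps > 0 && step >= 0 && step <= steps);
    remaining = steps - step;

    for (i = 0; i < count; i++) {
        dst[i].r = (unsigned char)(src[i].r * remaining / steps);
        dst[i].g = (unsigned char)(src[i].g * remaining / steps);
        dst[i].b = (unsigned char)(src[i].b * remaining / steps);
    }
}
```

Calling fade_palette with step running from 0 up to steps, and applying each result to the display palette, darkens the picture without touching a single pixel of the image itself. Running step in the opposite direction fades the image back in.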
Graphics Creation, Archival and Conversion

Creation is the first step in the life cycle of a graphic. How a graphic is created and the source used greatly affect its quality. Before creating an image, you should know how it is going to be used and the factors of its use. For example, should the image be scanned at 24 bit or 8 bit? Should the image be dithered by the artist or left as is? Will the image be required to be displayed as-is on a 16-color display? The answers to these questions will greatly affect how the image will need to be converted for distribution.

After the image is created, it is stored until it is needed. For example, most photographs today come from archival companies and are called "stock photos". You should plan on creating a similar system for the digital images you will be collecting for your multimedia projects.

For a particular project, the archived image must then be converted and modified to fit the needs of the planned application.

To familiarize yourself with the multimedia project process, you should read the Multimedia Authoring Guide book provided in the Multimedia Development Kit (MDK) and also available from Microsoft Press in the Multimedia Authoring and Tools Guide.

Creation

Images can be created in the following ways:

Computer Generated: the computer generates the image directly using a program such as 3D Studio from Autodesk.

Computer Artist Drawn: an artist draws the image using a computer drawing program.

Scanned: a technician scans (digitizes) the image using a computer program and special hardware. The source can be a photograph, a slide, or even a still-frame taken with a video camera.

Video Capture: a technician captures (digitizes) an analog video signal using a computer program and special hardware. This differs from scanning in that it is a sequence of related images.

After the image is created, a certain amount of editing must be done. At a minimum, a coordinator must sign off on the image's quality and content.
The image might also need to be enhanced or otherwise modified once it is digitized. This will require a computer artist working with a paint program.

Archival

The multimedia producer should plan on using the image in future projects. Archiving the image for future use is a good way to amortize the cost of creating the image over multiple projects. This means that the image should be stored in the highest quality format available for future use. For images, this generally means 24 bit (for waveforms, it means 44.1KHz and 16 bits). The image should be free of any edits, except for minor touch-ups. For example, dithering should not be done at the archival image stage.

A good archival database tool is critical when a large number of images must be archived. After scanning thousands or even just hundreds of images, it is very hard to find a particular image, especially when you have several images of similar content (even filenames can't be relied on for clues). A good database tool will allow you to keep keywords as well as dates and other information about each image. In addition, the RIFF file format allows you to store information directly in the file. Look for tools that allow you to use the RIFF INFO fields.

Conversion

Once an image is chosen for use in a specific project, conversion and other editing may need to be performed. You will need to know what color depths and screen sizes must be supported. To support both 256 colors and 16 colors, you might need to supply two copies of the image for distribution, allowing the application to choose the image based on the current environment. For example, Viewer provides automatic image selection based on color depth.

Images generated for use in Windows should have 236 colors. This is because Windows itself uses 20 colors for the user interface and (usually) only 236 are available for an application to use.
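
The 236-color budget can be pictured as a 256-entry palette in which 20 entries are left to Windows. The sketch below assumes the common layout in which the system colors occupy the first ten and the last ten entries (indices 0-9 and 246-255). The function names and the RGB type are hypothetical, and the actual system color values would have to be obtained from the running system rather than hard-coded.

```c
#include <assert.h>
#include <string.h>

#define PALETTE_SIZE   256
#define RESERVED_LOW    10  /* assumed: entries 0..9 belong to Windows     */
#define RESERVED_HIGH   10  /* assumed: entries 246..255 belong to Windows */
#define APP_COLORS     (PALETTE_SIZE - RESERVED_LOW - RESERVED_HIGH) /* 236 */

typedef struct { unsigned char r, g, b; } RGB;

/* Build a 256-entry palette: the 20 system colors go into the reserved
   slots at both ends and up to 236 application colors go in between.
   Returns the number of application colors actually placed. */
int build_palette(const RGB *system20, const RGB *app, int app_count, RGB *out)
{
    int n;

    assert(app_count >= 0);
    n = (app_count < APP_COLORS) ? app_count : APP_COLORS;

    memset(out, 0, sizeof(RGB) * PALETTE_SIZE);
    memcpy(out, system20, sizeof(RGB) * RESERVED_LOW);
    memcpy(out + PALETTE_SIZE - RESERVED_HIGH,
           system20 + RESERVED_LOW, sizeof(RGB) * RESERVED_HIGH);
    if (n > 0)
        memcpy(out + RESERVED_LOW, app, sizeof(RGB) * n);
    return n;
}

/* Map application color number n (0..235) to its palette index. */
int app_color_index(int n)
{
    assert(n >= 0 && n < APP_COLORS);
    return RESERVED_LOW + n;
}
```

An image converted against such a palette would store pixel values from app_color_index(), so its own colors never land in the reserved entries. Any application colors beyond the first 236 are simply dropped in this sketch; a real conversion tool would reduce the image's colors first.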
An optimization for blit speed ensures that a palette of 256 colors has the 20 system colors where and when Windows requires them. This type of palette is called the identity palette because the palette from the application matches the Windows palette exactly and no translation is required.

If multiple images must be displayed at the same time, remember that the total number of colors is limited to 236 (on a 256-color device). This means that images should share a common palette so that the image creator can control how they will look together on a screen. The image-editing tools in the MDK can be used for this process.

Other conversions must be determined by the optimizations needed by the application. For example, a video sequence must be compressed into an acceptable format before it is distributed. The AVI Beta will provide tools to perform this conversion.

Graphics Loading and Storage

Once the image is distributed on CD-ROM, the application must read the image from the disc and then store it into memory before it can display it. Compression is very important as it improves both load time and storage space.

Load Time

Graphics data can be quite large. And, while a CD-ROM can hold many images, the transfer rate is just 150K/second, and a full-screen image at 640x480x256 is about 300K. This image takes at least 2 seconds just to read from disc, and this does not include the overhead of actually opening the file. You can try moving often-used images to the user's hard drive, but you should limit this to a bare minimum of space. Try to keep your application from using more than 2MB total, including the application itself.

To help improve the file read times, try to keep the file compressed on disc so that the I/O time is minimized. For computer images, the RLE compression of DIBs is very nice. However, this compression does not work well with natural images. The Joint Photographic Experts Group (JPEG) has developed a compression format that is optimized for natural images.
However, it is new and is not yet supported by Windows directly (and image editing tools are just starting to support it).

For video images, the Moving Picture Experts Group (MPEG) has a compression format defined for video and audio. However, this format is even newer than the JPEG format. In addition, it is a very complicated format; acceptable performance requires hardware assistance.

You can also make tradeoffs among image quality, size, and decompression time. JPEG and MPEG already provide for this capability in the definitions of their algorithms. However, other algorithms can also be used to obtain lossy compression. For example, you could use RLE compression to obtain a lossy delta-frame compression of a video image. AVI uses this technique in order to provide a tradeoff between image quality and size given a fixed 150K/second transfer rate.

The disadvantage of using compression is that it takes time to decompress the image before it can be displayed. Make sure that the file read time you saved doesn't get lost in a slow decompress time.

If you have several smaller images, it is better to combine them into one larger bitmap to improve both file read time and memory usage. Just opening a file on a CD-ROM can take 2 seconds if the directory entries are not cached by MSCDEX. Having one file to read can greatly reduce the time spent just opening files. Be sure you can reference just a part of the image with the authoring tool you are using for your application.

Memory Usage

Images are kept in memory only as long as they are needed, or as long as they are expected to be needed. This provides the application with the quickest response time to user actions. Most graphics will spend very little time in memory, while others spend most of their time there. For example, the bitmap for the Windows maximize button is almost always in memory because it is used so often.
Efficient caching of images in memory can greatly improve the performance of a UI graphic because the images will already be in memory when they are needed.

If the file is on the disc in a compressed form, it might make sense to keep it compressed in memory until it is actually drawn. RLE compression is perfect for this as the driver does the expansion (decompression) of the image as needed. For larger images, this can significantly reduce the memory requirements at the expense of having to decompress the image every time it must be displayed.

However, most compression formats don't easily allow copying just a portion of the image without decompressing the entire image first. This means that if you are using the image for a background while using the dirty-area optimization, you will want to keep the image decompressed in memory.

General Compression Issues

To obtain optimal compression and image quality, it is important to understand the images you will be using and the different capabilities and limitations of the compression algorithms available.

Types of Compression

Compression algorithms take the following forms:

Intraframe
Interframe
Lossy
Lossless

Intraframe

Intraframe is the compression of the data for a single frame. It does not matter whether that frame is a single image in a video or a single picture; the compression is the same.

Interframe

Interframe compresses the video by only encoding the differences or delta between frames. This is commonly called delta-frame compression. This type of compression takes advantage of the fact that not much of the image changes between frames of a video.

Lossy and Lossless

Lossy and lossless are two terms that describe the quality of a compression. Lossless means that no data is lost; the decompressed image is exactly the same as the original. Lossy means that the decompressed image is not the same, but looks as close as possible to the original.
Lossy compression affords the best compression rates, but depends on good quality algorithms to determine what information can be lost and still provide good video quality. A good compression algorithm will give the technician control over the tradeoffs between image quality and image size.

Lossless compression has no reduction in image quality whereas a lossy compression has at least some loss (but it may appear to be the same to the human eye).

Be sure to test the results of new compression methods with a number of users to get a good idea of the quality of the final images. Also, make sure to use different types of images when testing the algorithm to get a good idea of the capabilities and problems with the algorithm.

The RLEAPP sample code (see the Overview section, earlier in this technical note, for locations of the sample code) shows video using the compression method that is directly supported by Windows (RLE).

Natural Images and Computer Images

If your image is simple in its use of color changes, then RLE compression is the best choice. On the other hand, if you have a natural image, then you will have to search for an appropriate algorithm to use in your application. JPEG or GIF are the most common choices.

Dithering

This is a pixel-depth conversion process usually used when few colors are available. The images printed in this document are dithered to two colors (black and white) when they are printed. The process trades pixel resolution to gain (perceived) color depth. Squares (for example, 2x2, 4x4, etc.) of pixels are created to represent to the eye more colors than are actually available. Print media uses a similar technique to obtain a range of shades from just 3 or 4 actual colors.

You should know that this algorithm has the inherent side effect (or artifact) of changing a computer image into a natural image.
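
The trade of pixel resolution for perceived color depth can be illustrated with a small ordered dither. This is a sketch only. It uses a 2x2 threshold matrix (one common choice, and not necessarily the method used to print this document) to convert 8-bit gray values to black and white, and the names THRESHOLD, dither_pixel and dither_image are hypothetical.

```c
#include <assert.h>

/* 2x2 ordered-dither thresholds, spread evenly over the 0..255 range.
   A pixel is turned on (white) when its gray value exceeds the
   threshold for its position within the 2x2 square. */
static const int THRESHOLD[2][2] = {
    {  32, 160 },
    { 224,  96 }
};

/* Return 1 (white) or 0 (black) for the pixel at (x, y). */
int dither_pixel(int gray, int x, int y)
{
    assert(gray >= 0 && gray <= 255);
    return gray > THRESHOLD[y & 1][x & 1];
}

/* Dither a width x height 8-bit gray image into 0/1 pixels.
   Both buffers hold width * height bytes, stored row by row. */
void dither_image(const unsigned char *gray, unsigned char *out,
                  int width, int height)
{
    int x, y;

    for (y = 0; y < height; y++)
        for (x = 0; x < width; x++)
            out[y * width + x] =
                (unsigned char)dither_pixel(gray[y * width + x], x, y);
}
```

With this matrix, each 2x2 square can show five apparent gray levels (0, 1, 2, 3 or 4 white pixels) using only black and white. A flat mid-gray area turns into a checkered pattern of alternating pixels, which is exactly the kind of frequent color change that makes a dithered computer image behave like a natural image for RLE compression.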
Available Compression Algorithms

Since CD-ROM is a read-only distribution medium, the optimal compression scheme places the emphasis on image decompression speed and accepts more complexity at compression time. The complexity (which is usually directly related to the amount) of compression must be controlled for the target environment. For example, AVI uses simple compression because playback is intended for every MPC. On the other hand, DVI requires special hardware for decompression. It can obtain much better images because the decompression is more complex, but requires the user to have expensive hardware.

None

When the image is small and compression wouldn't help much, or you want the image to be readable by a wide range of applications, this may be the best choice.

RLE

Run Length Encoded. This is a common form of compression that involves encoding "runs" of colors into fewer bytes. For example, instead of storing a black background as repeated pixels, the RLE format would encode it as just a couple of bytes consisting of run length and pixel value. There is a specific RLE format defined in Windows 3.0 as a DIB. This RLE format also allows for skips and jumps, making it ideal for delta-frame compression. The main advantage of this format over any other is that it is directly supported by Windows and is therefore very fast.

JPEG

Joint Photographic Experts Group. This is a compression algorithm defined by JPEG. It is based on 24-bit natural images. One main limitation is that it is not directly supported by Windows and is therefore slower because an image must be decompressed into an image format that Windows can understand (DIB). Another limitation is that it works on 24-bit images. This means that a lot of memory and processing can be wasted if the display is only 8-bit (the usual case for multimedia on PCs today).

GIF

This is a file format owned by CompuServe.
It is generally a good compression format, but JPEG is better (even though JPEG works on 24-bit images, you get better compression by expanding the file to 24 bits and then using JPEG to compress it).

DVI

Digital Video Interactive. This is a hardware and software solution to video graphics. Intel has announced that the algorithm implemented will move to MPEG (now that it is defined). This is a very high quality (compared to AVI) image compression. However, it requires hardware to obtain the image quality and size. Unfortunately, this hardware currently costs about $2,000, so not many of your users will have this hardware.

AVI

Audio Video Interleaved. This is Microsoft's software-only solution to video graphics. This algorithm is optimized for 8-bit images and the quality is good, but image size is limited. The main advantage of this system over DVI is that AVI will run on any MPC system and doesn't require additional hardware. This system will be in Beta in the second quarter of 1992.

MPEG

Moving Picture Experts Group. This algorithm has just recently been defined. It is a very complex algorithm and will require additional hardware to achieve acceptable frame rates. DVI has committed to supporting this industry standard in their hardware.

Custom Compression

You can create your own compression algorithms that are optimized for use on your images. In theory, you can get very good compression, but you still must convert your images to DIB format in order to display them under Windows.

Graphics Display

Windows uses a device-independent system to provide a GUI on many PC systems. The advantage to the user in a GUI is that hardware selection can be based on price and capabilities. The advantage to the ISV is that their application runs on all these platforms, increasing their market. In addition, more and more users will insist that their applications run in the preferred GUI environment.
The problem for the developers of graphics programs is that performance depends on the system (CPU, video hardware and drivers) that the user has. A bad driver can adversely affect the capabilities of the system for multimedia. In the Multimedia Group at Microsoft, a lot of time has been devoted to optimizing the video drivers for images. You should be aware that some users might have an older driver that is not very fast. In fact, some old drivers are an order of magnitude slower in displaying graphics. These users should upgrade to the newer drivers when they are available from their video card hardware vendors.

Draw Speed

The main speed problem when drawing DIBs is color translation. Color translation is explained in Section 2.3 of the Programmer's Reference of the Windows 3.0 SDK documentation. Optimization is necessary to eliminate the need for the driver to translate the pixel values before putting them into display memory.

The following can improve draw speed:

    Avoid color translation
    Use the 'identity' palette
    Use DIB_PAL_COLORS
    Use dirty redraw (only what needs to be changed)
    Optimize your code to use 32-bit data for image operations

Examples of using 32-bit code and drawing directly into DIBs with custom routines are demonstrated in the TRIQ.ZIP sample code. See the Overview section, earlier in this technical note, for information about obtaining this sample code.

Dirty Area

This optimization can result in dramatic increases in speed, and you should try to design your application to take advantage of it. Update only the dirty areas of the screen, that is, the areas whose contents have changed. This method is easiest to implement when the application is moving objects, such as sprites, around on the screen. For example, in Mixed Up Mother Goose from Sierra Online, the character moves around the screen but only the area around the character is redrawn each time it changes.
To accomplish this, the background bitmap is kept in memory along with the image(s) of the character (in addition to an off-screen copy of the display). When part of the screen has changed, the changed areas of the off-screen copy are first redrawn from the background. Then the character (or other sprites) is drawn onto the off-screen copy using transparent blits. Finally, the changed area is copied from the off-screen bitmap onto the display.

This method can also be used when multiple sprites are moving on the screen; just apply the technique to each sprite in turn and, when the off-screen image is updated, update the dirty areas. You can also optimize for the case where sprites are next to, or touching, each other.

Delta-frame animation is one application of this method. In delta-frame animation, only the differences between frames are stored and displayed. Because the information that is copied to the screen is much smaller, the update takes less time. In addition, the size of the data is much smaller. AVI uses this technique to display compressed video directly from the CD-ROM.

The RLEAPP.ZIP sample code demonstrates this optimization using RLE DIBs. See the Overview section, earlier in this technical note, for information on how to obtain the sample code.

Offscreen Bitmap

This is an important optimization tool that simplifies updates of dirty areas. It takes advantage of the fact that drawing to memory is usually much faster than drawing to the screen. It has the most benefit when the display is a composite of multiple images. It is usually used in conjunction with dirty area redraw optimization to obtain high-quality animations.

The off-screen bitmap needs to be at least the size of the image on the screen. It can actually be bigger if you want to optimize scrolling the image on screen; if the newly exposed portion is already drawn in the off-screen bitmap, just update the screen with the new portion of the image.
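The dirty-area update with an off-screen copy described above can be sketched as follows. This is a minimal illustration in Python rather than Windows code, under several assumptions: bitmaps are plain lists of rows of palette indexes, the names blit and update_sprite are hypothetical, pixel value 0 is assumed to be the transparent color, and no clipping is done, so the sprite must lie entirely inside the bitmaps.

```python
TRANSPARENT = 0  # assumed color-key index; a real title would pick its own


def blit(dest, src, left, top, key=None):
    """Copy src into dest at (left, top), skipping pixels equal to key."""
    for sy, row in enumerate(src):
        for sx, pixel in enumerate(row):
            if pixel != key:
                dest[top + sy][left + sx] = pixel


def update_sprite(screen, offscreen, background, sprite, old_pos, new_pos):
    """Move sprite from old_pos to new_pos, both (x, y); return the dirty rectangle."""
    height, width = len(sprite), len(sprite[0])
    # The dirty rectangle covers both the old and the new sprite position.
    left = min(old_pos[0], new_pos[0])
    top = min(old_pos[1], new_pos[1])
    right = max(old_pos[0], new_pos[0]) + width
    bottom = max(old_pos[1], new_pos[1]) + height
    # 1. Restore the background in the off-screen copy, dirty area only.
    for y in range(top, bottom):
        offscreen[y][left:right] = background[y][left:right]
    # 2. Draw the sprite into the off-screen copy with a transparent blit.
    blit(offscreen, sprite, new_pos[0], new_pos[1], key=TRANSPARENT)
    # 3. Copy only the dirty area from the off-screen copy to the display.
    for y in range(top, bottom):
        screen[y][left:right] = offscreen[y][left:right]
    return left, top, right, bottom
```

With several sprites, steps 1 and 2 would be repeated for each sprite before step 3, so the display is touched only once per dirty area.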
You don't need to use an off-screen bitmap to perform the dirty update technique; you can do the updates directly on the screen. The result, however, will flash as the background and then each of the sprites are drawn. On the other hand, drawing directly on the screen saves memory when the off-screen bitmap would be large.

Palettes

Optimizing the usage of color can provide improved performance.

DIB_PAL_COLORS
This is a flag on the DIB bitmap functions, telling the functions that the DIB's color table contains indexes into the current logical palette instead of RGB values. This means that no translation is needed from the bitmap's palette into the logical palette. This can enhance display of the bitmap because the application has control over the color matching between multiple bitmaps that may use different palettes.

Identity Palette
This is the process of making the logical palette exactly match the system palette. This is accomplished by putting the appropriate system colors into the reserved places in the palette. Sample code that demonstrates this technique is available in RLEAPP.ZIP. See the Overview section earlier in this technical note for information on how to obtain this sample code.

When DIB_PAL_COLORS is combined with an identity palette, no pixel color translation is needed and the driver can just copy the image data directly into the display memory.

Common Palettes
This is the process of using a common palette between multiple images displayed on the screen at one time. This is not a draw speed optimization, but it does optimize the quality of display of the images. This is because the technician who creates the images has control over the resulting palette instead of the application or Windows.

Fade
A changing palette can be animated to create special effects. The most useful is a fade, where an image appears to disappear from the screen. This is a pretty simple process (you need to use DIB_PAL_COLORS to fade the image up from a single color).
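As a rough illustration of a fade, the sketch below computes the palette for one step of a fade toward a single color. It is a minimal example in Python, not Windows code: the function name fade_palette and the representation of a palette as a list of (red, green, blue) tuples are assumptions made for the example. In a Windows application the resulting entries would typically be handed to the palette manager (for example with AnimatePalette) once per step; the pixel data itself never changes.

```python
def fade_palette(palette, target, step, steps):
    """Return the palette blended toward target.

    step == 0 gives the original palette and step == steps gives a palette
    in which every entry equals target. Running step from steps down to 0
    fades the image up from the single color instead.
    """
    faded = []
    for entry in palette:
        # Move each color component a fraction step/steps of the way to target.
        faded.append(tuple(c + (t - c) * step // steps
                           for c, t in zip(entry, target)))
    return faded


# Example: fade a three-entry palette to black in four steps.
palette = [(255, 0, 128), (10, 200, 30), (0, 0, 0)]
frames = [fade_palette(palette, (0, 0, 0), step, 4) for step in range(5)]
```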
Palette animation can also be used to create an animation directly instead of using a sequence of images. This technique is demonstrated in the LAVA.ZIP sample code. See the Overview section earlier in this technical note for more information about where to obtain the sample code.

Cross Fade
A palette transition is used to change the displayed image. This is accomplished by merging two images into one image with a specially constructed palette. This technique is demonstrated in the MERGEDIB.ZIP sample code. See the Overview section earlier in this technical note for information on how to obtain this sample code.

Transparent Blits

Transparent blits allow the compositing of several images onto a final image. There are three ways to blit an image with transparency:

Multiple ROPs. This method works on any Windows driver. It involves a combination of raster operations (ROPs) to obtain the desired effect. It is very slow, as the image must be copied four times to obtain one transparent blit.

Image mask. This method works on any Windows driver. It involves a combination of raster operations (ROPs) to obtain the desired effect. It is slightly faster than the multiple ROP method because it only requires three blit operations. However, it does require an image mask.

NEWTRANSPARENT. This is the fastest method, but it only works with Windows video drivers that support it (all the multimedia drivers).

The TRANSBLT.ZIP sample code demonstrates all of these methods. See the Overview section earlier in this technical note for information about where to obtain the sample code.

Glossary

Animation: The display of a series of graphic images, simulating motion. Animation can be frame-based or cast-based. The Movie Player included with the Multimedia extensions uses cast-based animation.

Applet: An application started from the Control Panel.
Control Panel applets each configure a particular system feature; for example, printers, video drivers, or system sounds.

AVI (Audio Video Interleaved): Microsoft's Multimedia Systems Group is developing a product that synchronizes motion video and audio without the use of specialized hardware. This could be used to create "talking head" presentations, display animation sequences, and present multimedia slide shows.

Blit: Copying one image to another. The copy can include ROPs to modify how the destination image appears.

Blit Tear: An artifact, visible to the user, that occurs when blitting. This is usually caused by the display updating the monitor as the image is being copied. The monitor ends up with one frame with half the new image and half the old image. This is particularly visible with large images.

Cast-Based Animation: Also called "sprite animation". A method of animation that uses many graphical elements that are combined (or composited) while the animation is playing into a final image that is displayed. Contrast with frame-based animation.

CD-DA (Compact Disc-Digital Audio): An optical data-storage format that provides for the storage of up to 73 minutes of high-quality digital-audio data on a compact disc. Also known as Red Book audio. This is the standard audio CD format known around the world.

CD-I (Compact Disc Interactive): An interactive multimedia player which attaches to TV sets and reads special optical discs (CD-I format). The CD-I disc standard integrates data, still-frame images, audio and motion video on the same disc. This standard is being developed by Philips. Also known as Green Book.

CD-ROM (Compact Disc-Read Only Memory): An optical data-storage technology that allows large quantities of data to be stored on a compact disc. Also known as Yellow Book.

CDTV (Commodore Dynamic Total Vision): An interactive multimedia CD-ROM player which attaches to TV sets and reads CD-ROM discs.
This Commodore proprietary hardware is targeted to the home and directly competes against CD-I.

CD-XA (CD-ROM eXtended Architecture): An extension of the CD-ROM standard that provides for integrated (interleaved) storage of compressed audio data along with other data on a CD-ROM disc in a media-dependent form. This format is the same format that is used by CD-I. This standard also defines the way data is read from a disc. Audio data is combined with text and graphic data on a single track so they can be read at virtually the same time. These extensions are now included in the standard CD-ROM definition (Yellow Book).

CMYK: Cyan, magenta, yellow, black. A color encoding method commonly used in printing. The black is not really needed, but is included as a specific color because it is used so often. Many color printers use these four colors in a ribbon to print color images.

Collision Detection: Used in cast-based animation to determine when two images have touched or collided. Used mainly in games; for example, detecting when the missile hits the player's ship.

Color Cube: A particular color is composed of three components. Since there are three components, the range of colors can be expressed in three dimensions, or a cube. A color cube refers to the color spectrum that can be represented by a particular color encoding method; for example, RGB or HSL or NTSC.

Color Depth: The number of bits assigned to represent color information in an image. Color depth can also refer to the perceived (by the user) amount of color information. Choices are usually 1 bit, 4 bits with a palette, 8 bits (256 colors) with a palette, 16 bits without a palette and 24 bits without a palette. The color depth of the palette is usually 24 bits.

Composite: To combine. This is the generation of a single image from multiple image elements. It forms the basis of cast-based animation.

Composite Video: An analog video signal that has the components (hue, saturation, and luminance; HSL) combined into one signal.
This type of analog video signal is of lower quality than an SVHS video signal.

Compression: A digital process that allows data to be stored or transmitted using less than the normal number of bits. DVI is an example of hardware compression and AVI is an example of software compression.

Computer Images: Defined in this technical note to be graphical images that do not contain many changes in color from pixel to pixel. "Many changes" is a relative term; it is generally determined by the point at which a compression algorithm that is meant for images without many changes stops being effective.

DVI (Digital Video Interactive): A proprietary technology developed by Intel (and licensed by IBM) for full-motion video at a high level of compression (hardware). This technology will be supported under Windows using MCI commands.

DIB (Device-Independent Bitmap): A Windows bitmap data structure consisting of header fields, an optional color table (palette), and bitmap data. Depending on the number of colors represented in a given bitmap, the bitmap bits can be represented in 1, 4, 8, or 24 bits, with or without a palette.

Dithering: This is a pixel-depth conversion process usually only used when very few colors are available; for instance, 16 colors. The process trades pixel resolution for (perceived) color depth. Squares of pixels (for example, 2x2, 4x4, etc.) are created to represent to the eye more colors than are actually available. Print media uses a similar technique to obtain many shades from just 3 or 4 actual colors. This algorithm has the inherent side effect (or artifact) of changing a computer image into a natural image.

Frame: A single image that the user sees displayed. This is the final image in cast-based animation, or the actual image in frame-based animation.

Frame-Based Animation: An animation technique that displays a sequence of images. Usually these images are delta-framed.
Frame-based animation allows for more complicated animations at a higher speed than any other animation technique because the data is precalculated.

Frame Rate: The speed at which images are displayed. The faster the rate, the better the quality of the animation. Frame rate, image size, and image complexity are the three aspects of an animation which must be weighed to obtain the best quality.

General MIDI: A synthesizer specification created by the MIDI Manufacturers Association (MMA) defining a common configuration and set of capabilities for consumer MIDI synthesizers.

Hot Spot: An area of an image that, when selected by the user (usually by clicking with a mouse), performs an action in the application. Commonly, hot spots are represented graphically by buttons, but hot spots may be hidden to allow the user to explore the image to discover what happens.

HSL: Hue, saturation, and luminance. This is one method of representing a particular color. The luminance represents the brightness of the color, and the hue and saturation give the rest of the color information. This method is similar to what is used by NTSC television signals. Black and white TVs just decode the luminance portion of the signal to present a picture, while color TVs decode the entire signal.

IHV: Independent Hardware Vendor. A Microsoft term used to refer to companies in the PC computer industry who make hardware products (machines, upgrade kits, CD-ROM drives, etc.).

IMA (Interactive Multimedia Association): A professional trade association of companies, institutions, and individuals involved in producing and using interactive multimedia technology.

Interleaving: The technique of combining different types of data into one stream of data. This technique is used by AVI for CD-ROMs to combine image and audio data to create a video.

ISV: Independent Software Vendor.
A Microsoft term used to refer to companies in the PC computer industry who make software products (applications, tools, authoring systems, etc.).

JPEG (Joint Photographic Experts Group): A standards committee for still images.

MCI (Media Control Interface): High-level control software that provides a device-independent interface to multimedia devices and media files. MCI includes a command-message interface and a command-string interface. This interface is supported by both IBM and Microsoft.

MIDI (Musical Instrument Digital Interface): A standard protocol for communication between musical instruments and computers.

MSCDEX (Microsoft Compact Disc Extensions): A terminate-and-stay-resident (TSR) program that makes CD-ROM drives appear to MS-DOS as network drives. MSCDEX uses hardware-dependent drivers to communicate with CD-ROM drives.

MPEG (Moving Picture Experts Group): A standards committee for motion video.

MPMC (Multimedia PC Marketing Council): A subsidiary of the Software Publishers Association composed of 12 companies involved in developing Multimedia products. These companies currently include CompuAdd, Creative Labs, Fujitsu, Headland/Video 7, Media Vision, Microsoft, NCR, NEC, Olivetti, Philips, Tandy, and Zenith Data Systems. The industry-standard MPC platform is based on Microsoft's Windows with Multimedia extensions.

Natural Image: As defined by this technical note, an image that has many changes in color from pixel to pixel. Contrast this with computer image.

Off-Screen Bitmap: A copy, held in system memory, of the image that is displayed on the screen. This is an optimization technique that takes advantage of the fact that on most PCs, system memory is much faster than display memory. This means that it is faster to carry out blit operations on system memory and then, when finished with the image, copy it to the display. This technique is especially beneficial when using cast-based animation.
Palette: In Windows, a palette is a data structure defining the colors used in a bitmap image.

Pixel: A single point of an image. A screen's resolution is commonly referred to in pixels (for example, a common form of VGA resolution is 640x480 with a color depth of 16 colors).

Planes: When compositing, the relative Z-order of the images can be imagined as a sequence of planes in which the images are placed. The video hardware can also allow two or more planes of display, with the foreground display plane having a special method to determine where the background plane shows through.

Raster Graphics: A type of image format optimized for display on a raster device. This is the most common form of images.

RGB: Red, green, blue. A color encoding method commonly used by computers. Compare this method to HSL and CMYK.

RIFF (Resource Interchange File Format): A tagged-file specification used to define standard formats for multimedia files. Tagged-file structure helps prevent compatibility problems that often occur when file-format definitions change over time. Because each piece of data in the file is identified by a standard header, an application that does not recognize a given data element can skip over the unknown information. This specification is jointly supported by IBM and Microsoft.

RLE: Run Length Encoded. Generically, this refers to a compression method for computer images that only encodes color changes instead of encoding each pixel. A specific form of RLE is defined in the Windows DIB format. This RLE format also includes line skips and column jumps, making it ideal for encoding a delta-frame image of a computer image.

ROP: Raster operation. The simplest ROP is SRCCOPY. More complicated ROPs allow combinations of the source and destination images along with a brush image. ROPs are explained in the chapter "Binary and Ternary Raster-Operation Codes" in the Programmer's Reference book of the Windows SDK documentation.
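To make the idea of combining source and destination with ROPs concrete, the sketch below shows one common way a mask can produce a transparent blit: the destination is ANDed with the mask (the effect of the SRCAND raster operation) and the result is ORed with the image (the effect of SRCPAINT). This is a minimal sketch in Python under stated assumptions, not the method of any particular sample: the function name masked_blit is hypothetical, pixels are 8-bit values held in lists of rows, the mask is 0xFF where the image is transparent and 0x00 where it is opaque, and the image is 0 wherever it is transparent.

```python
def masked_blit(dest, image, mask):
    """Return dest with image composited on top, using AND then OR.

    mask is 0xFF where the destination should show through and 0x00 where
    the image is opaque; image must be 0 in its transparent areas.
    """
    result = []
    for dest_row, image_row, mask_row in zip(dest, image, mask):
        row = []
        for d, s, m in zip(dest_row, image_row, mask_row):
            cleared = d & m          # SRCAND: punch a hole where the image goes
            row.append(cleared | s)  # SRCPAINT: drop the image into the hole
        result.append(row)
    return result


background = [[0x12, 0x34], [0x56, 0x78]]
image = [[0x00, 0x7F], [0x00, 0x00]]
mask = [[0xFF, 0x00], [0xFF, 0xFF]]
composited = masked_blit(background, image, mask)
```

Only the top-right pixel of the background is replaced here; everywhere else the mask lets the background show through.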
Scanning: The process of digitizing an image. This is one way to create images.

Skips and Jumps: This is a technique defined in the Windows RLE DIB format for encoding just the changes (deltas) from a previous image. The skips (rows) and jumps (columns) allow the efficient encoding of pixels that have not changed from the original image.

Sprite: An image that is moved in the display to provide animation. The image itself can change while it is being moved to enhance the illusion.

Vector Graphics: A type of picture optimized for display on a system that uses instructions for display. Contrast this with raster graphics.

Video: Digital and Analog. Digital video is defined by this technical note as a precalculated sequence of images that are displayed to show an animation; audio is usually present. Analog video is common TV signals; audio is almost always present.

Video Capture: The process of digitizing an analog video image.

Video Compression: A method for reducing the amount of information required to store and recall a frame of video. Compression is critical to delivering digital full motion video to the user in the most effective manner (in terms of performance and storage costs).

Video Overlay: Hardware that displays an analog video signal in combination with a computer display. The analog video is commonly on a separate video plane from the computer display.

WAVE file: A Microsoft/IBM standard file format for storing waveform audio data. A WAVE file uses the RIFF format and has a .WAV filename extension.

Z-Order: The order in which a sprite is displayed. Sprites that are displayed later than other sprites appear to be in front of them (closer to the user). The "Z" refers to the third dimension (front/back).
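The RLE and Skips and Jumps entries above can be illustrated with a small decoder for 8-bit RLE data in the style of the Windows RLE DIB format. This is a simplified sketch in Python under several assumptions: the name decode_rle8 is hypothetical, the frame is a list of rows in which row 0 is the first scan line in the data (a Windows DIB stores scan lines bottom-up), and no bounds checking is done. Pixels passed over by a skip or jump keep their value from the previous frame, which is what makes the format suitable for delta frames.

```python
def decode_rle8(data, frame):
    """Apply 8-bit RLE data on top of frame (modified in place and returned)."""
    x = y = 0
    i = 0
    while i < len(data):
        count, value = data[i], data[i + 1]
        i += 2
        if count > 0:
            # Encoded run: repeat one pixel value count times.
            for _ in range(count):
                frame[y][x] = value
                x += 1
        elif value == 0:
            # Escape 0: end of line, continue at the start of the next row.
            x = 0
            y += 1
        elif value == 1:
            # Escape 1: end of bitmap.
            break
        elif value == 2:
            # Escape 2: jump right and skip rows without touching pixels.
            x += data[i]
            y += data[i + 1]
            i += 2
        else:
            # Absolute mode: value literal pixels follow, padded to an even length.
            for _ in range(value):
                frame[y][x] = data[i]
                i += 1
                x += 1
            i += value & 1
    return frame


previous = [[9, 9, 9, 9], [9, 9, 9, 9]]
delta = bytes([3, 7,  0, 0,  0, 2, 1, 0,  0, 3, 1, 2, 3, 0,  0, 1])
current = decode_rle8(delta, previous)
```

In this example the delta data writes a run of three pixels on the first row, moves to the next row, jumps over one unchanged pixel, and then writes three literal pixels.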