This article is about performance: how to think about it, how to measure it, and how to improve it. Performance is a complex subject, and Apple has a Performance Page as well as a robust set of Performance Documentation available. This article supplements that documentation by giving you a framework for thinking about performance, as well as performance tips that you can use immediately.

Optimization: Theory

Maximizing your application’s performance is a two-step process. The first step involves ensuring that you are using the programming techniques most suited to the Mac OS X architecture. The second step involves deciding what performance metrics are important and making sure your application meets those metrics. Apple provides a suite of versatile profiling applications and command-line tools to help you with this task.

Step One: Programming for Mac OS X

Mac OS X, with its virtual memory and preemptive multitasking, is structurally different from all previous versions of Mac OS. This means that certain programming practices that were effective in Mac OS 9 don’t work well under Mac OS X. The first step toward maximizing your application’s performance is to code (or, in the case of existing Carbon applications, recode) tasks in the way that is most efficient for Mac OS X. Here are the most important Mac OS X-specific changes you should make to your application. For details, see the Performance book.
Step Two: Targeted Metrics And Optimization

The code changes you made in Step One are important, but this is where you’ll spend most of your time. Remember, you can’t improve performance until you first measure it, and you can’t know when you’re done until you have set a goal for your measurements.

Integrating Metrics into the Development Process

It is important that you make the process of measuring and improving application performance an integral part of your development cycle. Here is a four-step plan to help you get started:

First, decide which performance features are most important to you (see below for some suggestions).

Second, for each feature, choose an appropriate performance goal (for example, a launch time of under one second). You may want to evaluate competing products and match or exceed their performance.

Third, devise a procedure early in the development process for measuring each performance metric, using either custom code or an off-the-shelf tool.

Fourth, compare your application against each metric on a regular basis, and solve performance problems early. Some development teams establish a policy of refusing to accept any code that causes the application to miss its performance goals.

Choosing your Metrics

You’ll get the best performance if you concentrate on the following metrics:
A Methodology for Improving Performance

Once you’ve chosen your metrics and devised ways of measuring them, how do you go about improving your application’s performance? Many developers iterate a measure/analyze/recode loop until all their performance metrics meet the agreed-upon goals. “Measure” means using performance tools to gather data on how and where your application spends its time and system resources. “Analyze” means examining that data to find the time and resource bottlenecks that cause the application to fall short of its metrics. “Recode” means devising an approach for reducing or eliminating these bottlenecks and then implementing that approach. Applying this methodology throughout the development process reduces the amount of work you have to throw away.

Techniques for Improving Performance

Keep the following techniques in mind when you are devising solutions to your performance bottlenecks:
Here’s an example of these techniques in action. When you’re trying to reduce your application’s launch time, look carefully at the contents of your main nib file. When your application launches, Mac OS X must load the main nib file before it can display anything on the screen. Loading a nib file can be very “expensive”; each object in the nib file must be instantiated and initialized, and the process may trigger the loading of a non-resident framework. If a resource is not needed to get your application to the point where it displays itself on the screen, you should move it to another nib file and load it after your application displays itself.

Optimization: Practice

The sections that follow give you “game plans” for five of the optimizations you are most likely to do.

Minimizing Application Launch Time

Metrics

Concentrate on minimizing the time interval from the moment you launch the application to the time at which your application finishes drawing all of its visible windows.

Tools

The

The

Techniques

In those cases where you need to know exactly how much time a certain routine takes to execute, you can use fs_usage to make this measurement accurately (although this requires you to modify your code slightly). Add code to your application that “touches” imaginary pathnames, for example,

Game Plan

Search the data from

Tips

While the

Look for opportunities to cache and reuse data. For example, if you notice that your application scans in the contents of a given directory, determine whether the contents of the directory rarely change. If this is the case, rewrite your code to cache the directory’s contents along with a last-modification timestamp. Also, have your code check the directory’s modification timestamp and either use the cached data (if the directory hasn’t changed) or read the directory and rewrite the cache (if it has).
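The directory-caching tip above can be sketched as follows. This is a minimal illustration in Python, chosen only for brevity, and not code from Apple’s documentation: the function name `load_directory_listing`, the `cache_file` parameter, and the JSON cache layout are all hypothetical. It assumes the cache file is stored outside the directory being scanned, since writing it inside would itself change that directory’s modification time.

```python
import json
import os


def load_directory_listing(directory, cache_file):
    """Return (entries, used_cache) for `directory`.

    The directory is rescanned only when its modification timestamp differs
    from the one stored with the cached listing. `cache_file` is a
    hypothetical location outside `directory`.
    """
    current_mtime = os.stat(directory).st_mtime_ns

    try:
        with open(cache_file, "r", encoding="utf-8") as f:
            cached = json.load(f)
        if cached["mtime_ns"] == current_mtime:
            # Directory unchanged since the cache was written: reuse it.
            return cached["entries"], True
    except (OSError, ValueError, KeyError, TypeError):
        # Missing or unreadable cache: fall through and rescan.
        pass

    # Directory changed (or no usable cache): read it and rewrite the cache.
    entries = sorted(os.listdir(directory))
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump({"mtime_ns": current_mtime, "entries": entries}, f)
    return entries, False
```

Because the cache is written to disk, a later launch can reuse it without rescanning. Note that on POSIX file systems a directory’s modification time generally changes when entries are added, removed, or renamed, not when the contents of an existing file change, so this approach suits caches of names rather than of file contents.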
For Cocoa applications (and for Carbon applications that use nibs), make sure that your application’s main nib file contains only those resources needed to support the initial display of your application. Put everything else in other nib files. If you’re writing a Cocoa application, put the code that is necessary for startup into the

Optimizing Your Application’s Main Functions

Metrics

Here, you must decide which of your application’s main functions are the ones you are going to measure. For example, in a spreadsheet program, you might decide that spreadsheet recalculation time is an important metric. The time interval to measure is the interval from the end of the relevant user action to the moment when your application’s response is complete. Examine the performance of competing applications and set your goals appropriately.

Tools and Techniques

Use the

Game Plan

Search the data from

Tips

As with the previous section, look for opportunities to defer operations that are not absolutely necessary and to speed up operations by caching data.

Optimizing Memory Usage

Metrics

There are two metrics you should look at: your application’s memory usage just after startup is completed, and its memory usage over time during simulated normal usage. If it is a Cocoa application, you should also examine its patterns of object allocation over time.

Tools

The top command-line utility (see the screenshot below) displays a periodically updated table of the CPU and memory usage statistics for each process in the system. You’ll be interested primarily in the column labeled RPRVT, which stands for “process resident private memory”; that is, the amount of memory that each process is currently using.

Use the

Use the

For Cocoa applications, you can use the ObjectAlloc application (see the screenshot at the beginning of this article) to track over time how the application’s objects are allocating memory.
Game Plan

Your first step should be to use the tools described above to detect and eliminate obvious memory leaks. Once you have done that, the task that remains, improving your application’s memory usage until it meets your metrics’ performance goals, is important, but it requires diligence and judgment to determine when you are finished. Although you can easily obtain metrics (numbers) related to memory usage, it is harder to establish precise goals for these metrics. The best you can do is to do a “reality check” on how your application’s metrics compare to those of an application known to have good memory usage (the Mac OS X implementation of TextEdit is a good example). You should also analyze the numbers returned by the tools described above, searching for indications of possible problems.

The following example should give you an idea of how this approach works. Suppose you check the RPRVT value for TextEdit immediately after launch and find that it uses 700 KB of memory. You launch your application and compare its visual complexity to that of TextEdit. Suppose that your application uses, for example, 4 MB of memory. Obviously, you would expect your application, being visually more complex, to require more memory than TextEdit does. The question to ask yourself is whether or not the additional human-interface elements that your application initially displays can reasonably account for the extra 3.3 MB of memory being used. If you don’t believe that the extra memory usage is reasonable, you should analyze your application to determine what code is using the extra memory and whether or not the execution of that code can be deferred or eliminated.

Tips

Searching for potential memory problems and finding their causes are difficult tasks. You will need patience, discipline, and a good working knowledge of your tools and how to use them. As always, check the Performance book for information that can help you with this optimization task.
By observing the memory allocations that occur, you can infer which sections of your code are being executed. From this, you may discover opportunities to defer or eliminate the execution of some code. In particular, there may be alternatives to certain memory allocations that occur within loops, especially multiply-nested loops.

If you’re optimizing a Cocoa application, watch for the effects of autoreleased objects, especially in multiply-nested loops. In certain situations, a large number of autoreleased objects may increase the size of the autorelease pool, which may cause significant memory use until the pool empties. You may be able to prevent such a situation by manually allocating and releasing certain objects at the right places.

Optimizing Drawing Operations

Metrics

Your application should draw static images as close to instantaneously as possible. It should also draw moving images frequently enough for movement to appear smooth and flicker-free. You can tell where you have problems by watching for noticeable drawing delays during normal use of your application.

Tools

As described earlier, you can use

Game Plan

First, execute your program along with QuartzDebug and analyze the region updates for situations that indicate that your drawing code is not as efficient as it could be. In particular, watch for regions that are updated multiple times, regions that are updated even when their content remains unchanged, and regions that are updated when only a smaller section of the region actually changes.

Next, examine the speed of your application’s drawing operations during normal use, as described in the Metrics section, above. Use the

The responsiveness of your application during live resize operations (interactive resizing of windows or table column widths, for example) is always important. If such operations are too sluggish, consider various ways of simplifying the redraw process to make the live resizing more responsive.
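One of the patterns named in the game plan above, a region that is updated when only a smaller section of it actually changed, can be quantified with simple rectangle arithmetic. The following is a rough sketch in Python, not part of QuartzDebug or any Apple API: the `Rect` type and both helper functions are hypothetical, and the calculation assumes the changed rectangles do not overlap one another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical axis-aligned rectangle in pixel units."""
    x: int
    y: int
    width: int
    height: int

    @property
    def area(self):
        return self.width * self.height


def bounding_rect(rects):
    """Smallest single rectangle that encloses every rectangle in `rects`."""
    left = min(r.x for r in rects)
    top = min(r.y for r in rects)
    right = max(r.x + r.width for r in rects)
    bottom = max(r.y + r.height for r in rects)
    return Rect(left, top, right - left, bottom - top)


def wasted_fraction(changed):
    """Fraction of the bounding rectangle that did not need redrawing.

    Assumes the rectangles in `changed` do not overlap each other.
    """
    total = bounding_rect(changed).area
    needed = sum(r.area for r in changed)
    return (total - needed) / total


# Two small changed regions at opposite corners of a 100 x 100 area.
small_regions = [Rect(0, 0, 10, 10), Rect(90, 90, 10, 10)]
enclosing = bounding_rect(small_regions)
waste = wasted_fraction(small_regions)
```

In this example the two changed regions cover 200 square pixels, while the single rectangle that encloses them covers 10,000, so 98 percent of a combined update would redraw content that had not changed. A ratio this high suggests it may be worth updating the small regions separately.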
Tips

When your Cocoa application frequently redraws two small regions diagonally oriented toward each other, the Cocoa view system may instruct your application to redraw a single, large rectangle that encloses both small regions instead of redrawing the two small regions. When this occurs, you may be able to improve your application’s performance by forcing it to redraw the two small regions. The way to do this varies according to the situation. One thing you can try is changing how often each small region gets redrawn (for example, by alternating which region redraws during successive drawing cycles).

Instead of redrawing a given region every time, you may want to check it to see if it has changed and then redraw it only when necessary.

Do not waste CPU resources by redrawing an image more often than is needed. A refresh rate of 20 times per second is usually sufficient, and you may find even lower refresh rates acceptable.

If you have optimized your application as much as you can and still are not meeting your performance goals, you can try various performance tricks to make your drawing more responsive. All such tricks involve simplifying the drawing process in ways that the user does not notice, or notices but still finds acceptable. For example, when necessary, you may redraw your window less frequently. In the case of the live resizing of your application’s window, you may decide that it is acceptable to not redraw the window contents while the user is dragging the grow box, but rather display a cached image during the drag operation and redraw the window contents only when the user releases the mouse button.

Optimizing Text Drawing Operations

Metrics

The metric to be minimized is the time taken to draw text. This number should be minimized as part of the responsiveness and drawing optimization steps.
Tools

Use the

Game Plan

Use the data produced by

Tips

To prepare your application for global markets, you should be using Unicode for all text storage and manipulation. However, be aware that the Apple-supplied Unicode text-manipulation APIs have different performance characteristics from Apple’s legacy ASCII text-manipulation APIs, and that you need to be aware of these differences to maximize your application’s text-drawing performance. For example, text-layout operations are “expensive,” with the consequence that you should use layout and style objects wisely. In particular, you can reuse a single layout object for multiple paragraphs simply by changing the text it points to. Where appropriate, you can cache and reuse style objects instead of recreating them. Also, when you need to measure the width of the text string, in most cases you can use the