
Looking At/Reordering an Application

Many applications have routines that are executed over and over again. You can optimize program performance by modifying these heavily used routines in the source code. The following sections describe tools that help you tune your programs.


Analyzing Program Behavior with prof

Profiling allows you to monitor program behavior during execution and determine the amount of time spent in each of the routines in the program. There are two types of profiling:

PC sampling is a statistical method that interrupts the program frequently and records the value of the program counter at each interrupt.

Basic block counting, on the other hand, uses the pixie(1) utility to modify the program module, inserting code at the beginning of each basic block (a sequence of instructions containing no branch instructions) that counts the number of times the block is entered.

Both types of profiling are useful. The primary difference is that basic block counting is deterministic and PC sampling is statistical.

To do PC sampling, compile the program with the -p option. When the resulting program is executed, it generates output files containing the PC sampling information, which can then be analyzed with the prof(1) utility. Neither prof nor pixie is shipped with the basic IRIX distribution; both are found in the optional IRIX Development Option software distribution.

To do basic block counting, compile the program and then execute pixie on it to produce a new binary file that contains the extra instructions to do the counting. When the resulting program is executed, it will produce output files that are then used with prof to generate reports of the number of cycles consumed by each basic block. You can then use the output of prof to analyze the behavior of the program and optimize the algorithms that consume the majority of the program's time. Refer to the cc(1), f77(1), pixie(1), and prof(1) reference pages for more information about the mechanics of profiling.
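The two profiling workflows described above can be sketched as a small shell script. This is an illustration under stated assumptions, not a definitive procedure: the program name fetch, the source file fetch.c, and the PC sampling output file name mon.out are assumptions, and the run helper is hypothetical. By default the script only prints the commands, so it can be read safely on a system that does not have the tools installed.

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 (the default) each command is printed,
# not executed. Set DRY_RUN=0 on a system that has cc, pixie, and prof.
DRY_RUN=${DRY_RUN:-1}

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" -eq 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# PC sampling: compile with -p, run the program, analyze with prof.
# mon.out is an assumed name for the sampling output file.
pc_sampling() {
    prog=$1
    run cc -p -o "$prog" "$prog.c"
    run "./$prog"
    run prof "$prog" mon.out
}

# Basic block counting: compile normally, instrument with pixie,
# run the instrumented binary, analyze the counts with prof.
basic_block_counting() {
    prog=$1
    run cc -o "$prog" "$prog.c"
    run pixie "$prog"
    run "./$prog.pixie"
    run prof -pixie "$prog" "$prog.Addrs" "$prog.Counts"
}

pc_sampling fetch
basic_block_counting fetch
```

Check the cc(1), pixie(1), and prof(1) reference pages on your system for the exact file names and options before running the commands for real.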


Reordering a Program with pixie

User program text is demand-loaded a page (currently 4K) at a time. Thus, when a reference is made to an instruction that is not currently in memory and mapped to the user's address space, the encompassing page of instructions is read into memory and then mapped into the user's address space. If often-used subroutines in a loaded program are mixed with seldom-used routines, the program could require more of the system's memory resources than if the routines were loaded in the order of likely use. This is because the seldom-used routines might be brought into memory as part of a page of instructions from another routine.
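To make the effect of routine placement concrete, the following sketch counts how many 4K text pages the often-used routines touch under two layouts. The routine sizes and the pages_touched function are hypothetical and chosen only for illustration; a real program's layout depends on the compiler and loader.

```shell
#!/bin/sh
# Hypothetical illustration: count the 4K text pages touched by hot
# (often-used) routines. Each argument is "size_in_bytes:hot", where
# hot is 1 for an often-used routine and 0 for a seldom-used one.
# Routines are assumed to be laid out back to back in argument order.
PAGE_SIZE=4096

pages_touched() {
    offset=0
    last_page=-1
    count=0
    for entry in "$@"; do
        size=${entry%%:*}
        hot=${entry##*:}
        if [ "$hot" -eq 1 ] && [ "$size" -gt 0 ]; then
            first=$(( offset / PAGE_SIZE ))
            last=$(( (offset + size - 1) / PAGE_SIZE ))
            # Do not count a page already counted for an earlier hot routine.
            if [ "$first" -le "$last_page" ]; then
                first=$(( last_page + 1 ))
            fi
            if [ "$last" -ge "$first" ]; then
                count=$(( count + last - first + 1 ))
                last_page=$last
            fi
        fi
        offset=$(( offset + size ))
    done
    echo "$count"
}

# Mixed layout: hot routines separated by cold ones.
pages_touched 1000:1 6000:0 1000:1 6000:0 1000:1   # prints 3

# Reordered layout: the same routines with the hot ones first.
pages_touched 1000:1 1000:1 1000:1 6000:0 6000:0   # prints 1
```

In this made-up example, the three hot routines occupy three separate pages when mixed with cold routines, but share a single page once they are placed together.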

Tools are available to analyze the execution history of a program and rearrange the program so that the routines are loaded in most-used order (according to the recorded execution history). These tools include pixie, prof, and cc. By using these tools, you can maximize the cache hit ratio (checked by running sar -b) or minimize paging (checked by running sar -p), and effectively reduce a program's execution time. The following steps illustrate how to reorganize a program named fetch:

  1. Execute the pixie command, which will add profiling code to fetch:

    pixie fetch

    This creates an output file, fetch.pixie, and a file that contains basic block addresses, fetch.Addrs.

  2. Run fetch.pixie (created in the previous step) on a normal set or sets of data. This creates the file named fetch.Counts, which contains the basic block counts.

  3. Next, create a feedback file that the compiler will pass to the loader. Do this by executing prof:

    prof -pixie -feedback fbfile fetch fetch.Addrs fetch.Counts

    This produces a feedback file named fbfile.

  4. Compile the program with the original flags and options, and add the following option with the feedback file as its argument:

    -feedback fbfile
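The four steps above can be collected into one command sequence. The sketch below uses a hypothetical reorder_plan function that only prints the commands for a given program. The compile arguments in the example call (-O -o fetch fetch.c) stand in for "the original flags and options" and are assumptions, as is running the instrumented binary with no arguments.

```shell
#!/bin/sh
# Hypothetical helper: print the reordering commands for a program.
# Usage: reorder_plan PROGRAM ORIGINAL_COMPILE_ARGUMENTS...
reorder_plan() {
    prog=$1
    shift
    # Step 1: add profiling code; creates PROGRAM.pixie and PROGRAM.Addrs.
    echo "pixie $prog"
    # Step 2: run the instrumented binary on normal data; creates PROGRAM.Counts.
    echo "./$prog.pixie"
    # Step 3: create the feedback file named fbfile.
    echo "prof -pixie -feedback fbfile $prog $prog.Addrs $prog.Counts"
    # Step 4: recompile with the original arguments plus the feedback option.
    echo "cc $* -feedback fbfile"
}

# Example call; the compile arguments here are assumed, not prescribed.
reorder_plan fetch -O -o fetch fetch.c
```

On a system with the tools installed, the printed plan can be reviewed and then piped to sh to execute it. Step 2 should be adjusted so that the instrumented program runs on a representative set of data.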

For more information, see the prof and pixie reference pages.

