
Checking for Excessive Paging and Swapping

The CPU can only reference data and execute code if the data or code are in the main memory (RAM). Because the CPU executes multiple processes, there may not be enough memory for all the processes. If you have very large programs, they may require more memory than is physically present in the system. So, processes are brought into memory in pages; if there's not enough memory, the operating system frees memory by writing pages temporarily to a secondary memory area, the swap area, on a disk.

IRIX overcommits real memory, loading and starting many more processes than can fit into the available memory at one time. Each process is given its own virtual section of memory, called its address space, which is theoretically large enough to contain the entire process. However, only those pages of the address space that are currently in use are actually kept in memory. These pages are called the working set. As the process needs new pages of data or code to continue running, the needed pages are read into main memory (this is called "faulting in" pages, and each such event is a "page fault"). If a page has not been used in the recent past, the operating system moves the page out of main memory and into the swap space to make room for new pages being faulted in. Pages written out can be faulted back in later. This process is called "paging" and should not be confused with "swapping."
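To illustrate why a shortage of memory multiplies page faults, the following toy model counts faults for a sequence of page references when only a fixed number of pages can be resident. It is a sketch only: the least-recently-used eviction rule, the function name count_page_faults, and the reference sequence are assumptions made for this example and do not describe the actual IRIX paging algorithm.

```python
from collections import OrderedDict


def count_page_faults(references, frames):
    """Count page faults for a reference sequence with `frames` resident pages.

    Assumption: the least recently used page is evicted when memory is full.
    This is a simplified model, not the IRIX implementation.
    """
    resident = OrderedDict()  # keys are resident pages, oldest use first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)  # mark as most recently used
            continue
        faults += 1  # page is not in memory and must be faulted in
        if len(resident) >= frames:
            resident.popitem(last=False)  # page out the least recently used
        resident[page] = None
    return faults


# Hypothetical reference sequence: the process cycles over four pages.
refs = [1, 2, 3, 1, 2, 4, 1, 2, 3, 4]
for frames in (2, 3, 4):
    print(f"{frames} resident pages: {count_page_faults(refs, frames)} faults")
```

With this sequence, 2 resident pages give 10 faults, 3 give 6, and 4 give 4. Once the pages in active use no longer fit in memory, nearly every reference becomes a fault.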

Swapping occurs when all the pages of an inactive process are removed from memory to make room for pages belonging to active processes. The entire process is written out to the swap area on the disk and its execution effectively stops. When an inactive process becomes active again, its pages must be recovered from disk into memory before it can execute. This is called "swapping in" the process. On a personal workstation, swapping in is the familiar delay, accompanied by disk activity, between clicking the icon of an inactive application and seeing its window appear.

When IRIX is running a large number of processes, swapping and paging activity can dominate the performance of the system. You can use the sar command to detect this condition and other tools to deal with it.

Determining whether your system is overloaded with paging and swapping requires some knowledge of a baseline. You will need to run the sar command under various conditions to establish a baseline for your specific installation. For example, you could boot your system and run some baseline tests with a very limited number of processes running, then again during a period of light use, during a period of heavy networking activity, and especially when the load is high and you are experiencing poor performance. Recording the results in your system log book will help you compare these baseline measurements later.
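One way to keep baseline figures comparable is to record several readings per condition and average them. The sketch below assumes the readings were copied by hand from sar reports into a dictionary; the condition names, all of the numbers, and the summarize helper are hypothetical and serve only to show the bookkeeping.

```python
from statistics import mean

# Hypothetical readings; the values are invented for illustration only.
samples = {
    "idle after boot": {"vflt/s": [2.0, 3.0, 1.0], "freemem": [9000, 9100, 8900]},
    "light use": {"vflt/s": [10.0, 14.0, 12.0], "freemem": [5000, 5200, 4800]},
    "heavy load": {"vflt/s": [90.0, 110.0, 100.0], "freemem": [300, 500, 400]},
}


def summarize(samples):
    """Return the mean of each field for each measured condition."""
    return {
        condition: {field: mean(values) for field, values in fields.items()}
        for condition, fields in samples.items()
    }


baseline = summarize(samples)
for condition, fields in baseline.items():
    print(condition, fields)
```

Keeping the averages per condition makes it easier to tell whether a reading taken during poor performance is unusual for that kind of load.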

Table 11-5 shows indicators of excessive paging and swapping on a smaller system.

Indicators of Excessive Swapping/Paging
Important Field                                      sar Option
vflt/s (page faults: valid page not in memory)       sar -p
bswot/s (transfers from memory to disk swap area)    sar -w
bswin/s (transfers to memory)                        sar -w
%swpocc (time swap queue is occupied)                sar -q
rflt/s (page reference faults)                       sar -t
freemem (average pages for user processes)           sar -r

You can use the following sar options to determine if poor system performance is related to swap I/O or to other factors:

-u %wswp

Percent of CPU wait time attributable to swap input. This measures the time during which every active process was blocked waiting for a page to be read or written. Values of even a few percent indicate a swap I/O problem.

-p vflt/s

Frequency with which a process accessed a page that was not in memory. Compare this number between times of good and bad performance. If the onset of poor performance is associated with a sharp increase in vflt/s, swap I/O may be a problem even if %wswp is low or 0.

-r freemem

Unused memory pages. The paging daemon (vhand) recovers what it thinks are unused pages and returns them to this pool. When a process needs a fresh page, the page comes from this pool. If the pool is low or empty, IRIX often has to get a page for one process by taking a page from another process, encouraging further page faults.

-p pgswp/s

Number of read/write data pages retrieved from the swap disk space per second.

-p pgfil/s

Number of read-only code pages retrieved from the disk per second.

If the %wswp number is 0 or very low, and vflt/s does not increase with the onset of poor performance, the performance problem is not primarily due to swap I/O.
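The two checks described above can be sketched as a small decision function. This is an illustration under stated assumptions: the function name swap_io_suspected and both thresholds are hypothetical. The text says only "a few percent" and "a sharp increase", so the 2 percent limit and the threefold ratio are placeholders to be tuned against your own baseline.

```python
def swap_io_suspected(wswp_pct, vflt_now, vflt_baseline,
                      wswp_limit=2.0, vflt_ratio=3.0):
    """Return True if the readings point toward a swap I/O problem.

    wswp_pct      -- %wswp read from sar -u during poor performance
    vflt_now      -- vflt/s read from sar -p during poor performance
    vflt_baseline -- vflt/s read from sar -p during good performance
    wswp_limit, vflt_ratio -- assumed thresholds, not values from the guide
    """
    # Rule 1: even a few percent of swap wait suggests a swap I/O problem.
    if wswp_pct >= wswp_limit:
        return True
    # Rule 2: a sharp rise in vflt/s matters even when %wswp is low or 0.
    # A zero baseline is skipped here because no ratio can be formed.
    if vflt_baseline > 0 and vflt_now / vflt_baseline >= vflt_ratio:
        return True
    return False


print(swap_io_suspected(0.0, 40.0, 10.0))  # vflt/s quadrupled
print(swap_io_suspected(0.0, 12.0, 10.0))  # neither rule triggered
```

A False result does not rule out other memory problems. It means only that these two indicators do not implicate swap I/O.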

However, when swap I/O may be the cause, there are several possible actions you can take:

