- Path: sparky!uunet!usc!news.service.uci.edu!network.ucsd.edu!news!nosc!crash!cmkrnl!jeh
- From: jeh@cmkrnl.com
- Newsgroups: comp.os.vms
- Subject: Re: TUNING - QUANTUM SYSGEM parameter
- Message-ID: <1992Nov20.120509.896@cmkrnl.com>
- Date: 20 Nov 92 12:05:09 PST
- References: <27A0322A_00174E60.00963CCA9D4FA020$626_2@UK.AC.KCL.PH.IPG> <1992Nov19.134754.25572@ncsa.uiuc.edu>
- Organization: Kernel Mode Consulting, San Diego, CA
- Lines: 82
-
- In article <1992Nov19.134754.25572@ncsa.uiuc.edu>,
- jsue@ncsa.uiuc.edu (Jeffrey L. Sue) writes:
- > I have been to courses, and many DECUS sessions where Patrick O'Malley
- > discusses tuning quantum. What should be kept in mind when reducing
- > QUANTUM is that while it may seem to improve *trivial* performance, it reduces
- > "throughput". This means that in the same amount of wall-clock time,
- > you get less "total" work completed. This is a trade-off.
- >
- > Other factors to consider are: whenever a context switch takes place,
- > the on-cpu instruction cache is invalidated. Thus the more often you
- > context switch (smaller quantum), the more often the cache will become
- > invalidated. Therefore your job will get less done because it must rebuild
- > the CPU's instruction cache every time it becomes the CUR process, and it
- > is doing this more often. (note: the CPU cache is a hardware cache, not
- > one of the XQP (ACP_*) caches - which are for the file system.)
-
- A process context switch also invalidates the per-process half of the
- translation buffer (a cache for page table entries). And, of course,
- the regular old memory cache, while not invalidated by a context switch,
- will suffer a short-term reduction of hit rate after each context switch
- (since the new process typically won't be referring to very many of the
- same physical addresses as the old one did).
-
- However...
-
- This "context switches are expensive, so you'd better not reduce quantum by a
- lot" theory has been around since the 780 days. I've never seen any hard
- data one way or another, so I decided to run some tests, as part of the
- research for my "VMS Scheduling Myths" talk at DECUS.
-
- The test used a small compute-bound program designed to take advantage of
- the memory cache and translation buffer. (In fact, it reliably shows the size
- of the translation buffer, and I adjusted it to stay within the translation
- buffer size of each machine that I tested.)
-
- This image was run in two separate processes at the same base priority. The
- program reports its throughput (measured in array references per wall-clock
- tick) every few seconds. Priority adjustment was disabled by patching all of
- the boost values to 0.
-
- Result: There *must* be a throughput penalty for a context switch, but it's
- so small that it can't be measured, even on a MicroVAX II. Tests on a
- MicroVAX 3600, VAXstation 3100/76, VAXstation 4000/VLC, VAXstation 4000
- Model 60, and VAXserver 3100 Model 80 were equally inconclusive.
-
- I discussed these results with Pat O'Malley. We decided that what must be
- happening is that it just doesn't take very long to reload the instruction
- cache, translation buffer, and RAM cache with stuff that's relevant to the
- new process.
-
- In my copious free time :-), I am investigating replacing the interval timer
- ("hardware clock") interrupts on a MicroVAX 3600 with interrupts from a
- DRV11-B or similar general-purpose interface. This will allow adjustment of
- the intervals between clock ticks to values much smaller than 0.01 second, so
- that I can see what happens when quantum is set to 0.001 seconds, or even
- shorter. The goal is to find the point where the context switch overhead DOES
- start to noticeably affect throughput.
-
- > I have played with quantum and found that on the smaller processors it
- > made a big difference in the performance of "trivial" response, like for
- > MAIL, DIRECTORY, DELETE, TYPE, etc..
-
- Note that this effect will only be seen if you have more than one process
- competing for the CPU at the same normal priority level (0-15).
-
- > However, on the faster processors
- > today, rarely do these activities even use a quantum before they give up
- > the cpu voluntarily for a wait-state (I/O completion, LEF, etc.). Also,
- > in the cases where I lowered quantum, the longer-running operations tended
- > to take more time -
-
- No offense, but I would be interested in hearing about some quantitative
- measurements of programs with known compute-intensive,
- memory-reference-intensive behavior, rather than your "feel" of how the system
- behaved.
-
- --- Jamie Hanrahan, Kernel Mode Consulting, San Diego CA
- drivers, internals, networks, applications, and training for VMS and Windows-NT
- uucp 'g' protocol guru and release coordinator, VMSnet (DECUS uucp) W.G., and
- Chair, VMS Programming and Internals Working Group, U.S. DECUS VAX Systems SIG
- Internet: jeh@cmkrnl.com, hanrahan@eisner.decus.org, or jeh@crash.cts.com
- Uucp: ...{crash,eisner,uunet}!cmkrnl!jeh
-