- Path: sparky!uunet!usc!news.service.uci.edu!network.ucsd.edu!news!nosc!crash!cmkrnl!jeh
- From: jeh@cmkrnl.com
- Newsgroups: comp.os.vms
- Subject: Re: TUNING - QUANTUM SYSGEM parameter
- Message-ID: <1992Nov20.120509.896@cmkrnl.com>
- Date: 20 Nov 92 12:05:09 PST
- References: <27A0322A_00174E60.00963CCA9D4FA020$626_2@UK.AC.KCL.PH.IPG> <1992Nov19.134754.25572@ncsa.uiuc.edu>
- Organization: Kernel Mode Consulting, San Diego, CA
- Lines: 82
-
- In article <1992Nov19.134754.25572@ncsa.uiuc.edu>,
- jsue@ncsa.uiuc.edu (Jeffrey L. Sue) writes:
- > I have been to courses, and many DECUS sessions where Patrick O'Malley
- > discusses tuning quantum. What should be kept in mind when reducing
- > QUANTUM is that while it may seem to improve *trivial* performance, it reduces
- > "throughput". This means that in the same amount of wall-clock time,
- > you get less "total" work completed. This is a trade-off.
- >
- > Other factors to consider are: whenever a context switch takes place,
- > the on-cpu instruction cache is invalidated. Thus the more often you
- > context switch (smaller quantum), the more often the cache will become
- > invalidated. Therefore your job will get less done because it must rebuild
- > the CPU's instruction cache every time it becomes the CUR process, and it
- > is doing this more often. (note: the CPU cache is a hardware cache, not
- > one of the XQP (ACP_*) caches - which are for the file system.)
-
- A process context switch also invalidates the per-process half of the
- translation buffer (a cache for page table entries). And, of course,
- the regular old memory cache, while not invalidated by a context switch,
- will suffer a short-term reduction of hit rate after each context switch
- (since the new process typically won't be referring to very many of the
- same physical addresses as the old one did).
-
- However...
-
- This "context switches are expensive, so you'd better not reduce quantum by a
- lot" theory has been around since the 780 days. I've never seen any hard
- data one way or another, so I decided to run some tests, as part of the
- research for my "VMS Scheduling Myths" talk at DECUS.
-
- The test used a small compute-bound program designed to take advantage of
- the memory cache and translation buffer. (In fact, it reliably shows the size
- of the translation buffer, and I adjusted it to stay within the translation
- buffer size of each machine that I tested.)
-
- This image was run in two separate processes at the same base priority. The
- program reports its throughput (measured in array references per wall-clock
- tick) every few seconds. Priority adjustment was disabled by patching all of
- the boost values to 0.
-
- Result: There *must* be a throughput penalty for a context switch, but it's
- so small that it can't be measured, even on a MicroVAX II. Tests on a
- MicroVAX 3600, VAXstation 3100/76, VAXstation 4000/VLC, VAXstation 4000
- Model 60, and VAXserver 3100 Model 80 were equally inconclusive.
-
- I discussed these results with Pat O'Malley. We decided that what must be
- happening is that it just doesn't take very long to reload the instruction
- cache, translation buffer, and RAM cache with stuff that's relevant to the
- new process.
-
- In my copious free time :-), I am investigating replacing the interval timer
- ("hardware clock") interrupts on a MicroVAX 3600 with interrupts from a
- DRV11-B or similar general-purpose interface. This will allow adjustment of
- the intervals between clock ticks to values much smaller than 0.01 second, so
- that I can see what happens when quantum is set to 0.001 seconds, or even
- shorter. The goal is to find the point where the context switch overhead DOES
- start to noticeably affect throughput.
-
- > I have played with quantum and found that on the smaller processors it
- > made a big difference in the performance of "trivial" response, like for
- > MAIL, DIRECTORY, DELETE, TYPE, etc..
-
- Note that this effect will only be seen if you have more than one process
- competing for the CPU at the same normal priority level (0-15).
-
- > However, on the faster processors
- > today, rarely do these activities even use a quantum before they give up
- > the cpu voluntarily for a wait-state (I/O completion, LEF, etc.). Also,
- > in the cases where I lowered quantum, the longer-running operations tended
- > to take more time -
-
- No offense, but I would be interested in hearing about some quantitative
- measurements of programs with known compute-intensive,
- memory-reference-intensive behavior, rather than your "feel" of how the system
- behaved.
-
- --- Jamie Hanrahan, Kernel Mode Consulting, San Diego CA
- drivers, internals, networks, applications, and training for VMS and Windows-NT
- uucp 'g' protocol guru and release coordinator, VMSnet (DECUS uucp) W.G., and
- Chair, VMS Programming and Internals Working Group, U.S. DECUS VAX Systems SIG
- Internet: jeh@cmkrnl.com, hanrahan@eisner.decus.org, or jeh@crash.cts.com
- Uucp: ...{crash,eisner,uunet}!cmkrnl!jeh
-