From: terry@ithaca.Eng.Sandy.Novell.COM (Terry Lambert)
Newsgroups: comp.unix.programmer,comp.unix.internals
Subject: Re: SVR4 semaphore overhead???
Message-ID: <1992Aug14.222159.12495@gateway.novell.com>
Date: 14 Aug 92 22:21:59 GMT
References: <1992Aug8.131020.17966@ttsi.lonestar.org> <1992Aug13.020203.7490@ttsi.lonestar.org>
Organization: Novell NPD -- Sandy, UT

In article <1992Aug13.020203.7490@ttsi.lonestar.org> mse@ttsi.lonestar.org (Mark Evans) writes:
>In article <1992Aug8.131020.17966@ttsi.lonestar.org> I wrote:
>>I have seen postings from time to time regarding the overhead of
>>semaphores in SVR4, but not many followups. I'm now in the boat
>>of considering the usage of semaphores for mutual exclusion and
>>resource coordination between a substantial number (~30) of
>>processes. There is a potential for a large number of semaphores
>>(60-90) to get allocated.
>>
>>Someone recently posted an observation that as the number of
>>semaphores in a system grows, so does the overhead. Can anyone
>>confirm this?
>
>On a Tandem Integrity R3000 machine running SVR4, I found no
>degradation in performance on a given semaphore as the number of
>semaphores in a set increased from 1 to 25 (the current sysgen'd max).
>I am chagrined, however, at the 70 microseconds or so taken to
>do a simple semop() lock, as compared to the .82 microseconds it
>takes to call _test_and_set(), a MIPS-supplied routine in
>libc.a which performs an indivisible test-and-set (actually, it's a
>read-modify-write). As suggested in previous articles and by
>respondents to my original post, I am cooking up a
>lightweight semaphore package using _test_and_set() for the
>lock, and system semaphores (or message queues, or pause/signal)
>for sleep/wakeup when contention arises.
>
>A couple of questions have come up that require expert knowledge:
>
>(1) Is there a way for a process to reschedule itself, i.e., go to
>the end of the current ready-to-run list? I don't want to resort
>to a sleep_ms() call if I can avoid it.

Under SVR4, there is no way to do this (I will assume that you
are talking about SVR4 due to the Subject: line), because the scheduler
code has been intentionally separated. You would have to be able to
determine the order in which something is removed from the ready-to-run
queue and placed on the run queue, and you can't do that contextually.

The Tandem _test_and_set() uses a hardware mechanism (precisely
like the one available on VAXen), and as such is not relative to your
lbolt value, but is instead relative to your system clock. Semaphores
and other consumers of CPU quanta are in terms of lbolt.

>(2) When multiple processes contend for a system semaphore, are
>they awarded it in FIFO order as the semaphore becomes available?
>I have gotten the impression from previous testing that this is not so.
>Can anyone explain the data structures and algorithm used behind the
>scenes?

No, they are not. A wakeup is issued on the address, which moves all
the processes sleeping on it to the ready-to-run queue. The scheduler
then picks one to run at the next timer interrupt. This may not be
the process that has been waiting longest (although the longer a process
waits, the higher its priority in the queue), and so, yes, you
can get starvation. So what you believe is correct: semaphore requests
are not queued.

Now, when the processes awaken, each attempts to grab the semaphore; the
first one to run returns to user land; the rest go back to sleep. This is
called "the thundering herd" problem, and has been at least partially
resolved in Sequent's OS, but a fat lot of good that'll do you.

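For concreteness, here is a rough sketch in C of the sort of hybrid
lock Mark describes above: _test_and_set() on the fast path, with a
System V semaphore (initialized to 0) used only as a sleep/wakeup
channel under contention. The struct and names are mine, the
_test_and_set() signature is from memory (check your man page), and
the waiter accounting shown has exactly the races he is wrestling
with -- an illustration, not a solution:

/*
 * Hybrid user-level lock: spin once on the atomic op, then fall
 * back to a system semaphore purely as a sleep channel.
 */
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

extern int _test_and_set(int *addr, int value); /* MIPS libc; see manual */

struct lwlock {
        int     lock;           /* 0 = free, 1 = held                   */
        int     waiters;        /* advisory count of sleepers (racy!)   */
        int     semid;          /* SysV semaphore used only for sleep   */
};

void
lw_acquire(struct lwlock *lp)
{
        struct sembuf op;

        for (;;) {
                if (_test_and_set(&lp->lock, 1) == 0)
                        return;                 /* fast path: no syscall */

                /* Contention: sleep on the system semaphore. */
                lp->waiters++;                  /* racy; see text        */
                op.sem_num = 0;
                op.sem_op = -1;                 /* block until a post    */
                op.sem_flg = 0;
                (void) semop(lp->semid, &op, 1);
                lp->waiters--;                  /* racy; see text        */
        }
}

void
lw_release(struct lwlock *lp)
{
        struct sembuf op;

        lp->lock = 0;                           /* drop the lock         */
        if (lp->waiters > 0) {
                op.sem_num = 0;
                op.sem_op = 1;                  /* let one sleeper retry */
                op.sem_flg = 0;
                (void) semop(lp->semid, &op, 1);
        }
}
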
>Writing this simple lock/unlock package at the user level without
>resorting to priority controls is surprisingly mind-boggling if you
>consider all the races. I'm on my fourth "I've got it" approach.
>When I really do "have it," I'll happily share the results with the
>many who have expressed interest in the subject. If you think
>you have a solution, I know many out here would love to see it.

One solution is the creation of a system call. You can do this on SVR4
even though their equivalent of init_sysent.c is pre-linked into
os.o and is thus not user-replaceable. The way you do this is to go
through the system call table looking for the nosys() function pointers,
and replace one such entry. The reason you use a driver is to give
yourself a known query point (via ioctl()) for finding out which entry
to use syscall() with, and to provide system startup initialization of
the table entry and any data structures.

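The driver-init half of that, sketched in C. The sysent[] and nosys()
names are the usual SVR4 ones, but the struct layout, header location,
and the nsysent count vary by vendor, so treat this as a sketch of the
technique rather than code for your kernel:

/*
 * Find an unused system call slot (one still pointing at the
 * nosys() stub) and steal it for our own handler.
 */
#include <sys/types.h>
#include <sys/systm.h>          /* struct sysent; location varies */

extern struct sysent sysent[];  /* the system call table          */
extern int nsysent;             /* number of entries              */
extern int nosys();             /* stub in unused slots           */

extern int lws_syscall();       /* our handler, defined elsewhere */

static int lws_callno = -1;     /* slot we stole; reported by ioctl */

void
lwsinit(void)                   /* called at driver initialization */
{
        int i;

        for (i = 0; i < nsysent; i++) {
                if (sysent[i].sy_call == (int (*)())nosys) {
                        sysent[i].sy_call = lws_syscall;
                        lws_callno = i;
                        return;
                }
        }
        /* No free slot: lws_callno stays -1 and the ioctl reports it. */
}
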
I have done this in two different instances to provide a FIFOed semaphore
mechanism with low overhead.

Your single system call is then divided, via its first parameter, into
multiple library entry points. "lws_init()" initializes the semaphores:
it queries the driver for the system call number to use, then uses that
entry point to make a syscall() on your behalf with the number of
semaphores you will need. Another ioctl is made to tell the driver what
needs to be cleaned up on close by this process, and then the fd is left
open to allow you to resource-track the close without specific
deinitialization, which might otherwise be defeated by a core dump or
abnormal exit.

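The user-side half might look like this; /dev/lws, the ioctl numbers,
and the subfunction code are invented names, and only open(), ioctl(),
and syscall() themselves are real:

/*
 * lws_init(): open the driver, ask it which sysent slot it took,
 * then set up our semaphore table through the new system call.
 */
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>

extern int ioctl();             /* K&R-style externs for the era  */
extern int syscall();

#define LWS_GETCALL     (('L' << 8) | 1)  /* invented: which slot?   */
#define LWS_TRACK       (('L' << 8) | 2)  /* invented: track this pid */
#define LWS_INIT        0                 /* syscall subfunction code */

static int lws_fd = -1;         /* left open for resource tracking */
static int lws_callno = -1;     /* sysent slot the driver grabbed  */

int
lws_init(int nsems)
{
        if ((lws_fd = open("/dev/lws", O_RDWR)) < 0)
                return (-1);

        /* Ask the driver which system call entry it installed. */
        if (ioctl(lws_fd, LWS_GETCALL, &lws_callno) < 0)
                return (-1);

        /* Allocate our semaphore table via the new system call. */
        if (syscall(lws_callno, LWS_INIT, nsems) < 0)
                return (-1);

        /*
         * Tell the driver to clean up after us on last close; the fd
         * stays open so a core dump or abnormal exit still triggers it.
         */
        return (ioctl(lws_fd, LWS_TRACK, getpid()));
}
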
A call to "lws_create()" returns the index of an entry in your table of
semaphores allocated during the init, or -1 on failure.

A call to "lws_release()" returns the index to the unallocated pool.

A call to "lws_acquire()" acquires the given index. This will sleep on
the address of the semaphore requested (i.e., &my_sems[idx]) in the
kernel if the semaphore is currently held. The pid of the holder is
stored in the semaphore struct as well, so that resource-track cleanup
of the process can wake up sleepers and the semaphore in question is
not left deadlocked.

A call to "lws_free()" frees the given index, and does a wakeup on the
associated element's address so that the next requestor can proceed.

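In C, the kernel half of acquire/free might look something like this
(the struct and the lws_k_* names are mine; sleep()/wakeup() on the
semaphore's address are the traditional primitives the description
above assumes):

/*
 * Kernel side of acquire/free.  Sleeping at a priority below PZERO
 * keeps the sleep unsignalable, which keeps the example simple.
 */
#include <sys/types.h>
#include <sys/param.h>          /* PZERO */

extern int sleep();             /* kernel sleep/wakeup primitives */
extern void wakeup();

struct lws_sem {
        int     ls_busy;        /* 0 = free, 1 = held             */
        pid_t   ls_owner;       /* for resource-track cleanup     */
};

extern struct lws_sem *my_sems; /* table allocated at lws_init()  */

int
lws_k_acquire(int idx, pid_t pid)
{
        struct lws_sem *sp = &my_sems[idx];

        while (sp->ls_busy)
                (void) sleep((caddr_t)sp, PZERO - 1);  /* herd wakes here */
        sp->ls_busy = 1;
        sp->ls_owner = pid;     /* cleanup can now find the holder */
        return (0);
}

int
lws_k_free(int idx)
{
        struct lws_sem *sp = &my_sems[idx];

        sp->ls_busy = 0;
        sp->ls_owner = 0;
        wakeup((caddr_t)sp);    /* wakes *all* sleepers: the herd  */
        return (0);
}
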
The calling order is:

    lws_init()

    [Any number of:]
        lws_create()

        [Any number of:]
            lws_acquire()
            lws_free()

        lws_release()

    [Resource-track cleanup by close of fd on process exit]

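The same lifecycle as a fragment of C (error handling omitted; the
lws_* entry points are the library calls described above):

extern int lws_init(int nsems);
extern int lws_create(void);
extern int lws_release(int idx);
extern int lws_acquire(int idx);
extern int lws_free(int idx);

void
example(void)
{
        int idx;

        (void) lws_init(60);            /* once, at startup            */

        idx = lws_create();             /* grab a slot from the pool   */

        (void) lws_acquire(idx);
        /* ... critical section ... */
        (void) lws_free(idx);

        (void) lws_release(idx);        /* slot back to the pool       */

        /* On exit, close of the fd resource-tracks anything left over. */
}
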
FIFO order can be enforced in one of two ways. The first is to ignore
"the thundering herd" problem and keep an outstanding count of requests
per semaphore: each attempted acquirer copies this value on its initial
attempt and decrements its copy each time it wakes up. If its copy is
non-zero, it goes back to sleep; if it is zero, it owns the semaphore
and the call returns to the requestor. This relies on kernel
serialization of system calls, and so would require a mutex on AIX or
Solaris 2.0. This has the effect of guaranteeing FIFO order.

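Here is one reading of that counting scheme, sketched kernel side; the
names are mine, and it leans entirely on the kernel serializing these
paths, per the caveat above:

/*
 * The outstanding count doubles as a "ticket" saying how many
 * frees must go by before it is your turn.
 */
#include <sys/types.h>
#include <sys/param.h>

extern int sleep();
extern void wakeup();

struct lws_fsem {
        int     ls_outstanding; /* holder + waiters ahead of you  */
};

int
lws_k_acquire_fifo(struct lws_fsem *sp)
{
        int ahead = sp->ls_outstanding++;  /* copy on initial attempt */

        /*
         * Every lws_k_free_fifo() wakes everyone; each sleeper counts
         * one wakeup off its private copy, and the one that reaches
         * zero owns the semaphore.  FIFO falls out of the ticket
         * order, not out of who the scheduler happens to run first.
         */
        while (ahead-- > 0)
                (void) sleep((caddr_t)sp, PZERO - 1);
        return (0);
}

int
lws_k_free_fifo(struct lws_fsem *sp)
{
        sp->ls_outstanding--;
        wakeup((caddr_t)sp);    /* thundering herd, deliberately ignored */
        return (0);
}
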
The problem of waking up only the one who is next in line for the
semaphore can be resolved by allocating more kernel memory for "queued
request elements" and sleeping on the address of your own queue element
rather than on the semaphore list entry (which becomes a queue head
pointer). Release removes the releaser's element from the queue and
wakes up the next entry. This has the problem of requiring kernel
memory allocation/deallocation on request/release (unless such memory
is statically allocated), and it complicates both the release and the
resource-track cleanup by requiring an additional pointer and a linked
list over all queue entries for the necessary reverse lookup.

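Sketched the same way (structs and names mine, elements statically
allocated per the parenthetical above; an MP kernel would also need a
lock around the list manipulation):

/*
 * Queued-element variant: each requestor sleeps on the address of
 * its own queue element, so a free wakes exactly one process.
 */
#include <sys/types.h>
#include <sys/param.h>

extern int sleep();
extern void wakeup();

struct lws_qel {
        struct lws_qel  *q_next;
        pid_t           q_pid;          /* reverse lookup for cleanup */
};

struct lws_qsem {
        struct lws_qel  *q_head;        /* current holder             */
        struct lws_qel  *q_tail;
};

int
lws_k_qacquire(struct lws_qsem *sp, struct lws_qel *el, pid_t pid)
{
        el->q_next = NULL;
        el->q_pid = pid;

        if (sp->q_tail == NULL)         /* uncontended: we hold it now */
                sp->q_head = el;
        else                            /* join the end of the line    */
                sp->q_tail->q_next = el;
        sp->q_tail = el;

        while (sp->q_head != el)        /* sleep on OUR element only   */
                (void) sleep((caddr_t)el, PZERO - 1);
        return (0);
}

int
lws_k_qfree(struct lws_qsem *sp)
{
        struct lws_qel *el = sp->q_head;        /* the releaser        */

        if ((sp->q_head = el->q_next) == NULL)
                sp->q_tail = NULL;
        else
                wakeup((caddr_t)sp->q_head);    /* wake only the next  */
        return (0);
}
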
Terry Lambert
terry_lambert@gateway.novell.com
terry@icarus.weber.edu

---
Disclaimer: Any opinions in this posting are my own and not those of
my present or previous employers.