NetNews Usenet Archive 1992 #30

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #30 / NN_1992_30.iso / spool / comp / parallel / 2768 < prev next >

Wrap

Text File | 1992-12-21 | 3.3 KB | 63 lines

Newsgroups: comp.parallel Path: sparky!uunet!zaphod.mps.ohio-state.edu!sdd.hp.com!ncr-sd!ncrcae!hubcap!fpst From: grout@sp90.csrd.uiuc.edu (John R. Grout) Subject: Re: Wanted: efficient software lock Message-ID: <1992Dec17.225025.3179@csrd.uiuc.edu> Keywords: mutual exclusion, memory models, locks Sender: news@csrd.uiuc.edu Reply-To: j-grout@uiuc.edu Organization: UIUC Center for Supercomputing Research and Development References: <1992Dec15.134502.7384@hubcap.clemson.edu> <liv2dcINN6c5@exodus.Eng.Sun.COM> Date: Thu, 17 Dec 92 22:50:25 GMT Approved: parallel@hubcap.clemson.edu Lines: 48 agn@bovic.Eng.Sun.COM (Andreas G. Nowatzyk) writes: >It might be of interest to note that the above algorithm (and many similar >ones) assume that memory operations are sequentially consistent (SC). There >is a significant performance advantage for systems that provide SC memory >only when needed. These weaker memory models generally require ordering or >synchronization instructions to appear SC. >For example, if Peterson's algorithm were used on a Sparc based >multiprocessor, without using an atomic instruction (swap, test-and-set or >compare-and-swap), you would still need ordering instructions. In the current >version 8, this means adding a store-barrier instruction *and* a "swap" that >is used as a load-barrier (it is not possible to implement this particular >algorithm without "swap" or "ldstub" on V8 machines). Eventually, in the next >version of the Sparc architecture, a generalized ordering directive (membar) >can be used to implement Peterson's Algorithm correctly in all supported >memory models (TSO, PSO and RMO). When one adds performance considerations (e.g., cache utilization, cache coherence and memory bus traffic) to memory consistency considerations (which determine whether software locking schemes implemented with ordinary instructions _can_ work), I believe that all software locking schemes should be comparatively inefficient even with the use of a "membar" instruction... both getting a word atomically on a system which was the last to update it and spinning on an unavailable word should be better optimized in atomic operations than in an implementation of Peterson's algorithm (at least in any chip I would want to build a shared-memory MP with). Also, in a non-atomic memory system with instructions like "membar", atomic instructions may be weakened so that they only guarantee the consistency of synchronization variables... one might use one technique (e.g., atomic instructions, Peterson's algorithm with "membar") to control a critical region _and_ an instruction like "membar" to guarantee the memory consistency of non-synchronization variables used within it. In an operating system environment, one frequently wants to update a shared variable (e.g., a counter) without entering a critical region... so if one had an atomic operation to attempt this without synchronizing other memory operations, it would gain. I don't know how much consistency the existing atomic instructions will provide on a Sparc once "membar" is available... I would guess that, due to compatibility issues, what they do now would be preserved... maybe Sparc will implement weaker versions of atomic operations? -- John R. Grout j-grout@uiuc.edu University of Illinois, Urbana-Champaign Center for Supercomputing Research and Development