home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.parallel
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!sdd.hp.com!ncr-sd!ncrcae!hubcap!fpst
- From: grout@sp90.csrd.uiuc.edu (John R. Grout)
- Subject: Re: Wanted: efficient software lock
- Message-ID: <1992Dec17.225025.3179@csrd.uiuc.edu>
- Keywords: mutual exclusion, memory models, locks
- Sender: news@csrd.uiuc.edu
- Reply-To: j-grout@uiuc.edu
- Organization: UIUC Center for Supercomputing Research and Development
- References: <1992Dec15.134502.7384@hubcap.clemson.edu> <liv2dcINN6c5@exodus.Eng.Sun.COM>
- Date: Thu, 17 Dec 92 22:50:25 GMT
- Approved: parallel@hubcap.clemson.edu
- Lines: 48
-
- agn@bovic.Eng.Sun.COM (Andreas G. Nowatzyk) writes:
- >It might be of interest to note that the above algorithm (and many similar
- >ones) assume that memory operations are sequentially consistent (SC). There
- >is a significant performance advantage for systems that provide SC memory
- >only when needed. These weaker memory models generally require ordering or
- >synchronization instructions to appear SC.
-
- >For example, if Peterson's algorithm were used on a Sparc based
- >multiprocessor, without using an atomic instruction (swap, test-and-set or
- >compare-and-swap), you would still need ordering instructions. In the current
- >version 8, this means adding a store-barrier instruction *and* a "swap" that
- >is used as a load-barrier (it is not possible to implement this particular
- >algorithm without "swap" or "ldstub" on V8 machines). Eventually, in the next
- >version of the Sparc architecture, a generalized ordering directive (membar)
- >can be used to implement Peterson's Algorithm correctly in all supported
- >memory models (TSO, PSO and RMO).
-
- When one adds performance considerations (e.g., cache utilization, cache
- coherence and memory bus traffic) to memory consistency considerations (which
- determine whether software locking schemes implemented with ordinary
- instructions _can_ work), I believe that all software locking schemes should
- be comparatively inefficient even with the use of a "membar" instruction...
- both getting a word atomically on a system which was the last to update it and
- spinning on an unavailable word should be better optimized in atomic
- operations than in an implementation of Peterson's algorithm (at least in any
- chip I would want to build a shared-memory MP with).
-
- Also, in a non-atomic memory system with instructions like "membar", atomic
- instructions may be weakened so that they only guarantee the consistency of
- synchronization variables... one might use one technique (e.g., atomic
- instructions, Peterson's algorithm with "membar") to control a critical region
- _and_ an instruction like "membar" to guarantee the memory consistency of
- non-synchronization variables used within it. In an operating system
- environment, one frequently wants to update a shared variable (e.g., a
- counter) without entering a critical region... so if one had an atomic
- operation to attempt this without synchronizing other memory operations, it
- would gain.
-
- I don't know how much consistency the existing atomic instructions will
- provide on a Sparc once "membar" is available... I would guess that, due to
- compatibility issues, what they do now would be preserved... maybe Sparc will
- implement weaker versions of atomic operations?
-
- --
- John R. Grout j-grout@uiuc.edu
- University of Illinois, Urbana-Champaign
- Center for Supercomputing Research and Development
-
-