home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.lang.c++
- Path: sparky!uunet!decwrl!parc!boehm
- From: boehm@parc.xerox.com (Hans Boehm)
- Subject: Re: Garbage Collection for C++
- Message-ID: <boehm.714326151@siria>
- Sender: news@parc.xerox.com
- Organization: Xerox PARC
- References: <1992Aug6.014619.2111@ucc.su.OZ.AU> <DAVEG.92Aug13025629@synaptx.synaptics.com> <TMB.92Aug16164940@arolla.idiap.ch> <1992Aug18.021453.24394@news.mentorg.com> <DAVEG.92Aug17224359@synaptx.synaptics.com> <boehm.714158406@siria> <DAVEG.92Aug20022559@synaptx.synaptics.com>
- Date: 20 Aug 92 15:55:51 GMT
- Lines: 62
-
- daveg@synaptics.com (Dave Gillespie) writes:
- >Do you really think incremental or generational collectors are
- >feasible in C++? I've never looked in the guts of one of these
- >collectors, but as I understand it they tend to depend on the
- >ability to move data around, and other things that a C++ collector
- >might not be able to get away with. (C++'s use of pointers is so
- >unrestricted that it's a minor miracle it's possible to do garbage
- >collection at all...)
-
- Traditionally these techniques have been used with copying collectors.
- (My impression is the issue of copying collection and C++ hasn't been
- resolved yet. Bartlett's mostly copying collector for C++ is one
- possible solution. Another is to use special compilers that insist on
- safety. I personally don't think either of these is optimal in the
- short term, but ...)
-
- There is a very easy way to build a two generation mark-sweep collector.
- The most recent generation consists of those objects that were allocated
- since the last collection. (This is known to be suboptimal, but sometimes
- good enough.) To collect the most recent generation, you leave the mark
- bits set from the previous collection. You then mark from roots (stack etc.)
- and from already marked objects that were potentially modified.
-
- This gets you most of the pause time and locality advantages of copying
- generational collection. It doesn't usually get you much of a cpu time
- improvement, and it may actually cost you a little in cpu time. The problem
- is that you can't easily confine the most recent generation to a mostly
- empty section of the heap. But for moderate allocation rates, pause times
- and locality tend to be most of the issue.
-
- The glitch is that you need to be able to identify potentially modified
- objects. Under some versions of UNIX, you can coerce this information
- out of the OS (with page level granularity, which is usually good enough).
- Under many other systems, it's very hard to get a hold of. PCR currently
- does this under SunOS 4.X, but it works only because PCR is in a position
- to intercept all system calls. It's supposed to be much easier under
- Solaris 2.0.
-
- If you can't get the information out of the VM system, you need to modify
- the compiler so that it explicitly tracks pointer writes into the heap
- (at a minimum). Note that these are exactly the same constraints as for
- copying generational collectors.
-
- Once you have this facility, it's also possible to build incremental or
- parallel noncopying collectors. (In fact nonmoving parallel collectors
- are easier than copying parallel collectors, in my opinion. You don't
- have to worry about the collector munging the clients data structures
- while it's looking at them. You still need to worry about the converse
- problem.) The details are in
-
- Boehm, Demers, and Shenker,
- ``Mostly Parallel Garbage Collection'',
- PLDI '91, June 1991 SIGPLAN Notices.
-
- In my experience, you don't really want to bother with any of this
- unless your heaps get reasonably large relative to your CPU speed.
- PCR does use all of the above techniques, since it supports
- the Cedar programming environment, which usually has more
- than 10 MB live data, and sometimes much more than that.
-
- Hans-J. Boehm
-
-