home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.lang.c++
- Path: sparky!uunet!decwrl!parc!boehm
- From: boehm@parc.xerox.com (Hans Boehm)
- Subject: Re: Garbage Collection for C++
- Message-ID: <boehm.714847027@siria>
- Sender: news@parc.xerox.com
- Organization: Xerox PARC
- References: <DAVEG.92Aug17224359@synaptx.synaptics.com> <1992Aug20.213720.16599@microsoft.com> <boehm.714410546@siria> <1992Aug25.185458.10190@microsoft.com>
- Date: 26 Aug 92 16:37:07 GMT
- Lines: 90
-
- jimad@microsoft.com (Jim Adcock) writes:
-
- [Discussion of allocation costs for long-lived objects with
- generational collection omitted.]
-
- >My accounting is simply that the cost of keeping an object alive is
- >not the cost of either allocating or deallocating the object. Before you
- >cry "foul" please note that non-moveable-object memory management schemes
- >*also* have a cost of keeping objects alive -- namely their non-locality
- >of reference leads to costs of cache misses, page turns, and ultimately
- >thrashing on heavily loaded systems. Such costs need to be factored into
- >the cost of any memory management scheme -- but estimating such costs is
- >extremely variable depending on a given system and how that system is loaded.
- >Whereas the costs of allocating and deallocating can be more reasonably
- >estimated.
-
- (I think we've had this discussion before, but ...)
- I agree that the nonlocality costs are hard to estimate. In fact, I
- don't believe that they always favor the moving collector. At least some
- nonmoving allocators do a reasonable job of keeping consecutively
- allocated objects close together. There is some evidence that this is
- a better strategy than keeping things in breadth-first traversal order,
- as many copying collectors do.
-
- Perhaps more importantly, given the ratio of disk access times to cpu
- speeds on most modern machines, the real question often is "how long can
- you run without any significant paging?". Copying collectors tend not to
- do very well by this measure, since a full collection touches more VM than
- any nonmoving scheme is likely to. Malloc/free (with or without reference
- counting) probably wins by this criterion, with compacting Mark-sweep
- collection scoring a close second. (It wins as a result of less fragmentation,
- but needs some empty space in the heap to prevent GC thrashing.)
-
- At the level of cache performance, there is also no clear winner.
- Malloc/free has the big advantage of recycling objects sooner, thus
- making it likely that a newly allocated objects is already cached, and
- making it less likely that something else will need to be flushed as the
- result of the allocation. I would guess that this usually dominates the
- results of fragmentation etc., but I'm not sure. It's conceivable that
- you could make the youngest generation of a copying collector small enough
- to get the same effect, at least if the root set is small enough. But this
- can cause excessive promotion rates, at least for small cache sizes. I
- don't know of any existing collectors that really achieve this for say
- a 32K cache.
-
- >|Generational (esp. copying) collectors win dramatically on short-lived
- >|objects, provided you have some efficient way of tracking stores
- >|into old objects. Fortunately most objects fall into this category
- >|for most programs.
- >|
- >|(Ben Zorn has some measurements that substantiate this for some
- >|interesting C programs. It is clearly the case for all known
- >|Lisp and Smalltalk implementations.)
-
- >I've found surprisingly similar results on a limited number of C++ programs
- >I've tested.
-
- >|A generational copying collector is probably the worst possible choice if
- >|all your program allocates are large long-lived bitmaps (and indeed
- >|generational copying collectors usually treat such things as a special
- >|case).
-
- >But such cases can already be well handled in C++ using reference counting
- >schemes. Again, I see a Modula-3 like approach where classes of objects
- >are either GC or not GC -- and the GC class objects are movable. Backwards
- >compatibility is maintained because adding GC'ed classes is a pure extension.
-
- My problem with this is that the two worlds don't mix well, making it
- hard to use old libraries with new code.
-
- In the case of long-lived
- bitmaps, I would prefer to allocate the bitmap off to the side, with
- a garbage collected header object that deallocates the bitmap in its
- finalizer (or destructor, or whatever). That probably works fine. I
- was only using the example to argue against a pure copying collector
- scheme.
-
- >Moveable objects need to be supported not merely for GC, but for more
- >interesting problems of distributed networks, persistence, ODBMS, etc.
-
- Agreed. Support for movable objects is a good thing. But it also has
- a very substantial cost, in that lots of compilers and libraries need to be
- fixed. Some (nonportable but useful) programming paradigms break (e.g.
- hashing pointers no longer works. You need an extra counter field per
- allocated object if you want to be able to hash.) There is a cost (of
- one kind or another) in maintaining enough descriptor information.
-
- Hans-J. Boehm
- (boehm@parc.xerox.com)
-
-