home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!synaptx!synaptics.com!daveg
- From: daveg@synaptics.com (Dave Gillespie)
- Newsgroups: comp.lang.c++
- Subject: Re: Garbage Collection for C++
- Message-ID: <DAVEG.92Aug27002517@synaptx.synaptics.com>
- Date: 27 Aug 92 07:25:17 GMT
- References: <DAVEG.92Aug17224359@synaptx.synaptics.com> <boehm.714158406@siria>
- <DAVEG.92Aug20022559@synaptx.synaptics.com>
- <1992Aug25.183619.9541@microsoft.com>
- Sender: daveg@synaptx.Synaptics.Com
- Organization: Synaptics, Inc., San Jose, CA
- Lines: 128
- In-reply-to: jimad@microsoft.com's message of 25 Aug 92 18:36:19 GMT
-
- In article <1992Aug25.183619.9541@microsoft.com> jimad@microsoft.com (Jim Adcock) writes:
-
- > In article <DAVEG.92Aug20022559@synaptx.synaptics.com> daveg@synaptics.com (Dave Gillespie) writes:
- > |Do you really think incremental or generational collectors are
- > |feasible in C++?
-
- > Generational GC is certainly feasible in C++. In practice one needs compiler
- > support for the implied "smart pointers", so that programmers don't have
- > to remember to use special pointer syntax everywhere.
-
- (And in a later message Jim continues...)
-
- > But such cases can already be well handled in C++ using reference counting
- > schemes. Again, I see a Modula-3 like approach where classes of objects
- > are either GC or not GC -- and the GC class objects are movable. Backwards
- > compatibility is maintained because adding GC'ed classes is a pure extension.
- > Moveable objects need to be supported not merely for GC, but for more
- > interesting problems of distributed networks, persistence, ODBMS, etc.
-
- Suppose you have a vector class which you wish to be garbage-collectable.
- A typical vector class includes a pointer to a C-style array of data.
- Question: Is this array also allocated as a collectable object, or
- is it allocated the old-fashioned way (deleted explicitly by the
- vector object's destructor)?
-
- If the underlying array is not collectable, then you are still plagued
- with dangling reference problems: The program drops all its references
- to the vector itself, but retains a reference to the i'th element of the
- array. Since this element is in storage unrelated to the vector as far
- as the GC is concerned, the vector gets deleted and you have a dangling
- pointer. It would be really, really nice to use GC to sidestep this
- problem. Every third posting in the destruction-of-temporaries thread
- in this group ends with "if only we had GC so we could solve this
- problem once and for all..."
-
- If the underlying array *is* collectable, then it must be of a collectable
- type. That means it's not just class types that must be able to be
- collectable, but *all* C++ types. C's array type is slippery enough
- already---are you sure you can pin a "collectable" property on it
- without breaking too much else?
-
- If only class objects could be collectable, then only pointers to those
- classes would need special, slow pointer-dereferencing semantics with
- a generational collector. If everything is collectable, you'd probably
- have to slow down pointer dereferencing for everything---not acceptable!
-
- I still haven't seen anybody propose a way to do generational GC without
- having magic pointers (at least not without falling back on magic memory
- systems, which not all hardware can provide).
-
-
- Also, if you allow GC to move your objects, then you need a 100%
- foolproof way for the GC to identify exactly the set of words in memory
- which represent pointers to the moved objects. You can't use the
- so-called "conservative" GC algorithms, because they can't tell the
- difference between a legitimate pointer and an integer that happens
- to look like a pointer.
-
- Without conservative GC, you need RTTI for everything (including
- struct types which must be structurally compatible with C structs,
- unless you disallow pointer to GC objects in any struct that doesn't
- have any C++ features). Your compiler needs to record enough
- information to allow the GC to figure out where the pointers are
- in every stack frame. You need to initialize all pointers all the
- time. An optimizing compiler will need to record a table showing,
- for every instruction in the program, which registers contain
- pointers and which contain integers at that moment. All of these
- things carry unavoidable penalties that all code must pay for,
- whether it actually uses GC or not. Also, existing libraries must
- be recompiled to include all this information.
-
- Moving data during GC is only useful when you have arrays of data.
- When you have individual class objects, you generally have lots of
- things the of same size and simple per-size free lists are likely to
- be good enough, or at least malloc-style coalescing of adjacent free
- areas.
-
- You could allow the GC to move only arrays of non-class objects
- which are declared specially as movable and which are accessed
- only through specially-declared pointers, but my guess is this
- would still be too much hassle to be worthwhile.
-
-
- > C++ needs to support moveable objects in any case, in order to handled
- > ODBMS, persistence, locality of reference, etc. If you ever have done
- > store/restore of objects from moveable store, you've had to handle the
- > problem of moveable objects. This is a standard programming problem not
- > well handled by current C++. It needs compiler support. Immovable object
- > GC schemes do not solve this problem -- they simply leave it to someone else
- > to solve.
-
- C++ does not "need" to support any of these things. A proposal for
- adding GC to C++ has to be justifiable on its own grounds or there's
- no hope of it ever being adopted. Locality of reference, for example,
- would be nice to optimize but it's not "required." If there's nothing
- that can be done about it without movable objects, and movable objects
- are not practical in C++ for the reasons outlined above, then I guess
- there's no choice but to put locality of reference on the back burner.
-
- Besides, storing and restoring objects is much different from just
- letting the GC move objects around in RAM, if I understand your terms
- correctly. Store/restore happens at certain predictable times and the
- pointers being adjusted are likely to need identifying anyway; their
- very representation is likely to be changing. GC is harder since it
- can happen at more or less any time, and can involve pointers all
- throughout an active program. Store/restore is harder in a different
- way because of those changing representations.
-
- There are plenty of features of other languages like Lisp that would
- be really nice to have in C++. Arbitrary-size integers, for example.
- But I would be last to say that C++'s "int" type "needs" to be
- extended in this way; it would be nice, but it's just not practical
- for this language. When I need arbitrary-size integers I use a
- special class, or I use Lisp. If I need sophisticated GC, I'll
- probably also have to go to Lisp. The question is whether there is
- a simple way to add a modest, unobtrusive GC that will be good
- enough for the vast majority of cases.
-
- Stroustrup has warned against making the language topheavy with
- features. Simple, conservative non-moving GC can be added with
- minimal effect to the rest of the language, and minimal danger of
- toppling it.
-
- -- Dave
- --
- Dave Gillespie
- daveg@synaptics.com, uunet!synaptx!daveg
- or: daveg@csvax.cs.caltech.edu
-