- Newsgroups: comp.lang.c++
- Path: sparky!uunet!news.mentorg.com!bcannard
- From: bcannard@hppcb36.mentorg.com (Bob Cannard @ PCB x5565)
- Subject: Re: Garbage Collection for C++
- Originator: bcannard@hppcb36
- Sender: news@news.mentorg.com (News User)
- Message-ID: <1992Aug15.074911.7965@news.mentorg.com>
- Date: Sat, 15 Aug 1992 07:49:11 GMT
- References: <1992Aug6.014619.2111@ucc.su.OZ.AU> <DAVEG.92Aug14194411@synaptx.synaptics.com>
- Nntp-Posting-Host: hppcb36.mentorg.com
- Organization: Mentor Graphics
- Keywords:
- Followup-To:
- Lines: 176
-
-
- In article <DAVEG.92Aug14194411@synaptx.synaptics.com>, daveg@synaptics.com (Dave Gillespie) writes:
- |> In article <1992Aug14.021547.15215@news.mentorg.com> bcannard@hppcb36.mentorg.com (Bob Cannard @ PCB x5565) writes:
- |> > In article <DAVEG.92Aug13025629@synaptx.synaptics.com>, daveg@synaptics.com (Dave Gillespie) writes:
- |> > |> Couldn't we get away with having a garbage collector that didn't
- |> > |> need pointers to be declared `GC'? The only thing that would need
- |> > |> special `GC' treatment would be "new" [...]
- |> >
- [my comments deleted]
- |>
- |> The "GCnew" approach means you're marking objects (rather than classes
- |> or pointers) as GC-able. It's still true that non-GC-allocated objects
- |> would work just like they always did.
- |>
- [my comments deleted]
- |>
- |> So you use "GCnew" to allocate objects of those classes, and "new"
- |> elsewhere.
-
- I have two problems with this.
-
- The first is that the GC is forced to check every pointer to see if it might
- point to a GC object, and the compiler is required to provide the necessary
- information that identifies all these pointers. The GC is forced to check many
- more pointers than is necessary (I estimate that in my application, roughly
- 90% of the objects would be non-GC). The programmer knows damn well that these
- pointers are irrelevant to the GC, and I get very annoyed when I can't pass
- that information on to the compiler.
-
- The second is that having to test for GCness at run time represents an
- additional overhead for the GC which I'd rather avoid.
-
- In fact, I'm forced to wonder: if the GC has to figure out the difference
- at run time, would it not be more efficient to simply make everything GCable?
- It looks to me as if there is no point in having GC and nonGC coexist, unless
- the two can be distinguished at compile time.
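
A minimal sketch of what "distinguished at compile time" could look like, assuming a wrapper type stands in for a GC-qualified pointer. The names GcPtr and is_gc_pointer are made up for illustration; no such syntax or library is proposed in this thread.

```cpp
#include <type_traits>

// Hypothetical wrapper that marks a pointer as GC-relevant in the type system.
template <typename T>
class GcPtr {
    T* p_;
public:
    explicit GcPtr(T* p = nullptr) : p_(p) {}
    T* get() const { return p_; }
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
};

// Compile-time answer to "does the collector need to look at this type?"
// Ordinary pointers answer no, so no run-time GC-ness test is needed for them.
template <typename T>
struct is_gc_pointer : std::false_type {};

template <typename T>
struct is_gc_pointer<GcPtr<T>> : std::true_type {};
```

With something like this, the roughly 90% of pointers that refer to non-GC objects would stay plain T* and would never be examined by the collector.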
-
- |> The "extra treatment" that GC classes would need basically involves
- |> adding some kind of tag information. If you declare an entire class
- |> to be GC, it makes sense to put such a tag in every instance of the
- |> class. If you instead work by allocating regular classes in a
- |> special GC-able way, then it makes more sense to put the tag only
- |> in instances of the class that are in the GC heap.
-
- That looks wrong to me. Suppose I have a pointer to some object. The GC
- comes along and finds this pointer. It must first determine whether or not
- the object is a GC object or a non-GC object. It must then find the tag. If
- the object is part of an array, the tag could be anywhere. How the hell does
- the GC figure out where the tag is? How does it even tell that the object is
- inside an array? The object could contain an index to show which element of the
- array it is, but it would be cheaper to tag each element individually.
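
A rough sketch of the "tag each element individually" option, assuming the tag sits at a fixed offset in front of the payload. GcTag, Tagged and tag_of are hypothetical names, and the tag contents are a placeholder since the thread does not specify them.

```cpp
#include <cstddef>

// Hypothetical tag; real contents (type info, mark bit, ...) are not specified.
struct GcTag {
    bool marked;
};

// Every element carries its own tag, so an array of Tagged<T> needs no
// per-array bookkeeping to locate the tag of element i.
template <typename T>
struct Tagged {
    GcTag tag;
    T value;
};

// Recover the tag from a pointer to the payload. Only valid if p really
// points at the `value` member of a Tagged<T>; offsetof is only guaranteed
// for standard-layout types, so this is a sketch, not portable for every T.
template <typename T>
GcTag* tag_of(T* p) {
    unsigned char* raw = reinterpret_cast<unsigned char*>(p);
    Tagged<T>* whole =
        reinterpret_cast<Tagged<T>*>(raw - offsetof(Tagged<T>, value));
    return &whole->tag;
}
```

The point of the sketch is that the lookup costs one subtraction and works the same whether or not the object is part of an array.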
-
- |> I still think declaring pointers to be GC would be too much of a
- |> hassle for the programmer
-
- ???? You've got to be kidding! You can cope with const (or haven't you got
- cfront 3.0 yet - heh heh - evil chortle) but not with GC? If you can't cope
- with hassle, you're in the wrong language! :-)
-
- |> , so we're left having to assume all
- |> pointers may point to GC-allocated objects. Without adding lots of
- |> tags in places that make C compatibility awkward, we're left with a
- |> completely "conservative", i.e., brute-force GC algorithm. We scan
- |> every word on the stack, every word in static storage, and every
- |> word on the non-GC heap for things that look like GC pointers.
- |> (Since this is slow, a good optimization might be to separate the
- |> static area and possibly the non-GC heap into things-which-contain-
- |> pointers and things-which-don't, like massive arrays of doubles
- |> which would be a waste of time to scan.)
-
- Probably worse than the v*rd*mmt reference counting that I'm already using,
- especially if the GC part of the data structure is relatively small. Blech!
- You really want all that inescapable overhead in order to save a bit of hassle?
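
To show where the overhead of the brute-force scheme comes from, here is a minimal sketch of the inner loop of a conservative scan, assuming the GC heap is one contiguous address range. The function name and parameters are assumptions for illustration, not part of Dave's proposal.

```cpp
#include <cstddef>
#include <cstdint>

// Count the words in [words, words + n) whose value falls inside the
// hypothetical GC heap [heap_lo, heap_hi). A conservative collector would
// treat each such word as a possible pointer and keep its target alive.
std::size_t scan_conservatively(const std::uintptr_t* words, std::size_t n,
                                std::uintptr_t heap_lo,
                                std::uintptr_t heap_hi) {
    std::size_t hits = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (words[i] >= heap_lo && words[i] < heap_hi) {
            ++hits;  // may be a pointer, may be an integer that looks like one
        }
    }
    return hits;
}
```

The loop touches every word of the stack, static storage and non-GC heap regardless of how small the GC part of the data structure is, which is the cost being objected to.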
-
- [interesting stuff deleted for brevity]
- |> To the user this scheme would feel exactly like C++ does now, with
- |> no loopholes as far as I can tell. The only change is that you
- |> are allowed to say "GCnew" instead of "new", on any kind of data,
- |> in which case that data gets deleted automatically as soon as there
- |> are no pointers of any kind to it left. Very clean, very simple
- |> (at least where it counts---to the C++ user).
-
- What counts is the *end user*, the customer, the person who buys the
- program. They don't care if it's clean and simple, they care that it does
- the job, does it fast, and doesn't blow up. They notice when programs get
- slower, and they complain. If necessary, by buying from a competitor.
-
- [well-deserved and thorough trashing of reference counting deleted, along
- with other interesting stuff]
-
- |> And, "no passing pointers around to Xlib, because Xlib was written
- |> before GC pointers were invented."
-
- Ahem. This is where the "promise not to do anything clever" stuff comes in.
- Just like string.h was invented before const, all it needed was to update the
- _declarations_: the _definitions_ could still be in C, Mongolian Cobol
- or whatever, and would not have to change. If a function can't make that
- promise, you can't give it a GC pointer. Them's the breaks.
-
- |> You might be able to squeak out of it by noting that "printf" is not
- |> going to invoke a GC since only "GCnew" calls can do that, and "printf",
- |> being non-GC-aware, will not call "GCnew".
-
- Printf can make the promise for exactly the reasons you give.
-
- |> The only two remaining
- |> tricky points then are existing functions with callbacks to code that
- |> can do a GC (like X toolkits, say), and signal handlers that invoke
- |> a GC (which might have to be forbidden for all sorts of other reasons).
-
- The callback problem might be another side of the promise. The signals
- I'm not sure about, but how do they differ from, say, doing a "new" or
- "delete" in a signal handler, which might be invoked while the program
- is in the middle of a "new" or "delete"? Lots to think about here.
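
One common way to sidestep the signal problem, offered here as an assumption rather than anything proposed in this thread, is for the handler to only record a request and for ordinary code to act on it at a point where no allocation is in progress. The names gc_requested, on_signal and poll_gc_request are made up for this sketch.

```cpp
#include <csignal>

// Set by the handler, polled by the main program at safe points.
volatile std::sig_atomic_t gc_requested = 0;

extern "C" void on_signal(int) {
    gc_requested = 1;  // record the request only: no new, delete or GC here
}

// Called from ordinary code at a point where the heap is consistent.
// Returns true if a collection was due; the collection itself is omitted.
bool poll_gc_request() {
    if (gc_requested) {
        gc_requested = 0;
        return true;
    }
    return false;
}
```

The handler would be installed with std::signal, and the program would call poll_gc_request() between allocations rather than inside them.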
-
- [DANGER shears at work]
-
- [comments about GC objects referring to ordinary ones]
-
- |> This is the old when-are-temporaries-deleted issue. At least there it
- |> can be resolved in some definitive way, by specifying in the language
- |> when temporaries are deleted (say, only on statement boundaries or
- |> whatever). With GC you don't have this luxury because GC's happen so
- |> unpredictably. (Especially if signal handlers can invoke GC!)
-
- I think it only becomes a serious problem if a GC object refers to a non-GC
- object which refers to a GC object which... The non-GC object must not be
- deleted before the GC object gets collected, otherwise the GC is going to get
- soundly screwed even if it is conservative; yet the programmer has no way
- of ensuring that the GC object has already gone. Alternatively, the GC object's
- destructor deletes the non-GC object, and the programmer holds on to a pointer
- to the non-GC object until it's safe to get rid of it. Hmmm...
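
A small sketch of the second alternative, in which the GC object's destructor is the only place the non-GC object is deleted. Plain and Collected are hypothetical stand-ins, and the collector itself is not modelled: the destructor simply runs at scope exit here, where a real collector would run it at some unpredictable time.

```cpp
// Ordinary, non-GC object; `live` counts instances so the effect is visible.
struct Plain {
    static int live;
    Plain() { ++live; }
    ~Plain() { --live; }
    Plain(const Plain&) = delete;
    Plain& operator=(const Plain&) = delete;
};
int Plain::live = 0;

// Stand-in for a GC object that owns a non-GC object. Nobody else deletes
// `owned`, so it cannot disappear while this object is still reachable.
struct Collected {
    Plain* owned;
    Collected() : owned(new Plain) {}
    ~Collected() { delete owned; }
    Collected(const Collected&) = delete;
    Collected& operator=(const Collected&) = delete;
};
```

Any pointer the programmer keeps to the Plain object is then valid only as long as the Collected object has not been collected, which is the uncertainty raised above.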
-
- |> > |> What if a pointer to a GC-able object is passed to foreign (non-C++)
- |> > |> code, which stashes it somewhere the C++ collector doesn't know to
- |> > |> look? (All garbage-collecting languages have this problem, and as
- |> > |> far as I know all they can do is shrug and warn programmers not to
- |> > |> let go of the last C++ reference to an object if they know there
- |> > |> might be non-C++ references still lurking around.)
- |>
- |> > Hence the proposal for tagging GC-able objects, and having something
- |> > comparable to "const" which declares "I will not do any clever tricks
- |> > with this pointer, including converting it to an integer, performing
- |> > pointer arithmetic on it, storing it, etc". This only needs a change
- |> > to the C++ declarations of library functions, not to the functions
- |> > themselves. IMO this makes it feasible to mix GC and libraries.
- |>
- |> I don't see how either tagging or declaring things helps here. I
- |> didn't mean "what if some novice incorrectly stashes away a pointer
- |> to a GC object," I meant "what if someone *needs* to pass a pointer
- |> to a GC object to a function which will stash it away where the GC
- |> can't find it."
-
- I wasn't talking about novices either, I was talking about detecting situations
- where things have gone wrong, and informing the programmer so that corrective
- action can be taken. To be honest, with the availability of static,
- automatic, and non-GC allocated data structures, I'm not sure that this
- will ever be an insurmountable barrier, because of the ability to make a
- permanent copy, _provided_ the compiler can give warning when it happens.
-
- If there *are* situations where it is unavoidable, we are going to need a
- locking mechanism. I'll hold off judgement on that for the time being.
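
For illustration only, a locking mechanism could be as simple as a table of pinned objects that the collector consults before freeing anything. PinTable and its member names are assumptions, not a design from this thread.

```cpp
#include <map>

// Hypothetical pin table: an object with a nonzero count must not be
// collected, even if the collector can find no pointer to it.
class PinTable {
    std::map<const void*, int> pins_;
public:
    void lock(const void* obj) { ++pins_[obj]; }

    void unlock(const void* obj) {
        std::map<const void*, int>::iterator it = pins_.find(obj);
        if (it != pins_.end() && --it->second == 0) {
            pins_.erase(it);
        }
    }

    bool is_locked(const void* obj) const { return pins_.count(obj) != 0; }
};
```

Code that hands a GC pointer to a function which stashes it out of the collector's sight would call lock() first and unlock() once the foreign reference is known to be gone.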
-
- [snip]
- |> -- Dave
- |> --
- |> Dave Gillespie
- |> daveg@synaptics.com, uunet!synaptx!daveg
- |> or: daveg@csvax.cs.caltech.edu
-
- Thanks, Dave, interesting stuff. Here's hoping that all this discussion will
- eventually get hammered into concrete proposals for a viable system!
- --
- bob_cannard@mentorg.com "Human beings? ... Well, I suppose they are a
- form of life, even if they are unspeakable"
- Expressed opinions are not necessarily those of Mentor Graphics Corporation.
-