home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.lang.scheme
- Path: sparky!uunet!stanford.edu!snorkelwacker.mit.edu!bloom-beacon!INTERNET!dont-send-mail-to-path-lines
- From: jmiller@crl.DEC.COM
- Subject: What I learned from building Dylan/Thomas
- Message-ID: <9209142230.AA20554@peanut.crl.dec.com>
- Sender: daemon@athena.mit.edu (Mr Background)
- Reply-To: JMiller@crl.dec.com
- Organization: The Internet
- Date: Mon, 14 Sep 1992 22:30:40 GMT
- Lines: 221
-
- I've been maintaining an unintentional "radio silence" for the last
- several weeks while working with Ron Weiss and Matt Birkholz to build
- the Thomas system. Now that I'm out of that hole, I'd like to update
- others on some of what I learned in the process and respond to a few
- of the questions raised earlier on these mailing lists.
-
- I. Despite all of the hairy stuff involving keywords, specializers,
- multiple inheritance and the rest of it, the generic dispatch system
- of Dylan (maybe also CLOS?) was remarkably simple to build and
- understand. The key procedure is in the file generic.scm, and is
- simplified here (I removed support for multiple values, error
- checking, and renamed some functions for clarity):
-
- (define (generic-dispatch original-args generic-function)
- (let ((applicable-methods
- (sorted-applicable-methods
- (generic-function.methods generic-function)
- original-args)))
- (let next-method-loop ((remaining-methods applicable-methods)
- (current-args original-args))
- (apply (car remaining-methods)
- (if (null? (cdr remaining-methods))
- #F
- (lambda (next-method . these-args)
- (next-method-loop (cdr remaining-methods)
- (if (null? these-args)
- current-args
- these-args))))
- current-args))))
-
- This requires one non-trivial support routine:
-
- sorted-applicable-methods finds the applicable methods and sorts
- them from most- to least-specific. We used a topological sort of
- all of the classes in the system. This is used to provide an
- ordering of the classes for comparing with individual arguments to
- the method, which seems logical enough:
-
- (define (match-specializer? object specializer)
- ;; Computes a distance value from the object to the specializer.
- ;; This distance is either
- ;; #F -- doesn't match the specializer
- ;; 0 -- the specializer is a singleton for exactly this object
- ;; <n> -- ordering of the class representing this specializer in
- ;; the ordering of all classes (always >= 1)
- (if (singleton? specializer)
- (if (eq? object (singleton.object specializer)) 0 #F)
- (if (subclass? (get-type object) specializer)
- (class.generality specializer)
- #F)))
-
- We then used lexicographic ordering to break ties for multiple
- argument dispatch, which is specified in Dylan but seems basically
- arbitrary. This latter is a place where I'd expect a Scheme
- system to say "unspecified".
-
- I won't reproduce the intervening code, but if you look for a while
- you'll see that sorted-applicable-methods requires two things:
-
- a) A way, given a method under consideration, to find the
- specializers that restrict the types of arguments to which it can
- be applied. In response to one of the questions on the Scheme
- list, this is where the need for WEAK-CONS arises. Since methods
- are first-class in Dylan (Dylan/method <=> Scheme/lambda) and can
- be attached at any time to a generic function, it is necessary at
- runtime to be able to locate the specializers for a method. This
- requires building a data structure (an a-list or hash table, for
- example) that uses methods as keys to locate their specializers.
- But such a table would grow indefinitely in size unless the
- garbage collector removes entries that correspond to methods that
- have been eliminated from the system. [There is a similar need
- to locate a description of a generic function at runtime, used
- for the error detection mechanism I eliminated from this
- simplified code.]
-
- b) A mechanism for determining the most specific type of an object
- at runtime. The procedure "get-type" (in the file
- runtime-top.scm) performs the operation. It is a kludge and very
- order sensitive. I found this function rather unpleasant to
- maintain as the system grew:
-
- (define (get-type obj)
- (cond ((instance? obj) (instance.class obj))
- ((number? obj) ; Might be wrong
- (if (real? obj)
- (if (exact? obj)
- (if (integer? obj)
- <integer>
- <ratio>)
- <float>)
- <complex>))
- ((class? obj) <class>)
- ((singleton? obj) <singleton>)
- ((null? obj) <empty-list>)
- ((slot? obj) <slot-descriptor>)
- ((pair? obj) <pair>)
- ((vector? obj) <simple-object-vector>)
- ((string? obj) <byte-string>)
- ((char? obj) <character>)
- ((procedure? obj)
- (cond ((dylan::generic-function? obj) <generic-function>)
- ((dylan::method? obj) <method>)
- (else <function>)))
- ((keyword? obj) <keyword>)
- ((symbol? obj) <symbol>)
- (else <object>)))
-
- This observation has lead me back to a conversation we had about
- the "portable record proposal" which went unresolved at our last
- meeting. The issue revolved around the pair of requirements for
- disjointness of types and opacity of representation. Several of
- us discussed this at some length at the LISP conference and I
- came away convinced of the need for a set of operations, separate
- from the record package, for manufacturing disjoint user types.
- At the Authors meeting, Jonathan Rees pointed out that he had
- made such a proposal some time ago but that it had not been acted
- upon. I'd like to have us act on it: Jonathan, can you find it
- and resubmit it?
-
- II. On the question about teaching generic dispatch as opposed to
- object dispatch, I find myself for the first time disagreeing with
- Brian Harvey. While I haven't actually tried to teach this material
- for two years now, I seem to recollect that I taught numbers very
- early in the course. They are the one place in Scheme where generic
- operation is deeply built into the system, and it provides a simple
- and logical place to start when explaining generic dispatch. The
- symbolic algebra system at the end of Chapter 2 of Structure and
- Interpretation takes this much farther, but perhaps at the expense of
- the gut-level understanding of the issue that comes from simply asking
- the question "How does Scheme do that with its numbers, anyway?". The
- very simple example of adding integers to real numbers explains a lot
- to anyone familiar with representations. For those unfamiliar with
- numerical representations, the introduction of generic sequence
- operations (like the ones in Dylan, but simplified) is another obvious
- starting point. I find the INITIAL-STATE, NEXT-STATE, CURRENT-ELEMENT
- iteration protocol very elegant (although my taste would add
- CURRENT-KEY to all collections, not just explicit key sequences; even
- at the cost of an extra cons-cell for the state of a list).
-
- I, personally, have always found the object dispatch method both
- limiting as a programming style and consequently rather unpleasant to
- teach. Its main advantage, from what I've seen so far, is one of
- modularity. The class heterarchy is conceptually elegant and works
- well, but unfortunately leads to a number of questions about where an
- object gets its behavior from, which inevitably leads to questions
- about HOW the object was implemented; this is precisely the
- abstraction violation that I thought object systems were intending to
- avoid.
-
- III. I've written my first large, portable Scheme program. It
- required exactly two extensions to the language (WEAK operations, a
- minimal error system), but is modularized in a way that makes the port
- easy. I like this way of writing my programs, and I think it gives me
- a new way to think about the issue of portability. The Scheme
- language has the flexibility to allow most reasonable programs to be
- divided into an implementation-independent part which calls procedures
- defined in a implementation-dependent part. This isn't new; but this
- is the first time I've actually tried to make it work and it did! The
- ease of porting this system from one Scheme to another contrasts
- rather sharply with my attempts to port MIT Scheme's implementation
- from one version of C to another, and to similar experiences in
- porting Pascal and BCPL programs in an earlier existence.
-
- IV. To the Authors:
-
- I'd really like to see some hooks into the garbage collector
- standardized. This implementation experience, as well as a paper I
- wrote several years ago that touches on some of the other things one
- can do using a garbage collector, convinced me that it isn't going to
- be that hard to figure out "the right set". I see two "must do" items
- and two optional ones:
-
- MUST DO 1: Have a way to specify a set of "precious" items which
- the garbage collector is not at liberty to remove, but which it
- will attempt to monitor and report when they are no longer
- required for any other reason. The two optional sets are
- specific implementations of this idea which exist in several
- current Scheme implementations, but which may not be the best
- abstraction of the service.
-
- MUST DO 2: Have a way to explicitly invoke the equivalent of the
- garbage collector's structure-preserving walk given an arbitrary
- starting object. One form this could take is a procedure that
- takes the root object and a vector; it stores into the vector all
- the items found from walking that object and returns a count of
- all of the objects encountered (this allows the operation to
- occur (conceptually, at least) without CONSing). An alternate
- formulation is to provide a function which takes an object and
- returns a collection of the sub-components of that object
- sufficient to allow a full structure walk from source-level
- Scheme.
-
- OPTIONAL 1: Have a way to run a procedure at the end of every
- garbage collection before returning to user code. This is how
- the semi-portable code in Thomas implements populations and one
- dimensional "weak" tables on top of WEAK-CONS. It is one way to
- perform the report function in MUST DO 1, but not the only
- possible way. All three of the Schemes I used for Thomas (MIT,
- Scheme->C, and Gambit) have some such hook, but all have slightly
- different names and signatures for the function that installs the
- GC daemon(s).
-
- OPTIONAL 2: The WEAK-CONS set of operations (WEAK-PAIR?, WEAK-CONS,
- WEAK-C{A,D}R, SET-WEAK-C{A,D}R!) are now also available in all
- three Scheme systems. It took Joel Bartlett only one day to add
- them to his conservative generational collector; they are nice
- because of their simplicity. They lack the generality of MUST DO
- 1, and would require something like the finalization proposal
- that was circulated before our last meeting (but not discussed).
-
- I'd also like to see some people take a good, hard look at the Thomas
- code for generic operations and the class hierarchy. It is very
- portable and provides some very good capabilities. I'd like someone
- to take an independent look at that code and see if it can be packaged
- nicely and put into the (putative) Scheme library. If we can do that,
- we should be able to add a lot of the Thomas runtime system to the
- same Scheme library and have a very good start at a powerful
- programming library -- one thing that (IMHO) has long held up the
- adoption of Scheme as a "real" language.
-
- --Jim
-