NetNews Usenet Archive 1992 #20

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #20 / NN_1992_20.iso / spool / comp / lang / scheme / 2187 < prev next >

Wrap

Text File | 1992-09-14 | 11.4 KB | 233 lines

Newsgroups: comp.lang.scheme Path: sparky!uunet!stanford.edu!snorkelwacker.mit.edu!bloom-beacon!INTERNET!dont-send-mail-to-path-lines From: jmiller@crl.DEC.COM Subject: What I learned from building Dylan/Thomas Message-ID: <9209142230.AA20554@peanut.crl.dec.com> Sender: daemon@athena.mit.edu (Mr Background) Reply-To: JMiller@crl.dec.com Organization: The Internet Date: Mon, 14 Sep 1992 22:30:40 GMT Lines: 221 I've been maintaining an unintentional "radio silence" for the last several weeks while working with Ron Weiss and Matt Birkholz to build the Thomas system. Now that I'm out of that hole, I'd like to update others on some of what I learned in the process and respond to a few of the questions raised earlier on these mailing lists. I. Despite all of the hairy stuff involving keywords, specializers, multiple inheritance and the rest of it, the generic dispatch system of Dylan (maybe also CLOS?) was remarkably simple to build and understand. The key procedure is in the file generic.scm, and is simplified here (I removed support for multiple values, error checking, and renamed some functions for clarity): (define (generic-dispatch original-args generic-function) (let ((applicable-methods (sorted-applicable-methods (generic-function.methods generic-function) original-args))) (let next-method-loop ((remaining-methods applicable-methods) (current-args original-args)) (apply (car remaining-methods) (if (null? (cdr remaining-methods)) #F (lambda (next-method . these-args) (next-method-loop (cdr remaining-methods) (if (null? these-args) current-args these-args)))) current-args)))) This requires one non-trivial support routine: sorted-applicable-methods finds the applicable methods and sorts them from most- to least-specific. We used a topological sort of all of the classes in the system. This is used to provide an ordering of the classes for comparing with individual arguments to the method, which seems logical enough: (define (match-specializer? object specializer) ;; Computes a distance value from the object to the specializer. ;; This distance is either ;; #F -- doesn't match the specializer ;; 0 -- the specializer is a singleton for exactly this object ;; <n> -- ordering of the class representing this specializer in ;; the ordering of all classes (always >= 1) (if (singleton? specializer) (if (eq? object (singleton.object specializer)) 0 #F) (if (subclass? (get-type object) specializer) (class.generality specializer) #F))) We then used lexicographic ordering to break ties for multiple argument dispatch, which is specified in Dylan but seems basically arbitrary. This latter is a place where I'd expect a Scheme system to say "unspecified". I won't reproduce the intervening code, but if you look for a while you'll see that sorted-applicable-methods requires two things: a) A way, given a method under consideration, to find the specializers that restrict the types of arguments to which it can be applied. In response to one of the questions on the Scheme list, this is where the need for WEAK-CONS arises. Since methods are first-class in Dylan (Dylan/method <=> Scheme/lambda) and can be attached at any time to a generic function, it is necessary at runtime to be able to locate the specializers for a method. This requires building a data structure (an a-list or hash table, for example) that uses methods as keys to locate their specializers. But such a table would grow indefinitely in size unless the garbage collector removes entries that correspond to methods that have been eliminated from the system. [There is a similar need to locate a description of a generic function at runtime, used for the error detection mechanism I eliminated from this simplified code.] b) A mechanism for determining the most specific type of an object at runtime. The procedure "get-type" (in the file runtime-top.scm) performs the operation. It is a kludge and very order sensitive. I found this function rather unpleasant to maintain as the system grew: (define (get-type obj) (cond ((instance? obj) (instance.class obj)) ((number? obj) ; Might be wrong (if (real? obj) (if (exact? obj) (if (integer? obj) <integer> <ratio>) <float>) <complex>)) ((class? obj) <class>) ((singleton? obj) <singleton>) ((null? obj) <empty-list>) ((slot? obj) <slot-descriptor>) ((pair? obj) <pair>) ((vector? obj) <simple-object-vector>) ((string? obj) <byte-string>) ((char? obj) <character>) ((procedure? obj) (cond ((dylan::generic-function? obj) <generic-function>) ((dylan::method? obj) <method>) (else <function>))) ((keyword? obj) <keyword>) ((symbol? obj) <symbol>) (else <object>))) This observation has lead me back to a conversation we had about the "portable record proposal" which went unresolved at our last meeting. The issue revolved around the pair of requirements for disjointness of types and opacity of representation. Several of us discussed this at some length at the LISP conference and I came away convinced of the need for a set of operations, separate from the record package, for manufacturing disjoint user types. At the Authors meeting, Jonathan Rees pointed out that he had made such a proposal some time ago but that it had not been acted upon. I'd like to have us act on it: Jonathan, can you find it and resubmit it? II. On the question about teaching generic dispatch as opposed to object dispatch, I find myself for the first time disagreeing with Brian Harvey. While I haven't actually tried to teach this material for two years now, I seem to recollect that I taught numbers very early in the course. They are the one place in Scheme where generic operation is deeply built into the system, and it provides a simple and logical place to start when explaining generic dispatch. The symbolic algebra system at the end of Chapter 2 of Structure and Interpretation takes this much farther, but perhaps at the expense of the gut-level understanding of the issue that comes from simply asking the question "How does Scheme do that with its numbers, anyway?". The very simple example of adding integers to real numbers explains a lot to anyone familiar with representations. For those unfamiliar with numerical representations, the introduction of generic sequence operations (like the ones in Dylan, but simplified) is another obvious starting point. I find the INITIAL-STATE, NEXT-STATE, CURRENT-ELEMENT iteration protocol very elegant (although my taste would add CURRENT-KEY to all collections, not just explicit key sequences; even at the cost of an extra cons-cell for the state of a list). I, personally, have always found the object dispatch method both limiting as a programming style and consequently rather unpleasant to teach. Its main advantage, from what I've seen so far, is one of modularity. The class heterarchy is conceptually elegant and works well, but unfortunately leads to a number of questions about where an object gets its behavior from, which inevitably leads to questions about HOW the object was implemented; this is precisely the abstraction violation that I thought object systems were intending to avoid. III. I've written my first large, portable Scheme program. It required exactly two extensions to the language (WEAK operations, a minimal error system), but is modularized in a way that makes the port easy. I like this way of writing my programs, and I think it gives me a new way to think about the issue of portability. The Scheme language has the flexibility to allow most reasonable programs to be divided into an implementation-independent part which calls procedures defined in a implementation-dependent part. This isn't new; but this is the first time I've actually tried to make it work and it did! The ease of porting this system from one Scheme to another contrasts rather sharply with my attempts to port MIT Scheme's implementation from one version of C to another, and to similar experiences in porting Pascal and BCPL programs in an earlier existence. IV. To the Authors: I'd really like to see some hooks into the garbage collector standardized. This implementation experience, as well as a paper I wrote several years ago that touches on some of the other things one can do using a garbage collector, convinced me that it isn't going to be that hard to figure out "the right set". I see two "must do" items and two optional ones: MUST DO 1: Have a way to specify a set of "precious" items which the garbage collector is not at liberty to remove, but which it will attempt to monitor and report when they are no longer required for any other reason. The two optional sets are specific implementations of this idea which exist in several current Scheme implementations, but which may not be the best abstraction of the service. MUST DO 2: Have a way to explicitly invoke the equivalent of the garbage collector's structure-preserving walk given an arbitrary starting object. One form this could take is a procedure that takes the root object and a vector; it stores into the vector all the items found from walking that object and returns a count of all of the objects encountered (this allows the operation to occur (conceptually, at least) without CONSing). An alternate formulation is to provide a function which takes an object and returns a collection of the sub-components of that object sufficient to allow a full structure walk from source-level Scheme. OPTIONAL 1: Have a way to run a procedure at the end of every garbage collection before returning to user code. This is how the semi-portable code in Thomas implements populations and one dimensional "weak" tables on top of WEAK-CONS. It is one way to perform the report function in MUST DO 1, but not the only possible way. All three of the Schemes I used for Thomas (MIT, Scheme->C, and Gambit) have some such hook, but all have slightly different names and signatures for the function that installs the GC daemon(s). OPTIONAL 2: The WEAK-CONS set of operations (WEAK-PAIR?, WEAK-CONS, WEAK-C{A,D}R, SET-WEAK-C{A,D}R!) are now also available in all three Scheme systems. It took Joel Bartlett only one day to add them to his conservative generational collector; they are nice because of their simplicity. They lack the generality of MUST DO 1, and would require something like the finalization proposal that was circulated before our last meeting (but not discussed). I'd also like to see some people take a good, hard look at the Thomas code for generic operations and the class hierarchy. It is very portable and provides some very good capabilities. I'd like someone to take an independent look at that code and see if it can be packaged nicely and put into the (putative) Scheme library. If we can do that, we should be able to add a lot of the Thomas runtime system to the same Scheme library and have a very good start at a powerful programming library -- one thing that (IMHO) has long held up the adoption of Scheme as a "real" language. --Jim