home *** CD-ROM | disk | FTP | other *** search
- Xref: sparky comp.object:2976 comp.lang.eiffel:979
- Newsgroups: comp.object,comp.lang.eiffel
- Path: sparky!uunet!psinntp!merlin.hgc.edu!jcm
- From: jcm@hgc.edu (James McKim)
- Subject: Re: How to design a data structure library.
- Message-ID: <1992Jul22.142916.22115@merlin.hgc.edu>
- Sender: usenet@merlin.hgc.edu (Action News Central)
- Organization: The Hartford Graduate Center
- References: <1820@snap> <1992Jul21.124256.17256@bony1.bony.com>
- Date: Wed, 22 Jul 1992 14:29:16 GMT
- Lines: 260
-
- In article <1992Jul21.124256.17256@bony1.bony.com> richieb@bony1.bony.com (Richard Bielak) writes:
- >In article <1820@snap> paj@uk.co.gec-mrc (Paul Johnson) writes:
- >
- >[...]
- >
- >>Cursors
- >>-------
- >>
-
- [Examples of LIST with built in cursor vs LIST with outside iterators
- deleted]
-
- >>
- >>The ISE system has the disadvantage that multiple iterations over a
- >>list are difficult to do. In particular, an interleaved iteration
- >>involving two cursors at different points in the list can be very
- >>tricky.
- >
- >I think that cursors should be separate from the data structure they
- >are used to traverse. The main reason is to allow multiple iterations
- >over the same list. The scheme present in the ISE library (with "mark"
- >and "return") works OK within a single program. It fails, when the
- >data structured is shared - either through a database or concurrent
- >process.
-
- I agree. They also tend to add complexity to the class. And since
- the LIST class plays a role in a variety of other ISE library
- classes (e.g. TREE, WINDOW) it's complexity "contaminates" the
- others. It'll be interesting to see what changes ISE has made
- in these classes in the next release.
-
- [...]
-
- >>
- >>RISC vs CISC
- >>------------
- >>
- >>The ISE data structure classes have large numbers of features. Many
- >>of these provide efficient short cuts. For instance there is a list
- >>merge facility which removes items from one list and inserts them into
- >>another. The ISE linked list implementation allows this to be done
- >>with a little internal pointer shuffling. Other features are rarely
- >>used and do not offer efficiency short-cuts. For instance, the
- >>`remove_all' procedure scans the list and removes all items which are
- >>equal to the argument. This could be done just as quicly by an
- >>explicit loop.
- >>
- >>Inheriting from the ISE list classes can be a trial. In order to
- >>provide a true subtype, one has to implement a large number of
- >>features, many of which are of doubtful utility and others for which
- >>there are no clear semantics or implementation.
- >>
- >>The SIG design has none of this. The data structure features provide
- >>a bare minimum of functionality. This makes the documentation simpler
- >>to use, but some tasks are less efficient than they might be.
- >>
- >
- >I would prefer RISC. I think it was Jim McKim, who said that the
- >features of a class should form a "minimum spanning" set - that is
- >just enough to be complete, but not more.
-
- Yup. A minimum spanning set should be "sufficient" in the sense of
- Booch and, ideally, specifiable via preconditions and postconditions.
- There are other criteria we like to see a minimum spanning set
- satisfy, but these are the main ones. Paul's ideas on "fine-grained"
- inheritance are not unrelated here.
-
- To be fair to ISE, the further up the inheritance hierarchy you go
- the more RISC-like the classes get. These classes tend to be deferred,
- so we users tend to use the classes at the low end of the hierarchy
- which are more MISC-like.
-
- >
- >I've implemented a persistent list class, which did _not_ inherit from
- >LIST. The main reason was that LIST has too many things that I did not
- >want to implement for persistent lists (eg. "mark", "return",
- >"index_of", "duplicate).
-
- I also often use a simpler LIST class of my own devising.
-
- >
- >
- >>
- >>Functional vs Procedural
- >>------------------------
- >
- >[...]
- >>
- >>Things become worse when considering large objects. The ISE SET class
- >>provides `intersect', `merge' and `subtract' procedures. Hence the
- >>only way to create a new set `c' which is the union of two existing
- >>sets `a' and `b' is:
- >>
- >> c.Clone (a);
- >> c.merge (b);
- >>
- >>With some kinds of objects, this would be rather less efficient than
- >>
- >> c := a + b;
- >>
- >>On the other hand providing operators `*' and `+' for intersection and
- >>union would result in something like:
- >>
- >> a := a + b;
- >>
- >>This creates a new (and fairly large) object before throwing away the
- >>old value of `a'. If this is implemented as a linked list then you
- >>could have a huge amount of garbage being generated.
- >>
- >>Does it make sense to implement both kinds of operations in a class,
- >>so that the programmer can pick the appropriate one?
- >>
- >
- >
- >I suppose one should implement only the more fundamental set of
- >operations, and then let the descendant create the "+" or "*"
- >operators if desired. This goes back to the "minimal spanning set"
- >idea.
-
- Our version of the minimal spanning set for a SET class includes
- only Create, Insert an element, Remove an element as commands
- and Cardinality, Has a given element as queries.
-
- We would leave it to descendants to provide either version of merge,
- intersect, and subtract. But this avoids Paul's question.
-
- It seems to me that Union (I think ISE used 'merge' as the verb
- form of union), Intersection and Difference (ditto for 'subtract' here)
- are fundamentally binary operations. That is, they're single valued
- functions of two variables, and so should be designed as such.
- Thus I favor the c := a + b or c := union( a, b ) form. I would
- add union, intersection, and difference as a first increment,
- through inheritance, to the minimal SET class.
-
- Then as a second increment, I would add Merge, Intersect, and
- Subtract. Then document that the typical use for Merge is
-
- a.merge( b ) in place of a := a + b
-
- because the former is more efficient than the latter. Ditto
- for Intersect and Subtract.
-
- >
- >
- >
- >>Documentation
- >>-------------
- >>
- >>What should go into the documentation of a cluster?
- >
- >Good question. There should be a general description of what the
- >classes are doing. For example, the description of data structure
- >cluster should talks about cursors.
-
- Very good question. I would also want to see some examples of
- how the classes interact with each other, but I think we've
- only scratched the surface here.
-
- >
- >>What should be in the documentation of each individual class?
- >
- >At least "flat class | short" :-)
-
- And of course this does you little good unless you've done
- a thorough job with preconditions, postconditions, and invariants,
- at least as comments. Somewhere we need an example of the typical
- use of the class. Maybe there are three levels. Cluster level
- (discussed above), class level (typical use, how features interact, invariants), and feature level (preconditions and postconditions).
-
-
- >
- >>Do we need discussion of time and space complexity? At what level of
- >>detail?
- >
- >I think so. Sometimes a simple guideline on how to use a class
- >efficiently would suffice.
-
- Yup.
-
- >
- >>Do we also need an outline of the implementation method? At what
- >>level of detail?
- >
- >This is probably not needed, as long as the source code is available.
-
- And as long as the information above is available. _Ideally_, if you're
- a client of the class, you shouldn't need the source code unless
- you suspect a bug. If you're inheriting from a class then more
- detailed documentation is certainly necessary. In particular,
- you're much more apt to need the source code.
-
- >
- >
- >>Is the short-flat version of a class useful? (Sig appear to think
- >>not).
- >
- >It's essential.
-
- Absolutely! If there's no easy way to obtain the public interface
- for a class, then much of the benefit of information hiding and
- encapsulation are lost. I don't know what SIG was thinking here.
-
- Richie and I have kicked around the idea of refining short. The default
- version would extract the public interface of features exported to all classes
- (the current version). A second version would be "short for class X" which
- would extract the public interface of features visible to X. Since
- Eiffel supports selective export, this could be a proper superset
- of what the default version gives. If X is a descendant
- of the class in question then all it's features are visible
- and we have the "short-for-descendants" mentioned below.
-
- >
- >>Do we need a separate section: "Guide to Writing Descendants"?
- >
- >I think so. I'm in process of writing docs for a database cluster, and
- >large part of the doc will be on that topic.
- >
- >What would be useful is a "short-for-descendants". An extract of a
- >class that contains the signatures of all the features, without the
- >code.
-
- This kind of documentation and more is going to be needed. Especially
- if vendors begin supplying libraries without source code.
-
- >
- >>What sort of examples are needed?
- >
- >At least an example of typical use. Perhaps some examples that
- >illustrate the difference between efficient and inefficient use of the
- >library.
-
- Yup, and see above.
-
- >
- >
- >>Are inheritance graphs useful?
- >>
- >
- >I don't find these particularly useful, although they are pretty to
- >look at.
- >
-
- I like 'em. I think we at least need what ISE provides in their library
- documentation. They show the complete inheritance hierarchy above the current
- class.
-
- >
- >
- >...richie
- >
- >--
-
- [...]
-
- -- Jim
- --
-
- *------------------------------------------------------------------------------*
- Jim McKim (203)-548-2458 In exactly 3.2 seconds it will be a few
- Internet: jcm@hgc.edu minutes to 5:00.
-