NetNews Usenet Archive 1992 #19

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #19 / NN_1992_19.iso / spool / comp / compiler / 1509 < prev next >

Wrap

Internet Message Format | 1992-09-02 | 14.0 KB

Path: sparky!uunet!wupost!udel!gvls1!faatcrl!iecc!compilers-sender From: matthew@ntl02.uk.tele.nokia.fi (Matthew Faupel) Newsgroups: comp.compilers Subject: Re: Dealing with Multiple Inheritence Keywords: OOP, design Message-ID: <92-09-028@comp.compilers> Date: 3 Sep 92 00:12:42 GMT References: <92-09-015@comp.compilers> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: matthew@ntl02.uk.tele.nokia.fi (Matthew Faupel) Organization: Nokia Telecommunications, Cambridge, UK Lines: 285 Approved: compilers@iecc.cambridge.ma.us Ed Cynwrig Dengler writes: > I am trying to find references to articles that deal with the problems of > dealing with and representing multiple inheritence in a compiled program. > I am also interested in any articles that deal with how large typical > object-oriented classes get (ie. how many features/routines does a typical > class define, how many classes are typically inherited from, how deep the > inheritence tree gets, how often a feature is renamed). > The only reference I have currently is "An Object Addressing Mechanism for > Statically Typed Languages with Multiple Inheritence" by R.C.H. Conner, > A. Dearle, R. Morrison, and A.L. Brown. > The reason for this request is that I am doing a project evaluating > modification of the "map" system presented in the above article, and would > like to compare the original "map" system and the modified one to any > other methods out there. ... > I am trying to find answers to the following questions: > a) What other schemes are there? > b) How is multiple inheritence handled (this includes what restrictions > the scheme forces on multiple inheritence)? > c) If an object is assigned to a variable that is of an inherited class > type, how is the object changed to allow correct addressing of the > fields/features/routines (eg. an object x of class D is assigned > to variable y of class C, how is y.c handled)? > d) Approximately how long would it take to do the assignment and/or > find a given feature in a variable? > e) How much space does the cheme take up? Bjarne Stroustrup's paper on the implementation of the C++ multiple inheritance scheme can be found in: UNIX System V, AT&T C++ Language System Release 2.0 Selected Readings. Select Code 307-144 though I don't have the original paper identification. In addition, when I first came across Eiffel (as described in Bernard Meyer's "Object Oriented Software Construction") I was intrigued by the claim that multiple inheritance could be implemented with a fixed time being taken for looking up methods and attributes of an object. As a result I had a go at devising schemes for implementing the Eiffel inheritance system for which this was true. The following two schemes are what I came up with; I'm afraid I haven't read the reference that you give so I don't know whether either of them is the "map" system you mention or whether they are a duplication of already well known work, but here they are anyway. If anyone has any comments, they'd be appreciated. I considered using the class scheme you gave in your message as the example, however there are three details it doesn't give: o whether the attributes are data attributes or methods, o if the attributes are methods, which (if any) are redefined, and o which is the main line of descent (i.e. if both B and C redefine a feature of A, D inherits both, and then a D is assigned to a reference of type A, which of B or C's attribute would be used?) So the example below (given in (possibly slightly incorrect) Eiffel) gives all of these; it also illustrates the Eiffel (2.x) trick of merging back together features that arrive by two different paths from a root class, but still under the same name; in this example, A.b and A.g are never renamed so D inherits just one version of each; A.a and A.f are renamed and so D inherits one copy from B and another from C. class A feature a: INTEGER; b: INTEGER; f: INTEGER is do Result := 1 end; g: INTEGER is do Result := 2 end end; class B inherit A rename a as Ba, f as Bf redefine Bf feature Bf: INTEGER is do Result := 3 end end; class C inherit A rename a as Ca, f as Cf redefine Cf feature Cf: INTEGER is do Result := 4 end end; class D inherit B, C -- D now effectively has: -- Ba: INTEGER; -- Ca: INTEGER; -- b: INTEGER; -- Bf: INTEGER is ... end; -- Cf: INTEGER is ... end; -- g: INTEGER is ... end -- In Eiffel 2.x there is no way of indicating which parent takes priority, -- but we'll assume for this example that B does i.e. -- anA: A; !D!anA; anA.f; -- will end up calling Bf. end; Both schemes follow the same basic pattern: for each class there is a class descriptor set up by the compiler which contains pointers to the methods for that class and offsets to the data attributes from the start of any object of that class. Objects of that class contain the data attributes of the object plus a pointer to the class descriptor. This works fine for single inheritance however for multiple inheritance, something a bit more tricksy is required. What the class descriptor becomes is a set of the above type of descriptor; one for each route that can be found from a common ancestor to the current class (sort of). To give a more concrete example, if A is the descriptor for class A; B the extra information needed over and above A to describe B and so on, then the descriptors for A, B, C and D are: A: A B: AB C: AC D:ABACD ^ The trick is that when an object of type D is being referred to through a reference of type A or B (main line) the whole descriptor is used; when it is referred to through a descriptor of type C (or of type A after being assigned through a sequence such as: aC = aD; anA = aC) the subsidiary part of the descriptor is used. This begs the question, how do we know to use the subsidiary part? This is where the two solutions diverge. In solution a) each reference to an object is made up of a pointer to the object and an offset into the class structure; the object itself contains a pointer to the class structure and this added to the offset provides the correct part to use. In solution b) the reference is a pointer alone, but each object has as many pointers as there are subparts to the class and the reference points to the appropriate pointer in the object. The following shows the memory setup under the two schemes when an object of type D is allocated and then referred to through two references, one of type C and the other of type D. A class header of size 2 is assumed. a) Reference Object data Class data Method code aD: +-------+ 0+-------+ 0+-------+ |Ptr: ^------+----->| ^------------->|Class | +-------+ | 1+-------+ |Header | |Off: 2 | | |Ba: ? | 2+-------+ +-------+ +-------+ | 2+-------+ |OBa: 1 | /------>| Bf | | |b: ? | +-------+ | | Code | aC: +-------+ | 3+-------+ |Ob: 2 | | +-------+ |Ptr: ^------/ |Ca: ? | +-------+ | +-------+ +-------+ |Bf: ^----/ +-------+ |Off: 6 | +-------+ /--->| g |<-\ +-------+ |g: ^-------/ | Code | | 6+-------+ +-------+ | |OCa: 3 | | +-------+ +-------+ | |Ob: 2 | /------>| Cf | | +-------+ | | Code | | |Cf: ^----/ +-------+ | +-------+ | |g: ^-----------------------/ +-------+ The code (in C ignoring casts) for some example actions using this object: aD.b -> *(aD.Ptr + *(*aD.Ptr+aD.Off+CLASS_D_B_OFF)); aD.g -> *(**aD.Ptr+aD.Off+CLASS_D_G_OFF) (aD.Ptr,aD.Off+D_TO_A_CAST); aD.Cf -> *(**aD.Ptr+aD.Off+CLASS_D_CF_OFF) (aD.Ptr,aD.OFF+D_TO_C_CAST); aC := aD -> aC.Ptr = aD.Ptr; aC.Off = aD.Off+D_TO_C_CAST; aC = aD -> aC.Ptr == aD.Ptr; The parameter to the function calls is a "self" reference of the type the method is expecting passed as a pointer and offset. The various constants used can be determined at compile time and are: #define CLASS_D_B_OFF 3 // Offset of b's offset in D's class data #define CLASS_D_G_OFF 5 // Offset of pointer to g in D's class data #define CLASS_D_CF_OFF 8 // Offset of pointer to Cf in D's class data #define D_TO_A_CAST 0 // Ref.Off change required to make a D an A #define D_TO_C_CAST 4 // Ref.Off change required to make a D a C Advantages (compared to solution b) Reference comparison is easier. Assignment is easier. New object initialisation is quicker. Object points to class header => easy to determine type of object without storing extra info. Potentially less space is taken by each object, however as references take more and many objects contain references, this advantage is probably lost. Disadvantages References require 2 values => using registers is more tricky => code is not as efficient. Extra work is required to resolve attributes (due to adding in the offset from the reference). b) Reference Object data Class data Method code aD: +-------+ 0+-------+ 0+-------+ | ^-------------->| ^----------\ |Class |<-\ +-------+ 1+-------+ | |Header | | /---->| ^-------\ | 2+-------+ | | 2+-------+ | \-->|H: ^----+ | |Ba: ? | | +-------+ | aC: +-------+ | 3+-------+ | |O: 0 | | | ^---------/ |b: ? | | +-------+ | +-------+ 4+-------+ | |OBa: 2 | | +-------+ |Ca: ? | | +-------+ | /--->| Bf | +-------+ | |Ob: 3 | | | | Code | | +-------+ | | +-------+ | |Bf: ^-------/ | +-------+ | +-------+ | |g: ^----------->| g |<-\ | 8+-------+ | | Code | | \----->|H: ^----/ +-------+ | +-------+ | |O: 1 | +-------+ | +-------+ /--->| Cf | | |OCa: 3 | | | Code | | +-------+ | +-------+ | |Ob: 2 | | | +-------+ | | |Cf: ^-------/ | +-------+ | |g: ^-----------------------/ +-------+ The code (in C ignoring casts) for some example actions using this object: aD.b -> *(aD + *(*aD+CLASS_D_B_OFF)); aD.g -> *(**aD+CLASS_D_G_OFF) (aD+D_TO_A_CAST); aD.Cf -> *(**aD+CLASS_D_CF_OFF) (aD+D_TO_C_CAST); aC := aD -> aC = (aD?aD+D_TO_C_CAST:0) aC = aD -> aC == (aD?aD+D_TO_C_CAST:0) The various constants used can be determined at compile time and are: #define CLASS_D_B_OFF 5 // Offset of b's offset in D's class data #define CLASS_D_G_OFF 7 // Offset of pointer to g in D's class data #define CLASS_D_CF_OFF 12 // Offset of pointer to Cf in D's class data #define D_TO_A_CAST 0 // Pointer change required to make a *D a *A #define D_TO_C_CAST 1 // Pointer change required to make a *D a *C Advantages (compared to solution a) Attribute resolution is easier. A reference is a single value, which is easier to use efficiently. Disadvantages Pointers to the same object can have different values. The pointer doesn't neccesarily point to the head of the object, so extra information has to be stored in the class data to allow the object head to be found (the O field is the offset from the head of the object). The pointer(s) to the class data from the object do not point to the head of the class, hence extra information (the pointer H) is needed to determine where the class data head is. Object initialisation time increases as the number of routes to a class through the inheritance tree increases (though this can be mitigated by setting up a template object and block copying it). General advantages of both methods Minimum space is used to store attributes (i.e. no unnecessary duplication of attributes when a parent is inherited twice). Code is shared where appropriate, not duplicated. The following inheritance features are supported: o Multiple inheritance (including from the same class more than once) o Renaming and redefining o Only those multiply included parent features that need to be duplicated are duplicated o Replacement of a method by an attribute is possible (though I'll leave that as an exercise for the reader) o The ability to define the main line of descent when inheriting from the same class more than once (I believe Eiffel 3 supports this on a per attribute basis; I don't think the above schemes can handle that, though I can think of a tacky modification to make them do so :-) Apologies for the long post; hope someone finds this interesting! Matthew -- matthew@uk.tele.nokia.fi *--- matthew@ntl02.decnet.nokia.fi *------------------* Matthew Faupel *--- -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.