- Newsgroups: comp.unix.bsd
- Path: sparky!uunet!spool.mu.edu!darwin.sura.net!ra!tantalus.nrl.navy.mil!eric
- From: eric@tantalus.nrl.navy.mil (Eric Youngdale)
- Subject: Re: Shared lib benchmarks, and experiences
- Message-ID: <BzAEnE.GKq@ra.nrl.navy.mil>
- Sender: usenet@ra.nrl.navy.mil
- Organization: Naval Research Laboratory
- References: <1992Dec12.235116.7484@rwwa.COM> <1giendINNgku@life.ai.mit.edu> <1992Dec14.231025.12627@rwwa.COM>
- Distribution: usa
- Date: Tue, 15 Dec 1992 06:14:01 GMT
- Lines: 207
-
- In article <1992Dec14.231025.12627@rwwa.COM> witr@rwwa.com writes:
- >| > 1) The two librarys must have identical ``assigned'' addresses, and
- >| > 2) The two librarys must be substantially identical.
- >|
- >| The first point is correct. I should point out that there is no reason
- >| why we cannot have two different libraries assigned to the same address - you
- >| just will not be able to use both at the same time in the same process.
- >
- >The reason why I bring this up is that I suspect that it is difficult to
- >assign ``compatible'' ``assigned'' addresses except in the case where the
- >libraries are ``substantially identical''. For example, if the latter library
- >has twice as many entrypoints as the former, this is likely to be a difficult,
- >problem and probably has no general solution.
-
- Offhand, I do not see why the number of entry points represents a
- problem. The general structure of a sharable library under linux is that we
- start with the jump table itself. This is usually padded to some large size to
- accommodate future expansion. Directly after this comes the global data, and
- everything is fixed at specific addresses. After this comes the regular text
- and data sections. As long as you do not overfill the section of memory that
- was set aside for jump vectors, there will not be a problem. The first time
- you build a sharable library, you select how much memory you want for the jump
- vectors - a wise maintainer will always allow a lot of room for future
- expansion if there is *any* possibility that they will be needed. There is no
- reason a priori that we even need to use up the jump vector slots in any
- particular order. We could use them randomly if we wanted to, although it
- would serve no useful purpose to do so.
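
A minimal sketch of the padded jump table idea, written in C. It only models the concept: the real library places jump instructions at fixed addresses, whereas this sketch uses an array of function pointers. The slot count, the slot numbers and the function names (`lib_double`, `lib_negate`, `lib_unused`) are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical size: the room reserved when the library is first built. */
#define JUMP_SLOTS 1024u

typedef int (*entry_fn)(int);

/* Two stand-in library routines plus a placeholder for empty slots. */
static int lib_double(int x) { return 2 * x; }
static int lib_negate(int x) { return -x; }
static int lib_unused(int x) { (void)x; return -1; }

/* The table has a fixed capacity; only some of the slots are in use. */
static entry_fn jump_table[JUMP_SLOTS];

/* Fill the table. Slots do not have to be handed out in order. */
static void build_jump_table(void)
{
    for (size_t i = 0; i < JUMP_SLOTS; i++)
        jump_table[i] = lib_unused;
    jump_table[0]  = lib_double;
    jump_table[17] = lib_negate;  /* an arbitrary slot works just as well */
}

/* A client reaches a routine through a slot number fixed at link time. */
static int call_slot(size_t slot, int arg)
{
    assert(slot < JUMP_SLOTS);    /* overfilling the table is the only limit */
    return jump_table[slot](arg);
}
```

Adding a routine later means claiming another free slot; the slots already handed out keep their positions, so existing callers are unaffected.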
-
- >
- >| The second one depends upon how you define "substantially".
- >[...]
- >| As a rule of thumb, as long as you can provide identical
- >| assigned addresses, you can generate plug compatible libraries. The
- >| limitations have less to do with the design of the library itself, but have
- >| more to do with the tools that we have available to ensure that the various
- >| addresses remain the same from one version to the next.
- >
- >This is the caveat that worries me. How do you handle the following
- >situations?
- >
- > 1) Second library has (many) more entrypoints than former library.
-
- This I have already discussed above.
-
- > 2) The ordering of the entrypoints in the objects is different.
-
- We have complete freedom to select whatever ordering of entry points
- that we wish when we first build the library. If there are two libraries that
- are supposed to share the same interface, then you simply have to provide
- identical lists of functions to the program that generates the jump vector
- module.
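
The point about identical function lists can be sketched in C as simple address arithmetic: a routine's jump vector sits at the table base plus its position in the list times the slot size. The slot size of 8 bytes and the helper name `slot_address` are assumptions for illustration, not the values used by the actual tools.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOT_SIZE 8u  /* assumed number of bytes per jump vector */

/*
 * Return the address of the jump vector for `name`, given the base address
 * of the table and the ordered list of function names. Returns 0 if the
 * name is not in the list.
 */
static unsigned long slot_address(unsigned long base,
                                  const char *const *list, size_t n,
                                  const char *name)
{
    assert(list != NULL && name != NULL);
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], name) == 0)
            return base + (unsigned long)i * SLOT_SIZE;
    return 0;
}
```

Two libraries built from the same list and the same base therefore give every function the same address, whatever order the objects were compiled in.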
-
- > 3) There is changed inter- and intra-calling relationships between
- > routines in this and other libraries.
-
- An example of this might be our X libraries. Naturally the X libraries
- require various functions in libc, and since the X libraries are linked to the
- sharable jump-table version of libc, we can simply replace libc if there is a
- new version, and the sharable X libraries will automatically use the new
- version. Intra-calling (calls within the library) is all resolved by the
- linker without having to even look at the jump table. If the intra-calling
- relationships change, it will not be a problem, as long as the external
- interface remains fixed (i.e. the jump vectors and global data all remain at
- the same addresses).
-
- There are some things that could break a sharable library, such as a
- change in the number of arguments for a given function. In the past we have
- treated this in the following fashion: We leave the older N-argument function
- in the library with its jump vector in the jump-table, but we fix things so
- that anytime we link to the library, the linker will only see the new N+1
- version of the function. The N+1 version of the function has its own distinct
- slot in the jump table, so there is never confusion about which function we are
- talking about. Naturally, the header file changes at the same time we change
- the library. The advantage of doing this is that we can allow a gradual
- changeover to the new way of doing things without suddenly breaking a lot of
- different programs all at one time. After a suitable period of time, and
- perhaps after some warnings have been posted, the old version of the function
- is deleted and its jump slot is changed to point to a routine that simply
- tells you that you must recompile. We did something similar when
- we went from 16 bit inode numbers to 32 bit inode numbers in the stat
- structure (yet another minixism that bit the dust).
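
The changeover scheme described above might look roughly like this in C. Again this is a model with function pointers rather than real jump vectors, and every name and slot number in it (`old_scale`, `new_scale`, `must_recompile`, `SLOT_SCALE_OLD`, `SLOT_SCALE_NEW`) is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Old interface: N = 2 arguments. */
static int old_scale(int x, int k) { return x * k; }

/* New interface: N+1 = 3 arguments. */
static int new_scale(int x, int k, int off) { return x * k + off; }

/* Installed in the old slot once the old function has been retired. */
static int must_recompile(int x, int k)
{
    (void)x; (void)k;
    fprintf(stderr, "this program must be recompiled against the new library\n");
    return -1;
}

/* Each version owns its own slot, so the two can never be confused. */
enum { SLOT_SCALE_OLD = 4, SLOT_SCALE_NEW = 9, NSLOTS = 16 };

typedef void (*generic_fn)(void);
static generic_fn slots[NSLOTS];

/* retire_old == 0: transition period; retire_old != 0: old version removed. */
static void install_slots(int retire_old)
{
    slots[SLOT_SCALE_OLD] = (generic_fn)(retire_old ? must_recompile : old_scale);
    slots[SLOT_SCALE_NEW] = (generic_fn)new_scale;
}

/* What an old binary does: call through the old slot with N arguments. */
static int call_old(int x, int k)
{
    assert(slots[SLOT_SCALE_OLD] != NULL);
    return ((int (*)(int, int))slots[SLOT_SCALE_OLD])(x, k);
}

/* What a newly linked binary does: call through the new slot. */
static int call_new(int x, int k, int off)
{
    assert(slots[SLOT_SCALE_NEW] != NULL);
    return ((int (*)(int, int, int))slots[SLOT_SCALE_NEW])(x, k, off);
}
```

Old binaries keep working during the transition, and once the old routine is retired they get a clear message instead of a crash with mismatched arguments.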
-
- We do this kind of stuff to avoid breaking people's binaries, but
- it is a bit of a nuisance to do this kind of thing. The more mature
- the library is to begin with, the better the chance that you will never have to
- even worry about this sort of thing. I am not sure at all how easy or clean it
- would be to try and treat this sort of situation with dynamic linking.
-
- > 4) What about run-time library loading, as is done with resolvers
- > on SVR4.
-
- I do not know how SVR4 does it (even though I use it at work). The way
- it is handled under linux is that there is a special data element in the binary
- which contains the following bits of info:
-
- 1) The full path of the library to be loaded.
- 2) An ASCII string which is more descriptive than the pathname.
- 3) The version number of the sharable library.
- 4) The virtual address at which it should be loaded.
-
- This is spotted by crt0 (in this respect, it is similar to a global constructor
- under C++), which does some basic checking: it makes sure that the version
- number of the library linked against is consistent with the version number of
- the library found at the pathname, and that the virtual address at which we
- are asking for the library to be loaded is the same as the address at which
- the library itself wants to be loaded. I am not sure, but I think that the
- loading simply amounts to some kind of mmap, and the pages are demand loaded
- as required. If you wanted to know for sure, you would have to ask Linus
- about this.
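
The four items and the startup check can be sketched in C as follows. The struct layouts, field names and status codes are assumptions; the post does not give the real format. In particular, "consistent" version numbers are modelled here as plain equality, which may be stricter than what crt0 actually requires.

```c
#include <assert.h>
#include <stddef.h>

/* What the binary records about a library it was linked against. */
struct shlib_ref {
    const char   *path;          /* 1) full path of the library          */
    const char   *description;   /* 2) more descriptive ASCII string     */
    unsigned      version;       /* 3) version number of the library     */
    unsigned long load_address;  /* 4) virtual address to load it at     */
};

/* What the library file found at that path says about itself. */
struct shlib_image {
    unsigned      version;
    unsigned long wanted_address;
};

enum shlib_status { SHLIB_OK, SHLIB_BAD_VERSION, SHLIB_BAD_ADDRESS };

/* The checks made before the library is mapped in. */
static enum shlib_status check_shlib(const struct shlib_ref *ref,
                                     const struct shlib_image *img)
{
    assert(ref != NULL && img != NULL);
    if (ref->version != img->version)          /* assumption: exact match */
        return SHLIB_BAD_VERSION;
    if (ref->load_address != img->wanted_address)
        return SHLIB_BAD_ADDRESS;
    return SHLIB_OK;
}
```

Only when both checks pass would the startup code go on to map the library at the requested address.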
-
- >
- >| As I recall the biggest drawbacks to the dynamic linking were
- >| the need for a new assembler and linker, the need for more extensive kernel
- >| mods, larger binaries and more overhead to load a program.
- >
- >Let's handle these in turn.
- >
- >1) Need for new assembler and linker: If you mean that you need a compilation
- >system that can generate PI code, then yes, you need these. Since the GCC
- >system generates PI code, I don't see why this is a problem.
-
- The compiler is not the problem. The assembler, gas, does not
- understand (yet) the special syntax that GCC generates when writing PI code.
- Out of curiosity I tried compiling something with PIC, and I got gobs of
- assembler errors. As I recall, this was probably the most formidable stumbling
- block, although in retrospect we probably could have solved the problem by
- running some kind of postprocessor on the assembly code. We are also using the
- GNU ld, and depending upon how you do the implementation, changes may have to
- be made here as well.
-
- There was another objection that has been raised in the past by various
- people, and that is that in the 386/486 architecture there are relatively few
- machine registers compared to something like a VAX. The PI code that I have
- seen GCC generate always seems to use up one register as a reference pointer of
- some kind or another, and when you reserve this register (usually ebx) for this
- purpose, it is not available to the compiler for other uses, and this could
- lead to poorer performance. I have not seen any numbers to back this up,
- but the objection has been raised.
-
- >If you mean that you have to extensively modify the compilation system
- >in other ways, this is not correct. You can handle all the needed functions
- >in the CRTL startup code. You may want to have the linker do other things
- >for efficiency reasons, but it is not otherwise required.
-
- Ah, yes, but we probably would want to have the linker do other things
- for efficiency reasons - if you were to compare a quick and dirty dynamic
- linking implementation to the linux style fixed-address libraries, the fixed
- address libraries would come out looking quite good indeed. In a proper
- implementation of dynamic linking we would probably want the list of external
- symbols arranged in such a way that they take up a minimum amount of space in
- each binary and in such a way that the externals are easy to resolve quickly.
- If efficiency were no concern, we could probably just use the output from "ld
- -r" and build a mini-linker into crt0 to finish the job.
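
One way to make externals "easy to resolve quickly" is to keep the library's exported symbols sorted by name so that a mini-linker can binary search them. The C sketch below shows only that lookup step; the `struct export` layout and the `resolve` helper are assumptions, not a description of any existing implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One exported symbol: its name and the address it lives at. */
struct export {
    const char   *name;
    unsigned long address;
};

/* Comparison callback for bsearch: key is a name, elem is a table entry. */
static int cmp_export(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const struct export *)elem)->name);
}

/*
 * Look up `name` in `table`, which must be sorted by name.
 * Returns the symbol's address, or 0 if it is not exported.
 */
static unsigned long resolve(const struct export *table, size_t n,
                             const char *name)
{
    assert(table != NULL && name != NULL);
    const struct export *hit =
        bsearch(name, table, n, sizeof table[0], cmp_export);
    return hit ? hit->address : 0;
}
```

Each lookup costs on the order of log2(n) string comparisons, which is the kind of load-time cost the fixed-address scheme avoids entirely.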
-
- >2) Kernel mods. Dynamic shared libs can be done without kernel mods
- >depending on how code space is protected. Or you can use a mmap primitive
- >to speed things up. Or you can add additional kernel code to make it
- >all more efficient. Extensive kernel mods are not *required*.
-
- I had of course forgotten that the linking could be done by crt0.
- Nonetheless, there is some programming involved, either in the kernel or in
- crt0 before you can start to use dynamic linking.
-
- >
- >3) Larger binaries: Not significantly, and, perhaps, not at all. It depends
- >on the details. This should be weighed against the benefits.
-
- I doubt that any binary would come out no larger with dynamic
- linking, but I have no doubt that you could get the additional space down to
- very little.
-
- >4) More overhead to load a program. This also depends on the details.
- >On my SVR4 system the additional time varys depending on whether the library
- >has already been accessed by another process. For X programs, which access
- >about a dozen shared librarys, the time seems to be swamped by other factors,
- >such as widget creation. I don't notice it.
-
- Again, I don't know the grungy details on how things work under SVR4.
- I use it at work, and it seems fast enough to me, so it is obviously possible
- to do dynamic linking in a workable way. The question always boils down to the
- tradeoffs involved, and to what tools need to be developed in order to implement
- one scheme or the other. The biggest technical obstacle at the time for us was
- probably the assembler, although I think that we probably would have wanted to
- muck with the linker as well for efficiency. There were some people who felt
- that we should try and use the off-the-shelf "as" and "ld" from FSF instead of
- trying to maintain our own variant version.
-
- There was a fairly long debate about the whole thing, and in the end we
- realized that it would not be that tough to implement the fixed address type of
- libraries. Compatibility from one version to the next has always been the hard
- part about this type of implementation, and this is where we have been spending
- most of our effort to refine the process. In contrast, with dynamic linking I
- would imagine that most of the refinement would be in making it efficient,
- since version to version compatibility is relatively easy to provide once you
- have a basic operating principle that is functional.
-
- Anyway, we have been refining the concept for about 6 months, and we
- now have it to a point where the drawbacks are quite minimal. Given the proper
- tools it is not that tough to actually build a sharable jump-table type of
- library, although it may be true that it is a little easier to generate a
- dynamic linking type of library instead (this depends a lot on the
- implementation as well). If we had decided to go with dynamic linking in one
- way or another, we would have probably needed to spend more time upfront before
- we would have gotten anything out the door.
-
- -Eric
- --
- Eric Youngdale
-