- Path: sparky!uunet!usc!sdd.hp.com!hp-cv!ogicse!das-news.harvard.edu!spdcc!iecc!compilers-sender
- From: chased@rbbb.Eng.Sun.COM (David Chase)
- Newsgroups: comp.compilers
- Subject: Re: Pros and cons of high-level intermediate languages
- Keywords: translator, design
- Message-ID: <92-07-069@comp.compilers>
- Date: 21 Jul 92 18:01:44 GMT
- Article-I.D.: comp.92-07-069
- References: <92-07-064@comp.compilers>
- Sender: compilers-sender@iecc.cambridge.ma.us
- Reply-To: chased@rbbb.Eng.Sun.COM (David Chase)
- Organization: Sun Microsystems, Mt. View, Ca.
- Lines: 89
- Approved: compilers@iecc.cambridge.ma.us
-
- ssp@csl36h.csl.ncsu.edu (Santosh Pande) writes:
- > I am interested in knowing the pros and cons of using an
- >intermediate language (IL) in general. In particular I find 'C' has been
- >used extensively as the IL in many situations: Modula, SISAL, AT&T Cfront
- >for C++ etc.
-
- > Some of the reasons I can see in favor of such an approach are:
- > (1) Ease of portability (a feature of C that made it so popular!),
-
- yes.
-
- > (2) Easy retargettability (one can patch the run-time support with a
- >customized library for the given target architecture easily), and,
-
- yes.
-
- > (3) Relative ease of mapping a given intermediate form (IF) to C's
- >data structures.
-
- so-so. The closer the original language is to C, the better. As soon as
- you have a non-C concept (e.g., continuations, guaranteed tail calls,
- garbage collection, or threads) things begin to get a bit rougher. Note
- that there are existence proofs demonstrating that these things can be
- done, but the non-C concepts are a lot harder to translate.
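
To make the "harder to translate" point concrete, here is a minimal sketch of one well-known workaround for guaranteed tail calls in portable C, a trampoline. It is not taken from any compiler mentioned in this post; the names `Step`, `State`, `sum_step`, and `sum_to` are hypothetical. Each generated function returns the next function to run instead of calling it, so the C stack does not grow no matter how long the chain of tail calls is.

```c
#include <assert.h>
#include <stddef.h>

struct State { unsigned long long n; unsigned long long acc; };

/* A function pointer cannot name its own type directly, so it is
   wrapped in a struct. */
struct Step { struct Step (*fn)(struct State *); };

/* Source-level idea: sum(n, acc) = if n = 0 then acc else sum(n-1, acc+n).
   The tail call is expressed as data: return the next step. */
static struct Step sum_step(struct State *s) {
    struct Step next;
    assert(s != NULL);
    if (s->n == 0) {
        next.fn = NULL;          /* finished: no further step */
        return next;
    }
    s->acc += s->n;
    s->n -= 1;
    next.fn = sum_step;          /* "tail call" to self */
    return next;
}

/* The trampoline: a plain loop that runs steps until none is left. */
unsigned long long sum_to(unsigned long long n) {
    struct State s;
    struct Step step;
    s.n = n;
    s.acc = 0;
    step.fn = sum_step;
    while (step.fn != NULL)
        step = step.fn(&s);
    return s.acc;
}
```

The cost is an indirect call and a return per step, which is one reason such translations tend to run slower than a native tail call would.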
-
- Note well that my experience (Modula-3) and the experience of friends who
- have done similar things indicates that you should dive to a fairly low
- level in the generated C. In particular, you should consider generating
- "cast-ful" code, unless you have an extremely good command of exactly what
- happens when you combine types of various flavors in arithmetic.
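
A small sketch of the kind of surprise that "cast-ful" output avoids. The function names are hypothetical and are not from the Modula-3 translator. Suppose the source language divides a signed integer by an unsigned one and expects the mathematical result.

```c
#include <assert.h>

/* Naive translation: the usual arithmetic conversions turn i into an
   unsigned int before dividing, so a negative i becomes a huge
   positive number. */
long long div_naive(int i, unsigned int n) {
    assert(n != 0);
    return i / n;
}

/* Cast-ful translation: the generator states the intended operand type
   explicitly. This assumes long long is wider than int and unsigned int,
   which holds on common implementations but is not guaranteed by C. */
long long div_castful(int i, unsigned int n) {
    assert(n != 0);
    return (long long)i / (long long)n;
}
```

With `i = -6` and `n = 2`, the cast-ful version yields -3, while the naive version yields a large positive value that depends on the width of `unsigned int`.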
-
- > However, such an approach might also suffer from:
- > (1) Debugging is hellish,
-
- It's not as bad as you might think (for the compiler-writer, that is).
- This depends largely on you, and largely on the language that you are
- compiling -- for Modula-3, since all pointer types were tagged, we were
- able to generate P_<typename> subroutines that did an excellent job of
- formatting data structures. We tinkered with insertion of #line
- directives (back to the M3 source), and sometimes it succeeded
- wonderfully, but too often it would be subtly misleading at just the wrong
- time. I tried not to mangle variable names excessively, and that helped.
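
For readers who have not seen these two devices, a minimal sketch follows. The record type, the file name `Geometry.m3`, and the function names are all hypothetical; only the `P_<typename>` naming pattern and the use of `#line` come from the description above.

```c
#include <assert.h>
#include <stdio.h>

/* A record as the translator might emit it, with an unmangled name. */
struct Point { int x; int y; };

/* P_<typename> formatter: writes a readable rendering of the value
   into buf and returns the number of characters snprintf reports. */
int P_Point(char *buf, size_t size, const struct Point *p) {
    assert(buf != NULL && p != NULL);
    return snprintf(buf, size, "Point{x=%d, y=%d}", p->x, p->y);
}

/* Tell the C compiler (and so the debugger) that the next line of C
   corresponds to line 17 of the original source file. */
#line 17 "Geometry.m3"
int source_line_of_move(void) { return __LINE__; }
```

A debugger that can call functions can then invoke `P_Point` on a value of interest, and diagnostics for `source_line_of_move` are reported against `Geometry.m3` rather than the generated C file. When one source line expands into many C lines, the mapping is only approximate, which is one way it can become misleading.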
-
- > (2) Efficiency is dictated to a large degree by the target C
- >Compiler,
-
- Yes and no. This depends a lot on the code that you generate. For
- Olivetti M3 I assumed that we couldn't rely on the optimizer, because
- we were using garbage collection and exception handling (implemented with
- setjmp and longjmp), and so I took great care in the code that I
- generated.
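
A minimal sketch of why setjmp/longjmp-based exceptions demand care in the generated code. This is not the Olivetti M3 scheme; the single `current_handler` pointer and the function names are hypothetical simplifications. The key detail is that a local modified between `setjmp` and `longjmp` has an indeterminate value afterwards unless it is declared `volatile`, which in turn keeps the optimizer from holding it in a register.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Innermost active handler; a real run-time would keep one per thread. */
static jmp_buf *current_handler;

static void raise_exception(int code) {
    assert(current_handler != NULL && code != 0);
    longjmp(*current_handler, code);
}

/* Hypothetical translation of:
     TRY count := count + 1; RAISE E; EXCEPT E => RETURN count END */
int try_example(void) {
    jmp_buf handler;
    jmp_buf *saved = current_handler;  /* set before setjmp, never changed */
    volatile int count = 0;            /* changed after setjmp, read after longjmp */
    int result;

    current_handler = &handler;
    if (setjmp(handler) == 0) {
        count = count + 1;
        raise_exception(7);
        result = -1;                   /* not reached */
    } else {
        result = count;                /* well-defined only because count is volatile */
    }
    current_handler = saved;           /* pop the handler */
    return result;
}
```

Here `try_example()` returns 1. Without `volatile`, the C standard leaves the value of `count` in the handler indeterminate.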
-
- > (3) Translation in some situations might involve clumsy data
- >structures and thus loss of efficiency.
-
- It's not clear what you mean here. Compile-time efficiency? (it sucks,
- generally) Run-time efficiency? (it depends.)
-
- > Now my questions:
- > (1) I am looking for examples in which using C as IL will lead to
- >inefficiencies,
-
- Aliasing analysis. You already know about Fortran. Other places where
- this can happen include references to pieces of descriptors that you "know"
- won't change because they are part of your run-time. You can get around
- this by pre-loading everything that won't change into a local variable,
- but this could lead to a Big Surprise for the target C compiler. ("Big
- Surprise" means potentially slow compilation and/or slow execution, or
- (worse) overflowed internal tables.)
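
A minimal sketch of the pre-loading workaround, using a hypothetical open-array descriptor (`ArrayDesc` and both function names are assumptions made for illustration). In the naive version the C compiler may have to reload the descriptor fields on every iteration, because it cannot prove that a store through `out` leaves the descriptor untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-time descriptor for an open array. */
struct ArrayDesc { long *data; long length; };

/* Naive: d->length and d->data are re-read inside the loop, since a
   store to out[i] could, as far as C knows, modify *d. */
void scale_naive(const struct ArrayDesc *d, long *out, long k) {
    long i;
    assert(d != NULL);
    for (i = 0; i < d->length; i++)
        out[i] = d->data[i] * k;
}

/* Pre-loaded: the translator "knows" the descriptor is stable, so it
   copies the fields into locals once, outside the loop. */
void scale_preloaded(const struct ArrayDesc *d, long *out, long k) {
    long *data;
    long n, i;
    assert(d != NULL);
    data = d->data;
    n = d->length;
    for (i = 0; i < n; i++)
        out[i] = data[i] * k;
}
```

Done for every descriptor field in a large procedure, this produces a great many live locals, which is where the surprise for the target C compiler can come from.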
-
- Also, the lack of (portable) register globals, or (portable) lightweight
- access to thread-local storage can screw you up. If you want to write a
- compacting garbage collector, the intermediate C compiler introduces
- uncertainty as to just where the pointers are stored in the activation
- records, and what has been done to them. There are other clever tricks
- that you might want to try (special code generation for critical sections,
- special code generation for exception handling) that are completely
- off-limits if you use C as an IL.
-
- > (3) I want to learn about the efforts to evolve efficient ILs
- >(just like IFs) for procedural languages.
-
- What's the difference between an IL and an IF?
-
- David Chase
- Sun
- --
- Send compilers articles to compilers@iecc.cambridge.ma.us or
- {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.
-