home *** CD-ROM | disk | FTP | other *** search
- Path: sparky!uunet!olivea!mintaka.lcs.mit.edu!ai-lab!zurich.ai.mit.edu!jinx
- From: jinx@zurich.ai.mit.edu (Guillermo J. Rozas)
- Newsgroups: comp.arch
- Subject: Re: Threaded Code
- Message-ID: <JINX.92Aug14171357@rolex.ai.mit.edu>
- Date: 14 Aug 92 21:13:57 GMT
- References: <55535@mentor.cc.purdue.edu> <13910@goanna.cs.rmit.oz.au>
- <GLEW.92Aug7185506@pdx007.intel.com>
- <1992Aug10.141731.10007@email.tuwien.ac.at> <martin.713808550@bert>
- Sender: news@ai.mit.edu
- Reply-To: jinx@zurich.ai.mit.edu
- Organization: M.I.T. Artificial Intelligence Lab.
- Lines: 51
- In-reply-to: martin@math.rwth-aachen.de's message of 14 Aug 92 16:09:10 GMT
-
- In article <martin.713808550@bert> martin@math.rwth-aachen.de ( Martin Schoenert) writes:
-
- | Date: 14 Aug 92 16:09:10 GMT
- | From: martin@math.rwth-aachen.de ( Martin Schoenert)
- |
- | So I wrote my own tests. First I defined a virtual (stack) machine with
- | 28 instructions (not just 2 as in Anton's tests). For this virtual
- | machine I wrote a small test program, which computes the number of primes
- | less than 40000 in a very straightforward way. Then I wrote and tested
- | the following 6 implementations of this virtual machine.
- |
- | ...
- |
- | If a compiler performed those optimizations the implementations of
- | bytecode and direct threaded interpreters such as 'swi' and 'sub' would
- | be just as efficient as the implementations using gcc's label pointers.
- | Such optimizations might also be worthwile in other situations where
- | label pointers would not help at all.
-
- Actually, I'm not convinced. I don't know what your bytecode is like,
- but I can tell you that our interpreter lost a factor of 2 in speed
- when recoded in C (and compiled with gcc) from the original 68k
- version.
-
- One of the major sources of this performance difference was not the
- instruction dispatch, which after all, a switch can do moderately
- well, and where you are inherently dispatching on a compact type code,
- but on two other places where first-class labels would be useful (we
- used addresses of instructions in machine language), namely:
-
- - "return addresses" into the interpreter.
- Our interpreter is one big loop and manipulates its own stack. When
- "invoking itself recursively", it pushes a "return address" on the
- stack. The 68k version just pushed the address of the return
- instruction, as you would expect. The C version pushes a "return
- code" that we dispatch to (with a switch) on return.
-
- If you wonder why we don't use C function calls, the answer is pretty
- straight-forward:
-
- x Function call overhead, as you noticed yourself.
- x Implicit state. Scheme has first-class continuations which imply
- that you must be able to capture the stack of pending operations. The
- code for doing this with the C function-call stack would not be
- portable.
-
- - "primitive procedures" of the language. Again, a primitive of the
- language is naturally represented as the address of its first
- instruction in assembly language. In portable C you would have to use
- a table of procedures (with its associated cost) or another switch
- statement.
-