- Xref: sparky comp.arch:10531 comp.lang.forth:3459
- Newsgroups: comp.arch,comp.lang.forth
- Path: sparky!uunet!email!vlsivie.tuwien.ac.at!mike
- From: mike@vlsivie.tuwien.ac.at (Michael Gschwind)
- Subject: Re: What's RIGHT with stack machines
- In-Reply-To: pazsan@Informatik.TU-Muenchen.DE's message of Wed, 4 Nov 1992 10:30:08 GMT
- Message-ID: <MIKE.92Nov9004026@guam.vlsivie.tuwien.ac.at>
- Sender: news@email.tuwien.ac.at
- Nntp-Posting-Host: guam.vlsivie.tuwien.ac.at
- Organization: Vienna University of Technology
- References: <Bx5AIr.EAy.2@cs.cmu.edu> <1992Nov4.103008.2641@Informatik.TU-Muenchen.DE>
- Date: Sun, 8 Nov 1992 23:40:26 GMT
- Lines: 50
-
- In article <1992Nov4.103008.2641@Informatik.TU-Muenchen.DE> pazsan@Informatik.TU-Muenchen.DE (Bernd Paysan) writes:
- Thanks for your interesting report. I have little to add.
-
- Code size and performance: smaller code for the same operations means less bus
- bandwidth is required. As buses (and DRAMs) are today's von Neumann
- bottlenecks, stack CPUs reduce the cost of main memory and second-level caches and
- may increase performance at equal cost.
-
- Yes, but you pay a heavy price. Stack machines basically compute
- expressions of the type
- ((((...) op A) op B) op C)
- so each operation depends on the previous one (if nothing else, all
- ops operate on the stack, so there is the stack pointer dependency).
- Superscalar execution or even deep pipelining is totally out of the question.
- I used to like stack machines. They are beautiful and simple, and they make
- for easy compilers, BUT THEY JUST DID NOT SCALE - neither in modern design
- technologies nor when it comes to compilers.
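
To make the dependency argument concrete, here is a minimal sketch that measures the longest chain of data dependencies in an instruction sequence. It assumes the naive model described above, where every stack instruction reads and writes one implicit stack pointer; the names `critical_path`, `register_code` and `stack_code` are made up for illustration, and hardware that renames or caches the top of stack would behave differently.

```python
def critical_path(instrs):
    """Return the length of the longest true-dependency chain.

    instrs is a list of (dest, sources) pairs in program order.
    """
    ready = {}    # name -> depth at which its latest value becomes available
    longest = 0
    for dest, sources in instrs:
        depth = 1 + max((ready.get(s, 0) for s in sources), default=0)
        ready[dest] = depth
        longest = max(longest, depth)
    return longest

# Register code for (a + b) + (c + d): four independent loads, then adds.
register_code = [
    ("a", []), ("b", []), ("c", []), ("d", []),
    ("t1", ["a", "b"]),
    ("t2", ["c", "d"]),
    ("t3", ["t1", "t2"]),
]

# Stack code for the same expression: a b + c d + +
# Assumption: each instruction depends on the implicit stack pointer "sp".
stack_ops = ["a", "b", "+", "c", "d", "+", "+"]
stack_code = [("sp", ["sp"]) for _ in stack_ops]

print(critical_path(register_code))  # 3
print(critical_path(stack_code))     # 7
```

Both sequences have seven instructions, but under this model the register version exposes a chain of only three, which is the parallelism a superscalar or deeply pipelined machine can exploit.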
-
- Compiler complexity: Tanenbaum had a compiler project that did much of its optimizing
- on intermediate stack machine code. I think there is little to do to convert
- this code for a stack machine, whereas it takes a lot of work to allocate
- registers and schedule instructions, as you must on "conventional" RISC
- processors.
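
As a rough illustration of why code generation for a stack machine is considered easy, a post-order walk of an expression tree already yields valid stack code. This is only a sketch: the tuple encoding of expressions and the function name `to_stack_code` are assumptions made for the example, not anything taken from the compiler project mentioned above.

```python
def to_stack_code(expr):
    """Emit stack code for an expression tree by post-order traversal.

    A leaf is a string (an operand to push); an inner node is a tuple
    (op, left, right).
    """
    if isinstance(expr, str):
        return [expr]                      # push the operand
    op, left, right = expr
    return to_stack_code(left) + to_stack_code(right) + [op]

# (a + b) * c
print(to_stack_code(("*", ("+", "a", "b"), "c")))  # ['a', 'b', '+', 'c', '*']
```

No register allocation or scheduling pass is involved; the order of the traversal is the order of the instructions.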
-
- Yes, for simple arithmetic expressions, they allow way cool
- optimizations. But what do you do when you get to things like CSE
- (common subexpression elimination)? DUP DUP SWAP SWAP ROT? Hardly
- efficient! 2 memory accesses for DUP, 4 for SWAP, 6 for ROT. You simply
- have to shuffle intermediate results too much. Or save them in memory -
- definitely more expensive than a register. Once again, with the
- technology of 10 years ago, they were nice, but it does pay to allocate
- registers and do scheduling, AND WE HAVE THE TECHNOLOGY NOW to do it.
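
The access counts quoted above can be reproduced with a small model. The sketch below assumes the whole stack lives in memory, that every cell read or written costs one access, and that moving the stack pointer is free; `CountingStack` is a hypothetical name for this example only.

```python
class CountingStack:
    """Stack held entirely in memory, counting cell reads and writes."""

    def __init__(self, items=()):
        self.cells = list(items)   # end of the list is the top of stack
        self.accesses = 0

    def _read(self, depth):
        self.accesses += 1
        return self.cells[-1 - depth]

    def _write(self, depth, value):
        self.accesses += 1
        self.cells[-1 - depth] = value

    def dup(self):                 # ( x -- x x )
        top = self._read(0)
        self.cells.append(None)    # stack pointer bump, no memory access
        self._write(0, top)

    def swap(self):                # ( x1 x2 -- x2 x1 )
        x2, x1 = self._read(0), self._read(1)
        self._write(0, x1)
        self._write(1, x2)

    def rot(self):                 # ( x1 x2 x3 -- x2 x3 x1 )
        x3, x2, x1 = self._read(0), self._read(1), self._read(2)
        self._write(2, x2)
        self._write(1, x3)
        self._write(0, x1)

s = CountingStack([1, 2, 3])
for word in (s.dup, s.dup, s.swap, s.swap, s.rot):   # DUP DUP SWAP SWAP ROT
    word()
print(s.accesses)  # 18
```

Under these assumptions the five shuffle words cost 2 + 2 + 4 + 4 + 6 = 18 memory accesses without computing anything, whereas a register machine would simply name the value again.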
-
- Spill&Fill: has anyone thought about automatic spill&fill buffers? (If the
- number of available stack cells drops to a certain level, a number of stack items are
- loaded (4 or 8 would be great), and if it rises above a certain level, the same
- number of stack items is spilled; have a hysteresis of about 8 stack entries.)
-
- I figure implementing this is just too hairy - on the same real
- estate, you can put 16 extra registers ;)
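
For readers who want to see the spill&fill scheme quoted above in motion, here is a software sketch of it. The watermarks (`low=4`, `high=12`, giving a hysteresis of 8) and the block size of 4 are assumptions picked to match the numbers in the quote, and `SpillFillStack` is a hypothetical name; this models the behaviour only and says nothing about hardware cost.

```python
class SpillFillStack:
    """On-chip stack buffer that spills to / fills from memory in blocks."""

    def __init__(self, low=4, high=12, block=4):
        self.on_chip = []   # buffered cells; end of the list is the top
        self.memory = []    # spilled cells; end of the list is nearest the buffer
        self.low, self.high, self.block = low, high, block
        self.transfers = 0  # cells moved between buffer and memory

    def push(self, value):
        self.on_chip.append(value)
        if len(self.on_chip) > self.high:
            # Spill the deepest buffered cells to memory.
            self.memory.extend(self.on_chip[:self.block])
            del self.on_chip[:self.block]
            self.transfers += self.block

    def pop(self):
        value = self.on_chip.pop()
        if len(self.on_chip) < self.low and self.memory:
            # Fill the buffer from the most recently spilled cells.
            n = min(self.block, len(self.memory))
            self.on_chip[:0] = self.memory[-n:]
            del self.memory[-n:]
            self.transfers += n
        return value

stack = SpillFillStack()
for i in range(20):
    stack.push(i)
popped = [stack.pop() for _ in range(20)]
print(popped == list(range(19, -1, -1)), stack.transfers)  # True 16
```

Because the two watermarks are 8 entries apart, a push followed by a pop near either threshold does not trigger a transfer in each direction, which is the point of the hysteresis.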
-
- mike
- --
-
- Michael Gschwind, Dept. of VLSI-Design, Vienna University of Technology
- mike@vlsivie.tuwien.ac.at 1-2-3-4 kick the lawsuits out the door
- (currently somewhere in 5-6-7-8 innovate don't litigate
- the Bay Area, back sometime 9-A-B-C interfaces should be free
- at the end of this year) D-E-F-O look and feel has got to go!
-
-