NetNews Usenet Archive 1992 #26


Internet Message Format | 1992-11-12 | 2.5 KB

Xref: sparky comp.arch:10675 comp.lang.forth:3485
Path: sparky!uunet!know!cass.ma02.bull.com!mips2!news.bbn.com!usc!zaphod.mps.ohio-state.edu!darwin.sura.net!Sirius.dfn.de!Urmel.Informatik.RWTH-Aachen.DE!messua!dak
From: dak@messua.informatik.rwth-aachen.de (David Kastrup)
Newsgroups: comp.arch,comp.lang.forth
Subject: Re: What's RIGHT with stack machines
Message-ID: <dak.721621337@messua>
Date: 13 Nov 92 02:22:17 GMT
References: <Bx5AIr.EAy.2@cs.cmu.edu> <1992Nov4.103008.2641@Informatik.TU-Muenchen.DE> <MIKE.92Nov9004026@guam.vlsivie.tuwien.ac.at> <id.D6UU.5Z@ferranti.com> <lg0eheINNs7l@exodus.Eng.Sun.COM>
Sender: news@Urmel.Informatik.RWTH-Aachen.DE (Newsfiles Owner)
Organization: Rechnerbetrieb Informatik / RWTH Aachen
Lines: 35
Nntp-Posting-Host: messua

>>I predict that before too long all high performance commodity micros will do
>>scheduling at runtime.

>I don't think the situation is as clear-cut as you describe it.
>There are certain scheduling techniques that tend to work well no
>matter where you use them -- as long as you have enough registers, it
>doesn't hurt to stick a few instructions between a load into a
>register and the subsequent use of that register. On superscalar
>machines, it is generally a bad idea to do too many of exactly the
>same thing in a big lump (i.e., ld, ld, ld or fadd, fadd, fadd), and
>if you have the option of mixing things up a bit, you should.
>Increasing the size of basic blocks (through code replication,
>typically) is another trick for helping most machines, since branches
>often stall pipelines.

>These are general rules, and they won't yield optimum performance, but
>you must trade them off against the costs of generating
>implementation-specific code. Those costs include
> (1) less sharing of text and libraries
> (2) lots of cache flushing and
> (3) scheduling and register allocation are not necessarily cheap.

See the MIPS processors (Microprocessor without Interlocked Pipeline Stages) for a clever design idea: they have simply left out the hardware pipeline interlocks. If you start an instruction that reads a register which a previous instruction has not yet finished writing, there is no stall; the OLD value is used. So the compiler/assembler has to insert nops by hand in order to prevent register clashes.

On the other hand, this means the interlocking can be done by the compiler instead of by the hardware, and you can use the die space thus gained for other purposes (bigger caches, etc.). And because of the Harvard bus architecture, there is no difference between waiting for an interlock and fetching a nop. Only problem: code tends to get longer.
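
To make the nop insertion concrete, here is a minimal sketch of what such an assembler pass might look like. It is not taken from any real MIPS toolchain: the `Instr` tuple, the register names and the single-slot load delay (`lw` followed by one instruction that must not read the loaded register) are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical, simplified instruction: opcode, destination, sources."""
    op: str
    dest: Optional[str]
    srcs: Tuple[str, ...]


NOP = Instr("nop", None, ())


def insert_nops(code: List[Instr]) -> List[Instr]:
    """Insert a nop after a load whose result is read by the next instruction.

    Assumes a one-instruction load delay slot: without an interlock, the
    instruction right after an `lw` would still see the OLD register value.
    """
    out: List[Instr] = []
    for instr in code:
        prev = out[-1] if out else None
        if prev is not None and prev.op == "lw" and prev.dest in instr.srcs:
            out.append(NOP)  # software "interlock": pad the delay slot
        out.append(instr)
    return out


program = [
    Instr("lw", "r1", ("r10",)),       # r1 <- mem[r10]
    Instr("add", "r2", ("r1", "r3")),  # reads r1 too early: needs a nop
    Instr("lw", "r4", ("r11",)),       # r4 <- mem[r11]
    Instr("sub", "r5", ("r2", "r3")),  # independent of r4: fills the slot
    Instr("add", "r6", ("r4", "r5")),  # r4 is ready by now
]

scheduled = insert_nops(program)
for ins in scheduled:
    print(ins.op, ins.dest or "", *ins.srcs)
```

Only the first load needs padding here; the second one already has an independent instruction behind it, which is exactly the reordering a scheduling compiler tries to find so that the code does not grow.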