- Path: sparky!uunet!mcsun!uknet!mucs!cs.man.ac.uk!endecotp
- From: endecotp@cs.man.ac.uk (PB Endecott (PhD SFurber))
- Newsgroups: comp.arch
- Subject: Re: why no register + register addressing mode in R3000 (repost)
- Message-ID: <endecotp.721675619@cs.man.ac.uk>
- Date: 13 Nov 92 17:26:59 GMT
- References: <lg5i5oINN1q4@exodus.Eng.Sun.COM>
- Sender: news@cs.man.ac.uk
- Lines: 53
-
- tremblay@flayout.Eng.Sun.COM (Marc Tremblay) writes:
-
-
- >Here is the percentage of loads and stores for SPECint92 benchmarks
- >running on SPARC which use the two register addressing mode [RX + RY]:
-
- >benchmark   loads   stores
- >---------   -----   ------
- >espresso    37%     9.8%
- >li          2.5%    2.8%
- >eqntott     76.4%   5.4%
- >compress    27%     7.3%
- >sc          13.5%   7.9%
- >gcc         18.9%   37.5%
-
- Interesting numbers, thank you.
-
- Taking the unweighted mean over the six benchmarks, I calculate that 11.8%
- of stores and 29.2% of loads use a two-register addressing mode.
-
- Using numbers from Hennessy & Patterson fig 4.34 (sorry I don't have them
- for the SPEC benchmarks), 8% of instructions in dynamic execution are
- stores, and 18% are loads. So, (8% x 11.8%) = 0.9% of instructions are
- stores that use two registers, and (18% x 29.2%) = 5.3% of instructions are
- loads that use two registers; a total of 6.2% of instructions use
- two-register addressing modes.
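
The arithmetic above can be reproduced with a short sketch. It assumes, as the post appears to, that each benchmark counts equally in the average (no weighting by how many loads or stores each one executes) and that the 18% / 8% instruction mix from Hennessy & Patterson applies to these programs. The variable names are illustrative only.

```python
# Percent of loads and stores using [RX + RY], as quoted from Marc Tremblay.
# Tuple order: (loads, stores).
two_reg_pct = {
    "espresso": (37.0, 9.8),
    "li": (2.5, 2.8),
    "eqntott": (76.4, 5.4),
    "compress": (27.0, 7.3),
    "sc": (13.5, 7.9),
    "gcc": (18.9, 37.5),
}

# Dynamic instruction mix used in the post (percent of all instructions).
# Assumption: the H&P mix is representative of SPECint92 on SPARC.
LOAD_MIX = 18.0
STORE_MIX = 8.0


def mean(values):
    """Unweighted arithmetic mean; every benchmark counts equally."""
    values = list(values)
    return sum(values) / len(values)


avg_loads = mean(loads for loads, _ in two_reg_pct.values())
avg_stores = mean(stores for _, stores in two_reg_pct.values())

# Share of ALL instructions that are two-register loads or stores.
load_share = LOAD_MIX * avg_loads / 100.0
store_share = STORE_MIX * avg_stores / 100.0
total_share = load_share + store_share

print(f"avg two-register loads:  {avg_loads:.1f}% of loads")
print(f"avg two-register stores: {avg_stores:.1f}% of stores")
print(f"two-register loads:      {load_share:.1f}% of instructions")
print(f"two-register stores:     {store_share:.1f}% of instructions")
print(f"total:                   {total_share:.1f}% of instructions")
```

A mean weighted by each benchmark's actual load and store counts could give a different answer; those counts are not in the quoted table, so the sketch does not attempt it.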
-
- On an architecture without this mode, assuming that each of these
- operations would need a separate ADD to calculate the address, the number
- of instructions needed to do the same work would increase by 6.2%. Does
- avoiding that 6.2% overhead justify an extra register read port?
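
To make the size of the effect concrete, here is a rough sketch of the instruction-count ratio. It assumes exactly one extra ADD per two-register access and, to turn instruction counts into time, the same cycles per instruction and clock rate on both machines. Neither assumption is established in this thread, and the function name is hypothetical.

```python
def relative_instruction_count(two_reg_fraction, adds_per_access=1.0):
    """Instruction count without reg+reg addressing, relative to 1.0 with it.

    Assumption: every two-register load/store is replaced by
    `adds_per_access` extra ADD instructions plus the original access.
    """
    return 1.0 + two_reg_fraction * adds_per_access


# 6.2% of instructions use the two-register mode (figure from the post).
ratio = relative_instruction_count(0.062)

# Seen from the other side: how many instructions the mode saves.
saved = 1.0 - 1.0 / ratio

print(f"without the mode: {ratio:.3f}x the instructions")
print(f"with the mode:    {saved * 100:.1f}% fewer instructions")
```

The two views differ slightly: a 6.2% increase when the mode is removed corresponds to about 5.8% fewer instructions when it is present.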
-
- The answer depends on the actual cost of adding the read port, and other
- factors such as the impact on code density of the extra code.
-
- Question: does the SPARC do both sorts of addressing in the same number of
- cycles, or does it use an extra cycle for reading the extra register?
-
- The 6.2% is a maximum; clever compiler techniques (for example, computing
- an address once and reusing it for several accesses) should reduce this.
-
- Consider a processor which implements register+register for loads (where
- the extra port is available) but not for stores, where the port is needed
- for the data. I believe that the HP-PA does this. From the above numbers,
- 0.9% of instructions need an extra ADD for address calculations compared to
- a machine with register+register for loads and stores. Is this a good
- compromise?
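
The three design points can be compared side by side. This sketch reuses the rounded figures from the post (5.3% of instructions are two-register loads, 0.9% are two-register stores) and again assumes one extra ADD per access that lacks the mode. The function and constant names are made up for illustration.

```python
# Percent of all instructions, rounded figures taken from the post.
TWO_REG_LOADS = 5.3
TWO_REG_STORES = 0.9


def extra_add_overhead(indexed_loads, indexed_stores):
    """Percent of extra ADD instructions for a given choice of modes.

    Assumption: each access that cannot use reg+reg costs one ADD.
    """
    overhead = 0.0
    if not indexed_loads:
        overhead += TWO_REG_LOADS
    if not indexed_stores:
        overhead += TWO_REG_STORES
    return overhead


designs = {
    "no reg+reg at all": (False, False),
    "reg+reg loads only": (True, False),
    "reg+reg loads and stores": (True, True),
}

for name, (loads, stores) in designs.items():
    print(f"{name:26s} {extra_add_overhead(loads, stores):.1f}% extra ADDs")
```

On these numbers the loads-only design recovers most of the benefit, because loads account for the bulk of the two-register accesses.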
-
- In a superscalar implementation, you have lots of register ports. For
- example, you may have four read ports so that you can do two ALU operations
- simultaneously. You could then do a store with two registers for the
- address at the same time as a monadic ALU operation!
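
The port counting behind that example can be written down as a small check. The read-port costs below are assumptions for a simple register file (one port per register operand read in the same cycle), and the operation names are hypothetical rather than taken from any real ISA.

```python
# Assumed register read ports needed per operation.
READ_PORTS = {
    "alu_dyadic": 2,     # two source registers
    "alu_monadic": 1,    # one source register
    "load_reg_imm": 1,   # base register only
    "load_reg_reg": 2,   # base + index
    "store_reg_imm": 2,  # base + data
    "store_reg_reg": 3,  # base + index + data
}


def fits(issue_group, available_ports=4):
    """True if the operations can all read their registers in one cycle."""
    needed = sum(READ_PORTS[op] for op in issue_group)
    return needed <= available_ports


# Two ordinary ALU operations use all four ports.
print(fits(["alu_dyadic", "alu_dyadic"]))
# A two-register store plus a monadic ALU operation also fits in four.
print(fits(["store_reg_reg", "alu_monadic"]))
# A two-register store plus a dyadic ALU operation would need five.
print(fits(["store_reg_reg", "alu_dyadic"]))
```

This ignores write ports, bypassing and any issue restrictions, so it only illustrates the read-port budget being discussed.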
-
- --Phil.
-