NetNews Usenet Archive 1992 #18

Internet Message Format | 1992-08-18 | 4.9 KB

Path: sparky!uunet!wupost!darwin.sura.net!mips!swrinde!cs.utexas.edu!loyola!ross.COM!sparhawk!mitch
From: mitch@sparhawk.com (Mitch Alsup)
Newsgroups: comp.arch
Subject: Caches and Hashing
Summary: HP Hashing direct mapped caches probably wins
Message-ID: <1992Aug18.182506.23744@ross.COM>
Date: 18 Aug 92 18:25:06 GMT
Sender: news@ross.COM
Distribution: usa
Organization: Mitch Alsup Consulting
Lines: 84
Originator: mitch@sparhawk
Nntp-Posting-Host: sparhawk

john@netnews.jhuapl.edu (John Hayes) writes:

> There was an interesting idea in a recent article in Microprocessor
> Report. The article was describing HP's PA-RISC. Here is what caught
> my eye:
> "The caches are direct mapped to eliminate the performance
> impact of multiplexers that would be required with set-
> associativity. To reduce the cache miss rates, the addresses
> are hashed before they drive the cache SRAMS."*

> This raises several questions in my mind:
> 1) How much does hashing improve the miss rate of a direct-mapped
> cache?

As usual, your mileage may vary with application, O/S, and cache size. With 65KByte caches, many applications see only a 5% miss rate (and with a 10 cycle miss penalty they achieve only 50% of machine peak performance). With 1MByte caches one could expect a ~1% miss rate and therefore greater than 90% of machine peak performance.

Excellent hashing can be expected to achieve at most a 40% improvement (Moyers rule of SQRT(2)-1 for performance gains). This could allow the 65KB cache to achieve a 3% miss rate and 70% of peak performance, or the 1MB cache to achieve 94% of machine peak performance. These are considerable gains in the first case and useful gains in the second.

The 65KB cache PROBABLY would benefit LESS than the larger cache because it is operating closer to its intrinsic COMPULSORY miss rate (it is smaller), while the 1MB cache could benefit more IF the WORKING SET of the applicationS fits in the cache.
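The arithmetic behind these peak-performance figures can be sketched with a very simple stall model. This is only an illustration under assumptions that are not stated in the post: one cache access per cycle, every miss stalls the machine for the full penalty, and nothing else limits throughput. The function name and its parameters are hypothetical.

```python
def fraction_of_peak(miss_rate, miss_penalty_cycles, accesses_per_cycle=1.0):
    """Fraction of peak throughput if every miss stalls for the full penalty.

    Assumed model (not from the post): each cycle of useful work costs
    miss_rate * miss_penalty_cycles * accesses_per_cycle extra stall cycles.
    """
    stall_cycles = miss_rate * miss_penalty_cycles * accesses_per_cycle
    return 1.0 / (1.0 + stall_cycles)


MISS_PENALTY = 10          # cycles, as quoted in the post
MAX_HASH_GAIN = 0.40       # roughly SQRT(2)-1, the ceiling quoted in the post

for label, miss_rate in [("smaller cache", 0.05), ("1MB cache", 0.01)]:
    hashed_rate = miss_rate * (1.0 - MAX_HASH_GAIN)
    before = fraction_of_peak(miss_rate, MISS_PENALTY)
    after = fraction_of_peak(hashed_rate, MISS_PENALTY)
    print(f"{label}: {before:.0%} of peak unhashed, {after:.0%} with ideal hashing")
```

Under these assumptions the 1MB figures come out at about 91% and 94%, in line with the numbers quoted above. The smaller cache comes out at about 67% and 77%, somewhat better than the 50% and 70% quoted, so the post presumably allows for costs that this sketch leaves out.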
Also note that the difference in miss rate between a direct mapped configuration and a set associative configuration is considerably smaller for the larger cache than for the smaller cache.

> 2) The argument for direct mapped caches is that they have a better
> effective access time than set-associative caches because direct
> mapped caches do not need multiplexers to select the data. Would
> the cost of hashing eliminate this advantage? In other words:
> would a direct mapped cache with hashing or a set-associative
> cache (with associated multiplexers) give the best performance?

If one measures performance in nanoseconds per delivery of datums, it is quite possible that hashed accesses in direct mapped caches will perform better than set associative caches. In HP's case, this is reinforced by their use of commodity SRAMs external to the processor: they cannot, therefore, dink with the decoder to the RAM arrays. This should also be the case for internal VLSI RAM arrays, where the decoder can be set up to dink with the addresses (hash) with very little additional delay (100's of picoseconds). Conversely, the output buffers can be set up to dink with the addresses driven to the SRAMs and incur very little additional delay (.5 ns).

On the other hand, if the TAG section is designed so that it has faster access times than the Data section of a VLSI-like cache, the tag access and set comparison can overlap the data access. With this arrangement, the set selection can be driven as the data is sensed, and contributes negligibly to the access time (the 88200 uses this trick). This requires that the tag section be about 4 gate delays faster than the data section. Careful SPICE modeling is REQUIRED. Your mileage may vary. This trick does not work with commodity SRAM chips because the access time for the tags is commensurate with the access time for the data.

In any event, the gain from set associative caches diminishes with large external caches (greater than 65KBytes).

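To make "hashing the address" concrete, here is a minimal sketch of one common style of index hash: XOR-folding the upper address bits into the line index. It is not HP's function, which is not given here. The cache size, line size, function names, and access trace are all assumptions chosen for illustration.

```python
LINE_BYTES = 32                    # assumed line size
CACHE_BYTES = 64 * 1024            # assumed direct-mapped cache size
NUM_LINES = CACHE_BYTES // LINE_BYTES
OFFSET_BITS = LINE_BYTES.bit_length() - 1
INDEX_BITS = NUM_LINES.bit_length() - 1


def plain_index(addr):
    """Conventional direct-mapped index: low bits of the line address."""
    return (addr >> OFFSET_BITS) & (NUM_LINES - 1)


def hashed_index(addr):
    """Hypothetical hash: XOR the bits above the index into the index.

    The tag is still the upper bits, so (tag, index) identifies one line.
    """
    line = addr >> OFFSET_BITS
    return (line ^ (line >> INDEX_BITS)) & (NUM_LINES - 1)


def miss_count(addresses, index_fn):
    """Count misses in a direct-mapped cache using the given index function."""
    resident = {}                  # index -> line address currently held
    misses = 0
    for addr in addresses:
        idx = index_fn(addr)
        line = addr >> OFFSET_BITS
        if resident.get(idx) != line:
            misses += 1
            resident[idx] = line
    return misses


# Two arrays exactly one cache size apart, read alternately word by word.
A_BASE, B_BASE = 0x10000, 0x20000
trace = []
for i in range(1024):
    trace.append(A_BASE + 4 * i)
    trace.append(B_BASE + 4 * i)

print("plain index misses: ", miss_count(trace, plain_index))
print("hashed index misses:", miss_count(trace, hashed_index))
```

With the plain index the two arrays evict each other on every access. With the XOR fold they land on different lines, and only the first touch of each line misses. In hardware the fold is a single layer of XOR gates on the index bits, which fits the small added delay described above.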
> 3) What hash function does HP use?

Hopefully, one which places user code and user data in different cache locations than supervisor code and data. If they also attempt to place I/O pages and data base pages separately from these others, then I can see considerable gain from hashing.

On the other hand, why cannot the O/S and Linker conspire to place the pages in physical and logical locations so as to minimize these problems up front?

> I would be interested to hear from anyone who knows the answer
> to these questions from papers or first-hand knowledge. I will
> summarize to the net.

> * Case, Brian, "HP Reveals Superscalar PA-RISC Implementation",
> Microprocessor Report, Vol. 6, No. 4, March 25, 1992, p. 17.

-------------------------------------------------------------------------------------
Mitch Alsup                          Currently at: mitch@ross.com
An Independent Consultant            (512)-263-5086 Evenings
Disclaimer: I speak for myself.
-------------------------------------------------------------------------------------