- Path: sparky!uunet!noc.near.net!mars.caps.maine.edu!maine.maine.edu!ree700a
- Organization: University of Maine System
- Date: Monday, 27 Jul 1992 10:43:17 EDT
- From: <REE700A@MAINE.MAINE.EDU>
- Message-ID: <92209.104317REE700A@MAINE.MAINE.EDU>
- Newsgroups: comp.sys.ibm.pc.hardware
- Subject: Re: RAM Speed!! (Help Wanted)
- References: <1992Jul23.123703.5170@ncsu.edu>
- Lines: 34
-
- OK, your 386DX (if cached), 486 SuX or whatever uses a 128 bit cache fill.
- In other words, every time a byte is requested & isn't there, the 16 bytes
- of memory (a paragraph) containing that address are dragged into cache. In
- a 32 bit machine, this is done by four successive reads of 4 bytes. (This
- should point out why most quality motherboards have FOUR banks of 32 bit
- memory!!!!!!!)
- The 386 & better do this by requesting one memory location per clock cycle
- even though the previously requested bank is still processing its request.
- Let's examine a hypothetical 486DX50 with a 20 nS clock cycle....
- Cycle 1: Address is not in cache. Send request for cache fill (bytes 0-3).
- Cycle 2: Check if memory responded (<40 nS SIMM?!).
-          Send request for bytes 4-7 (regardless of status of bytes 0-3).
- Cycle 3: Check if memory responded (<60 nS SIMM...).
-          Send request for bytes 8-11 .....
- Cycle 4: Read your 80 nS SIMMs (bytes 0-3). Request bytes 12-15.
-          Look-ahead generates request for another cache fill...
- Cycle 5: Read bytes 4-7, waiting for refresh on bytes 0-3...
- Cycle 6: Read bytes 8-11, waiting for refresh on bytes 0-7...
- Cycle 7: Read bytes 12-15, wait for refresh (140 nS) on bytes 0-11.
- Cycle 8: Wait for refresh on all bytes (160 nS).
- Cycle 9: Similar to cycle 1...
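
The cycle list above can be sketched as a small Python schedule. This is only an illustration under the hypothetical numbers used here (20 nS clock, 80 nS access, 160 nS full memory cycle, four banks of 4 bytes); the function and field names are made up for the example, and real bus timing has more states than this.

```python
# Sketch of the four-way interleaved cache line fill from the timeline above.
# Assumed numbers (the hypothetical 486DX50 case): 20 nS clock, 80 nS access,
# 160 nS full memory cycle, 4 banks of 4 bytes each.
CLOCK_NS = 20
ACCESS_NS = 80
CYCLE_NS = 160
BANKS = 4
BYTES_PER_BANK = 4


def fill_schedule(start_cycle=1):
    """Return one dict per bank: when it is requested, read, and free again."""
    access_clocks = ACCESS_NS // CLOCK_NS   # 4 clocks until data is valid
    cycle_clocks = CYCLE_NS // CLOCK_NS     # 8 clocks until the bank can restart
    schedule = []
    for bank in range(BANKS):
        request = start_cycle + bank         # one new request per clock
        read = request + access_clocks - 1   # data valid by the end of this clock
        free = request + cycle_clocks        # first clock the bank can be asked again
        first_byte = bank * BYTES_PER_BANK
        schedule.append({
            "bank": bank,
            "bytes": (first_byte, first_byte + BYTES_PER_BANK - 1),
            "request": request,
            "read": read,
            "free": free,
        })
    return schedule


if __name__ == "__main__":
    for row in fill_schedule():
        lo, hi = row["bytes"]
        print(f"bytes {lo:2d}-{hi:2d}: request in cycle {row['request']}, "
              f"read in cycle {row['read']}, bank free in cycle {row['free']}")
```

With these numbers the first bank is requested in cycle 1, read in cycle 4, and free again in cycle 9, which is why cycle 9 looks like cycle 1.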
-
- Thus, a four way interleave with address pipelining and a 128 bit cache
- line fill (16 bytes) allows the four banks to operate at an average of 2
- clock cycles per access (four 4-byte reads every 8 clocks), even when the
- memory cycle (access + refresh) is as slow as 160 nS. (Roughly; I know I
- neglected the tag access time of circa 5 nS and some other little odds &
- ends.)
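
A back-of-envelope check of the "2 clock cycles per access" figure, as a hedged Python sketch. It assumes the same 20 nS clock and 160 nS memory cycle, ignores the tag lookup and other odds & ends, and uses a made-up helper name; the single-bank case is an added comparison, not a figure from the discussion above.

```python
# Rough average cost of one 4-byte access when bank cycles overlap.
# Assumed numbers: 20 nS clock, 160 nS memory cycle, 4 interleaved banks.
def clocks_per_access(clock_ns, cycle_ns, banks):
    """Average clocks per 4-byte access when `banks` banks overlap their cycles."""
    clocks_per_memory_cycle = cycle_ns / clock_ns   # 160 / 20 = 8
    return clocks_per_memory_cycle / banks


if __name__ == "__main__":
    interleaved = clocks_per_access(20, 160, 4)
    single_bank = clocks_per_access(20, 160, 1)
    print(f"4 banks: {interleaved} clocks ({interleaved * 20} nS) per access")
    print(f"1 bank:  {single_bank} clocks ({single_bank * 20} nS) per access")
```

Under these assumptions the interleaved case averages 2 clocks (40 nS) per access, while a single bank would need the full 8 clocks (160 nS).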
- So, to answer the question: no, you are not pushing the SIMMs (nor can
- you; the CPU waits for them & does not determine the speed, as was done in
- the 286 & earlier CPUs). The key here is that the instruction processor is
- asynchronous with respect to the memory management unit. A heavy dose of
- pipeline magic allows the appearance of very fast memory. So, what's the
- catch? Well, if you branch outside of the cache and around the address
- look-ahead, you will see a full 80 nS of access time and 160 nS cycle (or
- whatever). This is why branch instructions are evil!!!!!!!!
-