- Newsgroups: comp.arch
- Path: sparky!uunet!elroy.jpl.nasa.gov!sdd.hp.com!ux1.cso.uiuc.edu!csrd.uiuc.edu!sp90.csrd.uiuc.edu!grout
- From: grout@sp90.csrd.uiuc.edu (John R. Grout)
- Subject: Re: How does an R4000-style cache work?
- Message-ID: <1993Jan7.201733.16338@csrd.uiuc.edu>
- Keywords: Write-back Cache
- Sender: news@csrd.uiuc.edu
- Reply-To: j-grout@uiuc.edu
- Organization: UIUC Center for Supercomputing Research and Development
- References: <1993Jan6.235455.25425@Princeton.EDU>
- Date: Thu, 7 Jan 93 20:17:33 GMT
- Lines: 82
-
- awolfe@moo.Princeton.EDU (Andrew Wolfe) writes:
-
- >I am unclear on the timing involved when write misses occur on a write-back
- >cache (like in the R4000). Perhaps someone can clarify.
-
- >When a write is attempted - a virtual index is used to select a cache line
- >and a virtual offset selects bytes within that line. This happens during
- >the DF stage in the R4000. The memory can then be accessed during the DS
- >stage. I assume that the tag is read at this time. If a write is allowed
- >to occur at this time - it may destroy data in the cache.
-
- >The tag is checked at the next stage (TC). If the tags match - all is OK -
- >but if the tags do not match, then I have written over the previous data in
- >the cache. If I delay stores until after the tag check - then the memory is
- >busy with another instruction in the pipeline.
-
-
- >I can conceive of several possible solutions:
-
- >1) Always save a copy of the overwritten line (for writeback).
-
- >2) Delay the write - then simultaneously read and write the cache.
- > (write from instruction in TC, read from instruction in MS)
-
- >3) Delay the write - then stall the pipeline on a hit or miss if the next
- > instruction is a load.
-
- >The first two require a heavy-duty dual-ported cache. The third requires
- >stalls that I cannot find documented.
-
- >What method is really used to solve this problem in the R4000 and similar
- >write-back caches?
-
- From "The MIPS R4000 Processor", by Mirapuri, Woodacre and Vasseghi, IEEE
- Micro (April 1992), pp. 10-22.
-
- Figure 1 (p. 11) is the "R4000 internal block diagram" which illustrates a
- "write buffer" and a "store buffer/aligner".
-
- From the "CPU pipeline" paragraph (pp. 11-12):
-
- "In the data first (DF) and data second (DS) stages, the R4000
- accesses the data cache, with a new access starting every cycle.
- The MMU translates the data virtual address into a physical address
- during these stages. In the tag check (TC) stage, the R4000 compares
- the data tags from the cache tag array with the translated address to
- determine if the data cache access was a hit. For stores, if the tag
- check passes in TC, the data travel to the store buffer and enter the
- data cache the next time cache bandwidth is available."
-
- From the "Stalls, slips and exceptions" paragraph (pp. 14-17):
-
- "The different stall types include:
-
- Data cache miss, detected by the data tag check
-
- Data first stage stalls, which can occur for
- three mutually exclusive groups of
- instructions...
-
- 3) The pipeline stalls to let the store buffer entries retire
- to memory because control logic has detected a load to the
- same memory location."
-
- So, the indicated solution is:
-
- 4) Put the data in a write buffer until tag check.
-
- On a cache miss, stall the pipeline and write the data to cache as part
- of miss processing.
-
- On a cache hit, put the data in the store buffer and unload it when:
-
- a) The cache bandwidth is available: no pipeline effects.
- b) A load which wants to use the results of the store is
- detected: stall the pipeline and empty the store buffer.
- c) The store buffer is full: not clear (is this handled like case "b"?)
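
To make the three cases concrete, here is a toy Python model of the scheme as I read it from the article. It is only a sketch, not the R4000's actual control logic: the class name, the one-word direct-mapped lines, the buffer depth, and the stall counter are all made up for illustration. Two behaviours are assumptions on my part: a full store buffer is handled like case "b" (stall and drain), and any miss drains the buffer before the refill so that a pending store can never land in a replaced line.

```python
from collections import deque


class ToyStoreBufferCache:
    """Toy direct-mapped write-back cache with a small store buffer.

    Hypothetical model for illustration only; one word per cache line.
    """

    def __init__(self, num_lines=4, buffer_depth=2):
        self.num_lines = num_lines
        self.buffer_depth = buffer_depth
        self.tags = [None] * num_lines   # None means the line is invalid
        self.data = [0] * num_lines
        self.dirty = [False] * num_lines
        self.memory = {}                 # backing store: address -> value
        self.store_buffer = deque()      # pending (address, value) pairs
        self.stalls = 0                  # number of stall events, not cycles

    def _index_tag(self, addr):
        return addr % self.num_lines, addr // self.num_lines

    def _hit(self, addr):
        index, tag = self._index_tag(addr)
        return self.tags[index] == tag

    def _drain_one(self):
        # Retire the oldest buffered store into the data array.
        addr, value = self.store_buffer.popleft()
        index, _ = self._index_tag(addr)
        self.data[index] = value
        self.dirty[index] = True

    def _drain_all(self):
        while self.store_buffer:
            self._drain_one()

    def _refill(self, addr):
        # Assumption: drain first so no pending store targets the victim line.
        self._drain_all()
        index, tag = self._index_tag(addr)
        if self.tags[index] is not None and self.dirty[index]:
            victim_addr = self.tags[index] * self.num_lines + index
            self.memory[victim_addr] = self.data[index]   # write back
        self.tags[index] = tag
        self.data[index] = self.memory.get(addr, 0)
        self.dirty[index] = False

    def store(self, addr, value):
        if not self._hit(addr):
            # Miss: stall, refill, and write the data during miss processing.
            self.stalls += 1
            self._refill(addr)
            index, _ = self._index_tag(addr)
            self.data[index] = value
            self.dirty[index] = True
            return
        if len(self.store_buffer) >= self.buffer_depth:
            # Case c, ASSUMED to behave like case b: stall and drain.
            self.stalls += 1
            self._drain_all()
        self.store_buffer.append((addr, value))

    def load(self, addr):
        if any(pending == addr for pending, _ in self.store_buffer):
            # Case b: load wants a buffered store, so stall and drain.
            self.stalls += 1
            self._drain_all()
        if not self._hit(addr):
            self.stalls += 1
            self._refill(addr)
        index, _ = self._index_tag(addr)
        return self.data[index]

    def idle_cycle(self):
        # Case a: spare cache bandwidth retires one store with no stall.
        if self.store_buffer:
            self._drain_one()
```

A store that hits never touches the data array until it is drained, which is why the tag check can finish before any cached data is overwritten. In this model a store followed by an idle cycle costs nothing, while a store followed immediately by a load of the same address records one stall.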
-
- --
- John R. Grout j-grout@uiuc.edu
- University of Illinois, Urbana-Champaign
- Center for Supercomputing Research and Development
-