- Path: sparky!uunet!cs.utexas.edu!news-is-not-mail
- From: newton@cs.utexas.edu (Peter Newton)
- Newsgroups: comp.sys.mac.misc
- Subject: Virtual memory defeats IIsi cache size performance trick (long)
- Date: 7 Sep 1992 21:33:45 -0500
- Organization: CS Dept, University of Texas at Austin
- Lines: 124
- Message-ID: <18h3e9INNauk@mohawk.cs.utexas.edu>
- NNTP-Posting-Host: mohawk.cs.utexas.edu
-
- What follows is a long discussion of the "large cache" trick for
- speeding up the Mac IIsi. The bottom line is that the trick appears
- to be defeated by virtual memory. Details follow.
-
- First some background for those who are not aware of this. The Mac
- IIsi contains 1 MB of RAM that is soldered to the motherboard. This
- RAM is dual-ported to the system bus and to the onboard video. The
- result is that accesses to this memory by the CPU contend with those
- from the video. Hence this memory is effectively extra slow. A
- program executing in this memory can run at less than half speed in
- the worst case. An approach to solve this problem has been to devote
- this memory to
-
- 1. Video RAM
- 2. Disk cache (even the slow memory is fast relative to a disk).
-
- This way, the slow memory is entirely occupied so your programs can
- never run in it. You do this by setting the disk cache to a "large"
- value. See below for how large.
-
- The dual-ported memory is physically at address 0. This trick works
- because Apple uses the 68030's MMU to map it to the high end of a
- contiguous block of memory.
-
- An aside. Another solution to this problem is to run your IIsi set to
- use less than 256 colors. Black and white is best. This reduces the
- video's demands on the dual-ported RAM. However, I *like* 256 colors
- so I always run at that setting.
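
As a rough illustration of why fewer colors help, the sketch below computes the frame buffer size at a given color depth and the number of bytes the video circuitry has to fetch per second. It assumes a 640x480 display whose whole frame is read once per refresh. The function names and the refresh rate parameter are hypothetical and are not taken from any Apple documentation.

```c
/* Bytes occupied by one frame at the given bits per pixel. */
static long frame_buffer_bytes(long width, long height, long bits_per_pixel)
{
    return width * height * bits_per_pixel / 8;
}

/* Bytes the video must read per second. This assumes (an assumption, not
   a measured figure) that the whole frame is fetched once per refresh.
   The caller supplies the refresh rate in Hz. */
static long video_fetch_rate(long frame_bytes, long refresh_hz)
{
    return frame_bytes * refresh_hz;
}
```

At 640x480, frame_buffer_bytes gives 38400 bytes for black and white (1 bit per pixel) and 307200 bytes for 256 colors (8 bits per pixel), so the video's traffic into the shared RAM grows eightfold between the two settings.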
-
- Anyway, I became curious about two things. First, how big must the
- disk cache be to use up all of the dual-ported RAM? Second, does
- virtual memory screw up the page mapping and defeat the trick?
-
- To explore, I wrote a quick and dirty little program that allocates
- all the RAM it can and times writes to various locations in memory to
- look for slow ones. One expects it to find slow memory at the top of
- the address space, if the disk cache is too small, and this is what
- happens. Writes to the dual-ported RAM take 2.9 times as long as
- writes to the regular RAM, assuming the video is set to display 256
- colors.
-
- How big must the disk cache be? Measurement suggests that the formula
- that John Bruner sent to me is at least close to correct.
-
- Set Disk Cache = 1 MB - VideoRAM - SpaceUsedByInits
-
- VideoRAM = 640*480 bytes = 300 KB for 256 colors. Inits allocate space in
- high (and so slow) memory. If you use no inits, SpaceUsedByInits will
- be zero and you need a disk cache of at least 724 KB. My inits
- allocate about 500 KB so I can set the disk cache to a much smaller
- value. Some people on the net have said that 384 KB is enough. That
- is not strictly correct. It depends on your inits. There is a lot
- of folklore in this.
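
To make the arithmetic concrete, here is a minimal C sketch of the formula. The function names are hypothetical, and clamping the result at zero is an added assumption for the case where video and inits already fill the whole 1 MB.

```c
/* Frame buffer size in KB for a given resolution and bits per pixel. */
static long video_ram_kb(long width, long height, long bits_per_pixel)
{
    return (width * height * bits_per_pixel / 8) / 1024;
}

/* Disk cache size in KB needed to fill the rest of the 1 MB (1024 KB)
   of dual-ported RAM, following the formula above:
   cache = 1 MB - VideoRAM - SpaceUsedByInits.
   Returns 0 if video and inits already use it all (an assumption). */
static long disk_cache_kb(long video_kb, long init_kb)
{
    long cache = 1024 - video_kb - init_kb;
    return cache > 0 ? cache : 0;
}
```

For 256 colors at 640x480, video_ram_kb(640, 480, 8) returns 300. With no inits, disk_cache_kb(300, 0) is 724, the figure quoted above. With about 500 KB of inits, disk_cache_kb(300, 500) is 224.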
-
- The next question. What happens when virtual memory is turned on?
- The answer is unpleasant. The slow memory appears to no longer be
- mapped to the high end of the address space so the trick stops
- working. Even with a disk cache of 768 KB, I find slow memory
- sprinkled throughout the address space-- appearing wherever pages
- happen to be mapped. This means that if you must use virtual memory,
- you may as well use normal disk cache sizes and live with the fact
- that performance will suffer (perhaps a lot) whenever something
- important happens to run in the dual-ported RAM.
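
The difference between the two cases (slow memory confined to the top of the address space versus slow memory scattered through it) can be expressed as a check over a list of timings. This is only a sketch: the names, the threshold parameter, and the choice of the fastest sample as the baseline are assumptions for illustration and are not taken from the measurement program.

```c
#include <stddef.h>

enum layout { NO_SLOW, SLOW_AT_TOP, SLOW_SCATTERED };

/* ticks[i] is the write time at the i-th sampled address, lowest address
   first. A sample counts as slow if it exceeds threshold times the
   fastest sample. Limitation: if every sample is equally slow, nothing
   is flagged, because there is no fast baseline to compare against. */
static enum layout slow_layout(const long *ticks, size_t n, double threshold)
{
    size_t i;
    long fastest;
    int seen_slow = 0;

    if (n == 0)
        return NO_SLOW;

    fastest = ticks[0];
    for (i = 1; i < n; i++)
        if (ticks[i] < fastest)
            fastest = ticks[i];

    for (i = 0; i < n; i++) {
        int slow = ticks[i] > threshold * fastest;
        if (slow)
            seen_slow = 1;
        else if (seen_slow)
            return SLOW_SCATTERED;  /* a fast sample above a slow one */
    }
    return seen_slow ? SLOW_AT_TOP : NO_SLOW;
}
```

With virtual memory off and too small a disk cache, one would expect SLOW_AT_TOP. The behavior described above with virtual memory on corresponds to SLOW_SCATTERED.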
-
- This annoys me a bit since I cannot afford to buy big SIMMs right now
- and turn off virtual. Maybe Apple (hint!) or someone who knows how
- will write an init that automatically allocates all of the dual-ported
- RAM not used by video to something where speed is not necessary-- like
- the disk cache, even when virtual is on. I think I would buy big SIMMs
- before solving the problem by buying a video card.
-
- There is much room for error in the kind of measurements I made so I
- am not going to be dogmatic and declare myself infallible. Instead, I
- will attach the measurement part of the code I used. Mem is a pointer
- to the address being timed. I do not think that I was just seeing
- page faults because I measure each location twice. The fault would
- affect only the first. My system is a IIsi with 5 MB RAM, both Sys
- 6.0.8 and tuned 7.0.1 tried. 256 colors always. 8 MB of virtual RAM
- when enabled. I have never turned 32 bit mode on.
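
The two-pass reasoning can be written down as a small decision rule. This is a sketch only: the names, the threshold, and the idea of comparing against a known fast timing are illustrative assumptions, not part of the program used for the measurements.

```c
enum verdict { FAST, PAGE_FAULT, SLOW_MEMORY };

/* first and second are the timings of two back-to-back passes over the
   same location; fast_ticks is the timing of a location known to be in
   regular RAM. A page fault is a one-time cost, so it can inflate only
   the first pass. Slow memory inflates both passes. */
static enum verdict classify(long first, long second,
                             long fast_ticks, double threshold)
{
    int first_slow = first > threshold * fast_ticks;
    int second_slow = second > threshold * fast_ticks;

    if (second_slow)
        return SLOW_MEMORY;   /* still slow after the page is resident */
    if (first_slow)
        return PAGE_FAULT;    /* slow once, then fast */
    return FAST;
}
```

A location whose second pass is still slow cannot be explained by a page fault alone, which is the basis for the argument above.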
-
- If you compile this, turn off all optimization and ensure that Mem and
- j are in registers. I disassembled the code to ensure that I had
- Think C 5.0.1 doing the right thing. Anyone know how the 030's little
- data cache works? It appears to affect reads but not writes in this
- context.
-
- Start1 = clock();
- for (j = 0; j < 50000; j++) {
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
- }
- Start2 = clock();
- for (j = 0; j < 50000; j++) {
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
-     *Mem = j;
- }
- printf("%ld %ld %ld\n", Mem - Base, Start2 - Start1, clock() - Start2);
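
For anyone who wants to try something similar with a current compiler, here is a hedged sketch of a harness around the fragment above. It is not the original program. The names time_writes and scan_block and the stride parameter are hypothetical. It uses volatile instead of disabling optimization so that the compiler keeps all 14 stores, and it prints byte offsets rather than Mem - Base.

```c
#include <stdio.h>
#include <stddef.h>
#include <time.h>

/* Time `iterations` rounds of 14 writes to one location, in clock()
   ticks. volatile keeps the compiler from merging or removing stores. */
static clock_t time_writes(volatile long *mem, long iterations)
{
    clock_t start = clock();
    long j;

    for (j = 0; j < iterations; j++) {
        *mem = j; *mem = j; *mem = j; *mem = j; *mem = j;
        *mem = j; *mem = j; *mem = j; *mem = j; *mem = j;
        *mem = j; *mem = j; *mem = j; *mem = j;
    }
    return clock() - start;
}

/* Probe one location every `stride` bytes in [base, base + size).
   Each location is timed twice, as in the fragment above, and one line
   is printed per location: byte offset, first pass, second pass.
   base must be aligned for long and stride must be a nonzero multiple
   of sizeof(long). Returns the number of locations probed. */
static long scan_block(char *base, size_t size, size_t stride,
                       long iterations)
{
    size_t off;
    long probed = 0;

    if (stride == 0)
        return 0;

    for (off = 0; off + sizeof(long) <= size; off += stride) {
        volatile long *mem = (volatile long *)(base + off);
        clock_t first = time_writes(mem, iterations);
        clock_t second = time_writes(mem, iterations);

        printf("%lu %ld %ld\n", (unsigned long)off,
               (long)first, (long)second);
        probed++;
    }
    return probed;
}
```

A caller would obtain a block with malloc and pass it to scan_block with, say, 50000 iterations. On a machine with a write-back cache these stores may never reach main memory, so the numbers would not mean what they meant on a IIsi.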
- --
- ----
- Peter Newton (newton@cs.utexas.edu)
-