NetNews Usenet Archive 1992 #20

Internet Message Format | 1992-09-08 | 5.4 KB

Path: sparky!uunet!cs.utexas.edu!news-is-not-mail
From: newton@cs.utexas.edu (Peter Newton)
Newsgroups: comp.sys.mac.misc
Subject: Virtual memory defeats IIsi cache size performance trick (long)
Date: 7 Sep 1992 21:33:45 -0500
Organization: CS Dept, University of Texas at Austin
Lines: 124
Message-ID: <18h3e9INNauk@mohawk.cs.utexas.edu>
NNTP-Posting-Host: mohawk.cs.utexas.edu

What follows is a long discussion of the "large cache" trick for speeding up the Mac IIsi. The bottom line is that the trick appears to be defeated by virtual memory. Details follow.

First some background for those who are not aware of this. The Mac IIsi contains 1 MB of RAM that is soldered to the motherboard. This RAM is dual-ported to the system bus and to the onboard video. The result is that accesses to this memory by the CPU contend with those from the video. Hence this memory is effectively extra slow. A program executing in this memory can run at less than half speed in the worst case.

An approach to solve this problem has been to devote this memory to

  1. Video RAM
  2. Disk cache (even the slow memory is fast relative to a disk).

This way, the slow memory is entirely occupied so your programs can never run in it. You do this by setting the disk cache to a "large" value. See below for how large. The dual-ported memory is physically at address 0. This trick works because Apple uses the 68030's MMU to map it to the high end of a contiguous block of memory.

An aside. Another solution to this problem is to run your IIsi set to use less than 256 colors. Black and white is best. This reduces the video's demands on the dual-ported RAM. However, I *like* 256 colors so I always run at that setting.

Anyway, I became curious about two things. First, how big must the disk cache be to use up all of the dual-ported RAM? Second, does virtual memory screw up the page mapping and defeat the trick?

To explore, I wrote a quick and dirty little program that allocates all the RAM it can and times writes to various locations in memory to look for slow ones. One expects it to find slow memory at the top of the address space if the disk cache is too small, and this is what happens. Writes to the dual-ported RAM take 2.9 times as long as writes to the regular RAM, assuming the video is set to display 256 colors.

How big must the disk cache be? Measurement suggests that the formula that John Bruner sent to me is at least close to correct. Set

    Disk Cache = 1 MB - VideoRAM - SpaceUsedByInits

    VideoRAM = 640*480 bytes = 300 KB for 256 colors (one byte per pixel).

Inits allocate space in high (and so slow) memory. If you use no inits, SpaceUsedByInits will be zero and you need a disk cache of at least 1024 KB - 300 KB = 724 KB. My inits allocate about 500 KB so I can set the disk cache to a much smaller value. Some people on the net have said that 384 KB is enough. That is not strictly correct. It depends on your inits. There is a lot of folklore in this.

The next question. What happens when virtual memory is turned on? The answer is unpleasant. The slow memory appears to no longer be mapped to the high end of the address space, so the trick stops working. Even with a disk cache of 768 KB, I find slow memory sprinkled throughout the address space, appearing wherever pages happen to be mapped.

This means that if you must use virtual memory, you may as well use normal disk cache sizes and live with the fact that performance will suffer (perhaps a lot) whenever something important happens to run in the dual-ported RAM. This annoys me a bit since I cannot afford to buy big SIMMs right now and turn off virtual memory.

Maybe Apple (hint!) or someone who knows how will write an init that automatically allocates all of the dual-ported RAM not used by video to something where speed is not necessary, like the disk cache, even when virtual memory is on. I think I would buy big SIMMs before solving the problem by buying a video card.

There is much room for error in the kind of measurements I made, so I am not going to be dogmatic and declare myself infallible. Instead, I will attach the measurement part of the code I used. Mem is a pointer to the address being timed. I do not think that I was just seeing page faults because I measure each location twice. A fault would affect only the first measurement.

My system is a IIsi with 5 MB RAM; both System 6.0.8 and a tuned 7.0.1 were tried. 256 colors always. 8 MB of virtual RAM when enabled. I have never turned 32-bit mode on.

If you compile this, turn off all optimization and ensure that Mem and j are in registers. I disassembled the code to ensure that I had Think C 5.0.1 doing the right thing. Anyone know how the 030's little data cache works? It appears to affect reads but not writes in this context.

    /* First timed pass: 50000 iterations of 14 writes to *Mem. */
    Start1 = clock();
    for (j = 0; j < 50000; j++) {
        *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j;
        *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j;
    }

    /* Second pass over the same location; a page fault would hit only the first. */
    Start2 = clock();
    for (j = 0; j < 50000; j++) {
        *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j;
        *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j; *Mem = j;
    }

    /* Offset from the start of the block, then the two pass times in clock ticks. */
    printf("%ld %ld %ld\n", Mem - Base, Start2 - Start1, clock() - Start2);

--
---- Peter Newton (newton@cs.utexas.edu)