ARM Club 3

Text File | 1997-09-03 | 3KB | 60 lines

Readme for NewFP:
-=-=-=-=-=-=-=-=-

NewFP is simply a hacked version of the FPEmulator. It will work on any RPC and its main reason for creation was getting Quake to run faster. On average it speeds up FP operations by 50-100%. Quake now runs at an average of 3-4 fps compared to 2fps! :(

A complete rewrite is really what's needed, but I couldn't be bothered as I know no one would buy it. I'm off to PC land now anyway ...

A small history:
-=-=-=-=-=-=-=-=

The FPEmulator as in RO3.7 and every RO version before (yes, it's the same code :( ) was originally written in 1984-1985 on an emulated Arm platform. It was not written for RO, but for ARX, the precursor to RO (long story). Anyway, ARX got axed, and Arthur was cobbled together quickly. Essentially the FPEmulator code was taken, made into a relocatable module, and stuck on the Arthur floppy disc. Various bug fixes ensued, and some optimisation was done for the RO2 release.

An interesting note here is that this explains the use of the TRANS pin on Arm2's in the FP code. ARX had different memory maps for each processor state using MEMC's abilities in this regard, i.e. in supervisor mode everything was in a different place than in user mode etc. The TRANS pin (utilised using LDRT or STRT and STMIA ... ^ when PC isn't in the list etc) forces a memory map context change to user mode, i.e. the access is made using the user mode memory map. RO never had this feature, and hence the TRANS pin was never needed nor did RO code use the TRANS pin.

Anyway, the FP emulator was then played with for the Arm2 FP unit (no one bought it), and also when the Arm7 FPU was being dreamed of. The main hacks were the replacement of the trig routines with other FP instructions (not clever) as the envisaged FPU would only do adds and multiplies (the two most common operations). Also for RO3.5 and the Arm6 a fudge was applied to change the environment from 32 bit Arm6 to an emulation of the 26 bit Arm2/3. This code still gets executed every time a FP instruction happens.

Even leaving that aside, the original FP code wasn't exactly quick anyway. All operations, irrespective of the precision you specify, are carried out to extended precision. Yes, all 96 bits worth. And then (best of all) it goes through a rounding process to round the result to the precision you asked for. This happens for *every* FP op.

Now Quake only requires single precision, sometimes double. It doesn't require long double. However, I'm not about to rewrite the FP emulator, so instead I hacked the code to stop rounding on exit and instead just truncate. I didn't bother replacing SIN and COS with their original (much faster) code though. Someone else can bother.

Anyway, this simple change makes a huge difference. It goes much quicker. But as I now have a PC with a Millennium II and GLQuake, I don't really care anymore. I'm off to play Quake at 30fps with bilinear filtered graphics ... :)

Anyway, the notes and hatchet BASIC file are all enclosed. Enjoy, someone!

Cheers,
Niall Douglas.
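
For anyone wondering what "truncate instead of round" costs in accuracy, here is a minimal sketch in Python. It is not the NewFP code and not how the FPEmulator is written; it only illustrates the two ways of narrowing a wider result to single precision. Assumptions: a 64-bit double stands in for the emulator's extended format, and the names narrow_round and narrow_truncate are made up for this example.

```python
import struct


def narrow_round(x: float) -> float:
    """Narrow a double to single precision with round-to-nearest.

    struct's 'f' format does the rounding when it packs the value.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


def narrow_truncate(x: float) -> float:
    """Narrow a double to single precision by chopping low mantissa bits.

    Hypothetical helper: it only handles zero and values in the normal
    single-precision range; anything else raises ValueError.
    """
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)

    if exponent == 0 and mantissa == 0:
        single_bits = sign << 31  # signed zero
    else:
        # Re-bias the exponent from double (1023) to single (127).
        single_exp = exponent - 1023 + 127
        if not 1 <= single_exp <= 254:
            raise ValueError("value outside the normal single-precision range")
        # Keep the top 23 of the 52 mantissa bits and discard the rest.
        single_bits = (sign << 31) | (single_exp << 23) | (mantissa >> 29)
    return struct.unpack("<f", struct.pack("<I", single_bits))[0]


if __name__ == "__main__":
    for value in (0.1, 1.0, -0.1):
        rounded = struct.pack(">f", narrow_round(value)).hex()
        chopped = struct.pack(">f", narrow_truncate(value)).hex()
        print(value, rounded, chopped)
```

For 0.1 the two results differ only in the last bit of the mantissa (0x3dcccccd rounded versus 0x3dcccccc truncated), while exactly representable values such as 1.0 come out identical. Truncation is never off by more than one unit in the last place, which is the kind of error a game like Quake can live with.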