1997-09-03
Readme for NewFP:
-=-=-=-=-=-=-=-=-
NewFP is simply a hacked version of the FPEmulator. It will work on any RPC
and its main reason for creation was getting Quake to run faster. It
speeds up FP operations by 50-100% on average. Quake now runs at
an average of 3-4 fps compared to 2 fps! :(
A complete rewrite is really what's needed, but I couldn't be bothered as
I know no one would buy it. I'm off to PC land now anyway ...
A small history:
-=-=-=-=-=-=-=-=
The FPEmulator as in RO3.7 and every RO version before (yes, it's the same
code :( ) was originally written in 1984-1985 on an emulated Arm platform. It
was not written for RO, but for ARX, the precursor to RO (long story).
Anyway, ARX got axed, and Arthur was cobbled together quickly. Essentially
the FPEmulator code was taken, made into a relocatable module, and stuck on
the Arthur floppy disc.
Various bug fixes ensued, and some optimisation was done for the RO2 release.
An interesting note here is that this explains the use of the TRANS pin on
Arm2s in the FP code. ARX had different memory maps for each processor state,
using MEMC's abilities in this regard, i.e. in supervisor mode everything was
in a different place than in user mode etc. The TRANS pin (utilised using LDRT
or STRT, and STMIA ... ^ when PC isn't in the list etc) forces a memory map
context change to user mode, i.e. the access uses the user mode memory map.
RO never had this feature, and hence the TRANS pin was never needed, nor did
RO code use the TRANS pin.
Anyway, the FP emulator was then played with for the Arm2 FP unit (no one
bought it), and also when the Arm7 FPU was being dreamed of. The main hacks
were the replacement of the trig routines with other FP instructions (not
clever) as the envisaged FPU would only do adds and multiplies (the two most
common operations). Also for RO3.5 and the Arm6 a fudge was applied to change
the environment from 32 bit Arm6 to an emulation of the 26 bit Arm2/3. This
code still gets executed every time an FP instruction happens.
Never mind this, but the original FP code wasn't exactly quick anyway. All
operations, irrespective of the precision you specify, are carried out to
extended precision. Yes, all 96 bits worth. And then (best of all) it goes
through a rounding process to round the result to the precision you asked
for. This happens for *every* FP op.
Now Quake only requires single precision, sometimes double. It doesn't
require long double. However, I'm not about to rewrite the fp emulator, so
instead I hacked the code to stop rounding on exit and instead just truncate.
I didn't bother replacing SIN and COS with their original (much faster) code
though. Someone else can bother.
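To show the difference between rounding and truncating a wider result, here
is a rough sketch in Python. It is not the NewFP code (that is Arm assembler
working from extended precision); it assumes IEEE double as the "wide" format
and IEEE single as the precision asked for, and the function names are made
up for the illustration.

```python
import struct

def to_single_rounded(x: float) -> float:
    """Narrow a double to single precision with round-to-nearest.

    struct's 'f' format does the rounding for us (illustrative helper)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_single_truncated(x: float) -> float:
    """Narrow a double to single precision by chopping the extra bits.

    A double has 52 fraction bits and a single has 23, so clearing the
    low 29 bits truncates towards zero. Assumes x is a normal number
    within single-precision range (illustrative helper)."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    bits &= ~((1 << 29) - 1)  # keep only the top 23 fraction bits
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

if __name__ == "__main__":
    x = 1.0 / 3.0
    print(to_single_rounded(x), to_single_truncated(x))
```

Truncation is a single mask, whereas rounding has to inspect the discarded
bits and possibly carry into the result, which is why skipping it saves work
on every FP op. The cost is an error of up to one unit in the last place
instead of half a unit.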
Anyway, this simple change makes a huge difference. It goes much quicker. But
as I now have a PC with a Millennium II and GLQuake, I don't really care
anymore. I'm off to play Quake at 30fps with bilinear filtered graphics ...
:)
Anyway, the notes and hatchet basic file are all enclosed. Enjoy, someone!
Cheers,
Niall Douglas.