- Newsgroups: comp.unix.bsd
- Path: sparky!uunet!cis.ohio-state.edu!zaphod.mps.ohio-state.edu!wupost!darwin.sura.net!uvaarpa!cv3.cv.nrao.edu!laphroaig!cflatter
- From: cflatter@nrao.edu (Chris Flatters)
- Subject: Re: Jolitz 386BSD-0.1 -- floating point perform
- Message-ID: <1992Jul24.161646.22896@nrao.edu>
- Sender: news@nrao.edu
- Reply-To: cflatter@nrao.edu
- Organization: NRAO
- References: <l6qc51INN1gu@neuro.usc.edu>
- Date: Fri, 24 Jul 1992 16:16:46 GMT
- Lines: 50
-
- In article <l6qc51INN1gu@neuro.usc.edu>, merlin@neuro.usc.edu (merlin) writes:
- >I have most of the US Army BRLCAD three dimensional CSG modeling and
- >distributed ray tracing system ported to the Jolitz 386BSD-0.1. But,
- >I am getting only about one fifth of the floating point performance
- >previously measured using AT&T pcc and GNU gcc 1.4x on ATT UNIX SYSV.
- >
- >Does the compiler default to '387 emulation? Is there some flag which
- >needs to be set to actually use the coprocessor? Or are there reasons
- >386BSD-0.1 would exhibit relatively poor floating point performance?
-
- I ran some checks last night and 386BSD is certainly exploiting the coprocessor.
- These are the results from the Plum2 benchmark (see section 8.2 of "C++
- Programming Guidelines" by Thomas Plum and Dan Saks). The results are
- the average time for a register int, auto short, auto long and auto double
- operation and the average time to call and return from an empty function.
- Times are in nominal microseconds (CLOCKS_PER_SEC was missing from <time.h>
- so I guessed a value of 100 --- I now think that it should have been 60).
- The tests were performed on a CompuAdd 325s (25MHz 80386SX CPU) with a
- Cyrix 83S87 FasMath coprocessor.
-
-                  register  auto     auto     function  auto
-                  int       short    long     call+ret  double
- 386BSD gcc       0.178     0.448    0.474    1.62      4.94
- 386BSD gcc -O    0.159     0.207    0.159    1.75      3.37
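-
- For readers who want to reproduce this kind of measurement, here is a rough
- sketch of clock()-based timing with a fallback for a missing CLOCKS_PER_SEC.
- It is not the Plum2 source. The function names, the loop body and the
- fallback value of 60 (the guess mentioned above) are assumptions made for
- illustration only.

```c
#include <time.h>

/* Fallback for a <time.h> that lacks CLOCKS_PER_SEC, as described in the
   post. 60 is only the poster's later guess, not a verified value. */
#ifndef CLOCKS_PER_SEC
#define CLOCKS_PER_SEC 60
#endif

/* Convert a clock() tick count covering `iterations` operations into
   seconds per operation, for a given tick rate. */
double seconds_per_op(double ticks, double ticks_per_sec, double iterations)
{
    return ticks / ticks_per_sec / iterations;
}

/* Rescale a time that was computed with the wrong tick rate.
   elapsed = ticks / rate, so a lower actual rate means a longer time. */
double rescale_time(double reported, double assumed_rate, double actual_rate)
{
    return reported * assumed_rate / actual_rate;
}

/* Hypothetical timing loop: average time of one auto long addition.
   The result includes loop overhead, which a real benchmark would
   have to account for. */
double time_long_add(long iterations)
{
    volatile long a = 0;   /* volatile keeps the loop from being optimized away */
    long i;
    clock_t start = clock();

    for (i = 0; i < iterations; i++)
        a = a + 1;

    return seconds_per_op((double)(clock() - start),
                          (double)CLOCKS_PER_SEC, (double)iterations);
}
```

- If the tick rate really were 60 rather than 100, rescale_time(3.37, 100, 60)
- would put the optimized auto double figure at about 5.6 instead of 3.37.
- Every column scales by the same factor, so ratios between columns are
- unaffected by the guess.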
-
- The ratio of floating-point time to auto long time is 21.2 (with optimization),
- which is in the right ballpark for a 386SX/387SX system but a little
- on the long side.
-
- As a control, I made a copy of the dist.fs disk with a compiled version of
- bench2 on it and booted it on my portable: a 16 MHz 80386SX system without
- a coprocessor. The results were
-
-                  register  auto     auto     function  auto
-                  int       short    long     call+ret  double
- 386BSD gcc -O    0.240     0.317    0.242    2.32      346
-
- Note that the ratio of f-p time to auto long is now 1429.8 --- in other
- words emulation is more than 60 times slower than the coprocessor. Unless
- BRLCAD uses very little floating-point I believe that the coprocessor is
- active on Alexander-James Annala's machine too. (If Alexander-James wants to
- try these tests I'll send him the source code if he drops me a line.)
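-
- The arithmetic behind the two ratios can be checked with a few lines of C.
- The figures are copied from the -O rows of the two tables. The struct and
- function names are made up for this sketch. Dividing by the auto long time
- is taken here as a way of factoring out the clock-speed difference between
- the 25 MHz and 16 MHz machines, which is an interpretation rather than
- something the benchmark itself states.

```c
/* One row of benchmark output: only the two columns used for the ratio. */
struct bench_row {
    double auto_long;    /* average time of an auto long operation */
    double auto_double;  /* average time of an auto double operation */
};

/* -O figures from the post. */
const struct bench_row with_fpu = { 0.159, 3.37 };   /* 25 MHz 386SX + coprocessor */
const struct bench_row emulated = { 0.242, 346.0 };  /* 16 MHz 386SX, no coprocessor */

/* Floating-point time relative to integer (auto long) time. */
double fp_ratio(struct bench_row r)
{
    return r.auto_double / r.auto_long;
}

/* How many times worse the emulated ratio is than the coprocessor ratio. */
double emulation_penalty(struct bench_row fpu, struct bench_row emu)
{
    return fp_ratio(emu) / fp_ratio(fpu);
}
```

- fp_ratio(with_fpu) comes to about 21.2 and fp_ratio(emulated) to about
- 1429.8, so emulation_penalty(with_fpu, emulated) is roughly 67, consistent
- with "more than 60 times slower".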
-
- For a final comparison, I have some old figures from Linux with gcc 2.1.
- Using the register int time to place the results on the same scale as
- the 25MHz results above, the mean time for a f-p operation was 2.09 usec
- without optimization and 0.936 usec at -O1 and above.
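-
- The post does not spell out how the rescaling was done. One plausible
- reading, assumed here, is a linear scaling by the ratio of register int
- times. The function name is hypothetical, and since the raw Linux figures
- are not given, the example uses the 16 MHz portable's numbers from the
- table above instead.

```c
/* Place a time measured on another machine onto the reference machine's
   scale, using each machine's register int time as the yardstick.
   Assumes every operation scales with the register int time. */
double rescale_to_reference(double t, double int_time_other, double int_time_ref)
{
    return t * int_time_ref / int_time_other;
}
```

- For instance, the portable's auto long time of 0.242 (register int 0.240)
- becomes about 0.160 on the scale of the 25MHz -O run (register int 0.159),
- which is close to the 0.159 measured there.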
-
- Chris Flatters
- cflatter@nrao.edu
-