NetNews Usenet Archive 1992 #18

Internet Message Format | 1992-08-22 | 2.8 KB

Path: sparky!uunet!ogicse!plains!news.u.washington.edu!milton!chuckb
From: chuckb@milton.u.washington.edu (Chuck Bass)
Newsgroups: comp.sys.sgi
Subject: Fastest way to transform points???
Message-ID: <chuckb.714532056@milton>
Date: 23 Aug 92 01:07:36 GMT
Article-I.D.: milton.chuckb.714532056
Sender: news@u.washington.edu (USENET News System)
Organization: University of Washington
Lines: 80

I recently tried to speed up some code of mine by reducing the effect of a
bottleneck. I found that I was transforming points and multiplying matrices
a lot. I use a standard matrix multiplier for this purpose, i.e. an unrolled
4x4 matrix multiply.

I attempted to improve performance using something like

    pushmatrix();
    loadmatrix(M1);
    multmatrix(M2);
    getmatrix(M3);
    popmatrix();

This fragment turns out to be much slower than my 4x4 matrix multiply
(less than 1/4 the speed in the run below):

    100000 mat multiplies    took  2.51 seconds
    100000 point transforms  took  0.76 seconds
    100000 sgmat multiplies  took 11.27 seconds

The sgmat case is the one that uses the stack manipulations. I rewrote the
code to do only the following:

    multmatrix(M2);

This code gave the following:

    100000 mat multiplies    took  2.51 seconds
    100000 point transforms  took  0.76 seconds
    100000 sgmat multiplies  took  2.45 seconds

This leads me to believe that there is no hardware involved in the stack
multmatrix routine. I suspect that the difference in performance is because
of the matrix inversion that takes place when a loadmatrix call is made.

These results seem to be somewhat consistent, i.e. a point transform takes
around 1/4th the time of a matrix multiply.
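For reference, here is a minimal sketch of the kind of software 4x4 multiply and point transform being timed above. It is not the poster's actual code. The names Mat4, mat4_mul and xform_point are made up for illustration, and it assumes the row-vector convention (p' = p * M) with a float[4][4] layout. Written this way, a matrix multiply costs 64 multiplies and 48 adds while a point transform costs 16 and 12, which is where a ratio of about 1/4 would come from.

```c
#include <assert.h>

/* Hypothetical type name; a 4x4 array of floats, row-vector convention. */
typedef float Mat4[4][4];

/* out = a * b, with the inner dot product unrolled by hand.
 * out must not alias a or b, since results are written as they are computed. */
void mat4_mul(Mat4 a, Mat4 b, Mat4 out)
{
    int i, j;
    assert(out != a && out != b);
    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            out[i][j] = a[i][0]*b[0][j] + a[i][1]*b[1][j]
                      + a[i][2]*b[2][j] + a[i][3]*b[3][j];
}

/* out = p * m for a homogeneous point p = (x, y, z, w).
 * out must not alias p. */
void xform_point(const float p[4], Mat4 m, float out[4])
{
    int j;
    for (j = 0; j < 4; j++)
        out[j] = p[0]*m[0][j] + p[1]*m[1][j]
               + p[2]*m[2][j] + p[3]*m[3][j];
}
```

Whether the compiler unrolls the remaining loops, and how the result compares with the GL calls, would have to be measured on the machine in question.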
(I did call sginap after I opened the window, to give the window manager a
chance to open the window etc.)

These results are for a PI 4D25 (the 4D35 gives similar results, only
faster). hinv says:

    1 20 MHZ IP6 Processor
    FPU: MIPS R2010A/R3010 VLSI Floating Point Chip Revision: 2.0
    CPU: MIPS R2000A/R3000 Processor Chip Revision: 2.0
    On-board serial ports: 2
    Data cache size: 32 Kbytes
    Instruction cache size: 64 Kbytes
    Main memory size: 16 Mbytes
    Integral Ethernet: ec0, version 0
    Genlock option installed
    Tape drive: unit 2 on SCSI controller 0: QIC 150
    Disk drive: unit 1 on SCSI controller 0
    Integral SCSI controller 0: Version WD33C93A
    Graphics board: GR1.2 Bit-plane, Z-buffer, Turbo options installed

My question is: is there a faster way using some of the "built in matrix
'stuff'"? If there is not, are there faster ways of doing the transform?
I know I can reduce it to a 4x3 matrix multiply for a gain of 25%. Are there
other such optimizations?

Thanks,
Chuck Bass
College of Forest Systems Engineering
University of Washington
chuckb@u.washington.edu

PS I need to transform the points to do collision detection. Currently the
function clipbox is not implemented on our machine ;-(.
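On the 4x3 reduction mentioned in the question, one possible sketch follows. Dropping the last matrix column takes a point transform from 16 multiplies to 12 (the 25% quoted). If the input points are also known to have w = 1, the translation row can simply be added, leaving 9 multiplies and 9 adds per point. The names Mat4 and xform_points_affine are hypothetical, and the code assumes the row-vector convention (p' = p * M) with an affine matrix whose last column is (0, 0, 0, 1). It would not be valid for a perspective matrix.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type name; a 4x4 array of floats, row-vector convention. */
typedef float Mat4[4][4];

/* Transform n points (x, y, z with an implicit w = 1) by an affine matrix.
 * ASSUMPTION: the last column of m is (0, 0, 0, 1), so w stays 1 and is
 * never computed. This is checked once per batch, not once per point.
 * in and out may be the same array: each point is read before it is written. */
void xform_points_affine(Mat4 m, float (*in)[3], float (*out)[3], size_t n)
{
    size_t i;
    assert(m[0][3] == 0.0f && m[1][3] == 0.0f &&
           m[2][3] == 0.0f && m[3][3] == 1.0f);
    for (i = 0; i < n; i++) {
        float x = in[i][0], y = in[i][1], z = in[i][2];
        out[i][0] = x*m[0][0] + y*m[1][0] + z*m[2][0] + m[3][0];
        out[i][1] = x*m[0][1] + y*m[1][1] + z*m[2][1] + m[3][1];
        out[i][2] = x*m[0][2] + y*m[1][2] + z*m[2][2] + m[3][2];
    }
}
```

Processing the points as one batch also keeps the matrix in registers or cache across the loop and avoids a function call per point, which may matter as much as the reduced operation count. That is a guess and would need timing.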