NetNews Offline 2

Internet Message Format | 1996-08-05 | 5.0 KB

Path: informatik.tu-muenchen.de!fischerj
From: fischerj@Informatik.TU-Muenchen.DE (Juergen "Rally" Fischer)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: doubling pixels horizontally
Date: 11 Mar 1996 01:47:41 GMT
Organization: Technische Universitaet Muenchen, Germany
Distribution: world
Message-ID: <4i00nt$r3g@sunsystem5.informatik.tu-muenchen.de>
References: <4f4ibc$gl9@news.cs.tu-berlin.de> <591.6610T1165T2102@login.eunet.no>
 <1045.6611T753T2256@vip.cybercity.dk> <4faoe1$47@sunsystem5.informatik.tu-muenchen.de>
 <2991.6612T1034T625@vip.cybercity.dk> <576.6613T1070T1730@login.eunet.no>
 <1257.6614T57T922@vip.cybercity.dk> <5257.6639T1152T2935@ifi.uio.no>
NNTP-Posting-Host: hphalle9.informatik.tu-muenchen.de
X-Newsreader: TIN [version 1.2 PL2]

Ludvig Pedersen (ludvigp@ifi.uio.no) wrote:
: >in C code. And as asm needs more instructions than C, it needs the
: >2 dimensional format even more if you don't wanna lose overview. baeh!
: >:)

: Asm was never designed for it, and I don't think it looks good either.

I redesigned it :D blah.

: >|> >a myth ?
: >|> I think so. But please show me the copy-loop and I'll test it.

: >could you please try movem.l (fast)+,d0-d7 and then 8 times move.l dn,(chip)+
: >?

: I tried a LOT of different loops and here is a small collection of
: the top 5 loops I tried. Actually the result was a little better than I
: thought.

: ALL DMA IS OFF!

: ; Speed: 5.640 MB/s (8x load, 8x store)

: ; Speed: 5.472 MB/s (movem load, 8x store)

huh ? this would equal "movem load 12 regs slower than normal load".
on 020, movem load is quicker when using lots of regs. for example I got
9.7 cyc movem load from chip instead of the usual 12 cyc.
And in fastmem 4.9 movem load vs. 6.4 normal load.
so either one of us has a bug or 030 is very different here ?

: ; Speed: 5.472 MB/s (movem load, movem store)

same speed, naturally, chip stores can't be sped up.
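
The 020 load figures quoted in the reply above can be put side by side with a small sketch. It assumes the cycle counts (12 vs. 9.7 from chip, 6.4 vs. 4.9 from fast) are per longword loaded, which is one reading of the post; the table and function names are made up for illustration.

```python
# Per-longword load costs on a 68020, in CPU cycles, as quoted in the post.
# Assumption: the figures are per longword, not per instruction.
LOAD_CYCLES = {
    "chip": {"move.l": 12.0, "movem.l": 9.7},
    "fast": {"move.l": 6.4, "movem.l": 4.9},
}


def movem_speedup(memtype: str) -> float:
    """How many times faster a movem.l load is than a plain move.l load."""
    costs = LOAD_CYCLES[memtype]
    return costs["move.l"] / costs["movem.l"]


def cycles_saved(memtype: str, longwords: int = 8) -> float:
    """CPU cycles saved by loading `longwords` longwords with movem.l."""
    costs = LOAD_CYCLES[memtype]
    return (costs["move.l"] - costs["movem.l"]) * longwords


if __name__ == "__main__":
    for memtype in LOAD_CYCLES:
        print(f"{memtype}: movem.l is {movem_speedup(memtype):.2f}x faster, "
              f"saves {cycles_saved(memtype):.1f} cycles per 8 longwords")
```

On these figures movem.l would be roughly 1.24x faster from chip and 1.31x faster from fast on the 020, which is why a slower movem result on the 030 looks odd.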

: ; Speed: 4.896 MB/s
:         move.l  (a0)+,(a1)+

: ; Speed: 4.656 MB/s
:         move.l  (a0)+,d0
:         move.l  d0,(a1)+

so who reported this was faster than ()+,()+ ? :)
Do those "free c2p" routines really run 5.6 mb/sec ?

: >imho this should do 7mb/sec in the store part. if the movem
: >is very fast, you approximate the 7mb/sec also doing copying.

: 7 MB/s is not possible. Remember that you have to access the same data-bus to
: read from FastRam.

I said "very fast cpu" and "approximate 7mb/sec" :)

: >On 020-14 it will be slower than normal copy, on 020-28 maybe already
: >faster (only theory!)
: >so we still need a test if it's faster than move.l (fast)+,(chip)+

: >|> Here are my results from bustest:
: >|>
: >|> BusSpeedTest 0.07 (mlelstv) Buffer: 16384 Bytes
: >|> ==================================================
: >|> loop overhead:   4.5ns
: >|> register move:  40.6ns

: >huh ? a register move is 2 cycles. you got 24.63 MHz ?

: Ehh..No, I have 50 mhz.
: I tested it myself. (just to be sure)
: I was able to do 24400000 register moves and 203300 dbras per second.

2.04 cycles for the reg move. if the dbra number misses 2 zeros, it's 2.45
cycles/dbra (wow!), if the "203300" is true, it's 245 cycles (laaaame!) ;)

: A dbra is 3 times slower than a register move so that's 25.0 peak MIPS.

uhm... yep, 6 cycles dbra on 020, too. exactly enough for the 6 free
cycles of a chipmem store.
so on your card rather 8333333 dbras/sec ;)

: 1.000.000.000 ns / 25.009.900 = 39.98 ns

what ?

: Check your numbers, it's correct!

: >|> memtype  op     cycle    bandwidth
: >|> fast     readw  109.1ns  18.3MByte/s
: >|> fast     readl  137.6ns  29.1MByte/s
: >|> fast     readm  167.7ns  23.8MByte/s

: >readm slower ? hmhmhmhm. nooo.

: Ohhh-yes.. Just look at the copy results.

mhm how many cycles is move.b (an),dn ?
I hope this improved (other possibility: movem load worse vs. 020) ??

: >maybe you can write in 10 14-mhz cycle parts, i.e 5.672mb/sec theoretic
: >if no dma at all.

: Yes, write 5.6 MB/s to ChipRam with no DMA.

hehe, but with another routine than the one my theory referred to ;)
the 5.6 mb/sec you get with 8 cycle stores + fastmem loads.
my theory said 10 cycles for move.l (an)+,(an)+, but that one
is slower...

: We support both 2xN sprite-dithering (1 pass) and normal 2xN (2 pass).
: If your render-routine is 25 fps or slower using the 2 pass version
: doesn't matter at all in speed and framerate. You only get a better-looking
: display.

yep. no need for ghost-look doing doom, descent etc. just intuiscreen,
supporting all monitors, or genlock or whatever :)

: Can you explain about that monitor side-effects stuff you are talking about.
: Is this something new?

my 1084 shows 2x2 ghost screens well, same goes for my TV over both
SCART and composite video. my friend's tv would show some color stripes,
i.e. disturbed display. (could still play xtreme racing :)

i.e. 1084 -> most users no problemo. ghost-look looks even better over
composite video vs. rgb-port, because of smear (hardware-dither ;)

: Ludde - Amiga Demo Coder
: Virtual Reality & Official Be developer
: ludvigp@ifi.uio.no
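
The cycle arithmetic traded back and forth in this post (register moves, dbra counts, chip store bandwidth) can be rechecked with a minimal sketch. It takes the quoted 50 MHz for the CPU and assumes "14-mhz" means a 14.18 MHz clock and that MB means 10^6 bytes, which is what reproduces the 5.672 figure. All names are hypothetical.

```python
CPU_HZ = 50_000_000        # the quoted poster's "I have 50 mhz"
CHIP_BUS_HZ = 14_180_000   # assumption: "14-mhz" read as 14.18 MHz


def cycles_per_op(ops_per_sec: float, clock_hz: float = CPU_HZ) -> float:
    """CPU cycles spent per operation at a measured rate."""
    return clock_hz / ops_per_sec


def ns_per_op(ops_per_sec: float) -> float:
    """Nanoseconds per operation at a measured rate."""
    return 1e9 / ops_per_sec


def bandwidth_mb_s(bytes_per_access: int, cycles_per_access: float,
                   clock_hz: float) -> float:
    """Bandwidth in MB/s (MB = 10**6 bytes, an assumption) for one access
    of `bytes_per_access` bytes every `cycles_per_access` clock cycles."""
    return clock_hz * bytes_per_access / cycles_per_access / 1e6


if __name__ == "__main__":
    # Register move and dbra rates quoted in the post.
    print(f"reg move:              {cycles_per_op(24_400_000):.3f} cycles")
    print(f"dbra (203300/s):       {cycles_per_op(203_300):.1f} cycles")
    print(f"dbra (two more zeros): {cycles_per_op(20_330_000):.3f} cycles")
    print(f"dbra at 6 cycles:      {CPU_HZ / 6:.0f} per second")
    print(f"25.0099 Mops/s:        {ns_per_op(25_009_900):.2f} ns each")
    # Longword chip stores every 10 or every 8 cycles of the 14.18 MHz clock.
    print(f"10-cycle store: {bandwidth_mb_s(4, 10, CHIP_BUS_HZ):.3f} MB/s")
    print(f" 8-cycle store: {bandwidth_mb_s(4, 8, CHIP_BUS_HZ):.3f} MB/s")
```

With these assumptions a 10-cycle longword store gives 5.672 MB/s and an 8-cycle store gives 7.09 MB/s, which lines up with the "5.672mb/sec" and "7mb/sec" figures above.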