- Path: informatik.tu-muenchen.de!fischerj
- From: fischerj@Informatik.TU-Muenchen.DE (Juergen "Rally" Fischer)
- Newsgroups: comp.sys.amiga.programmer
- Subject: Re: doubling pixels horizontally
- Date: 11 Mar 1996 01:47:41 GMT
- Organization: Technische Universitaet Muenchen, Germany
- Distribution: world
- Message-ID: <4i00nt$r3g@sunsystem5.informatik.tu-muenchen.de>
- References: <4f4ibc$gl9@news.cs.tu-berlin.de> <591.6610T1165T2102@login.eunet.no><1045.6611T753T2256@vip.cybercity.dk><4faoe1$47@sunsystem5.informatik.tu-muenchen.de><2991.6612T1034T625@vip.cybercity.dk><576.6613T1070T1730@login.eunet.no><1257.6614T57T922@vip.cybercity.dk> <5257.6639T1152T2935@ifi.uio.no>
- NNTP-Posting-Host: hphalle9.informatik.tu-muenchen.de
- X-Newsreader: TIN [version 1.2 PL2]
-
- Ludvig Pedersen (ludvigp@ifi.uio.no) wrote:
- : >in C code. And as asm needs more instructions than C, it needs the
- : >2 dimensional format even more if you don't wanna lose overview. baeh!
- : >:)
-
- : Asm was never designed for it, and I don't think it looks good either.
- I redesigned it :D blah.
-
- : >|> >a myth ?
- : >|> I think so. But please show me the copy-loop and I'll test it.
- : >could you please try movem.l (fast)+,d0-d7 and then 8 times move.l dn,(chip)+
- : >?
-
- : I tried a LOT of different loops and here is a small collection of
- : the top 5 loops I tried. Actually the result was a little better than I
- : thought.
-
- : ALL DMA IS OFF!
-
-
- : ;Speed: 5.640 MB/s
- (8x load 8x store)
-
- : ; Speed: 5.472 MB/s
- (movem load, 8x store)
-
- huh ? this would mean "movem load of 12 regs is slower than normal load".
- on 020, a movem load gets quicker the more regs it uses. for example I got
- 9.7 cyc for a movem load from chip instead of the usual 12 cyc,
- and in fastmem 4.9 cyc for a movem load vs. 6.4 cyc for a normal load.
-
- so either one of us has a bug, or the 030 is very different here ?
-
- : ; Speed: 5.472 MB/s
-
- (movem load,movem store)
-
- same speed, naturally; chip stores can't be sped up.
-
- : ; Speed: 4.896 MB/s
- : move.l (a0)+,(a1)+
-
- : ; Speed: 4.656 MB/s
-
- : move.l (a0)+,d0
- : move.l d0,(a1)+
-
- so who reported this was faster than ()+,()+ ? :)
-
-
- Do those "free c2p" routines really run 5.6 mb/sec ?
-
- : >imho this should do 7mb/sec in the store part. if the movem
- : >is very fast, you approximate the 7mb/sec also doing copying.
-
- : 7 Mb/s is not possible. Remember that you have to access the same data-bus to
- : read from FastRam.
-
- I said "very fast cpu" and "approximate 7mb/sec" :)
-
- : >On 020-14 it will be slower than normal copy, on 020-28 maybe already
- : >faster (only theory!)
- : >so we still need a test if it's faster than move.l (fast)+,(chip)+
- : >|> Here is my results from bustest:
- : >|>
- : >|> BusSpeedTest 0.07 (mlelstv) Buffer: 16384 Bytes
- : >|> ==================================================
- : >|> loop overhead: 4.5ns
- : >|> register move: 40.6ns
-
- : >huh ? a register move is 2 cycles. you got 24.63 MHz ?
-
- : Ehh..No, I have 50 mhz.
-
- : I tested it myself. (just to be sure)
-
- : I was able to do 24400000 register move's and 203300 dbra's per second.
- 2.04 cycles for the reg move.
- if the dbra number misses 2 zeros, it's 2.45 cycles/dbra (wow!), if
- the "203300" is true, it's 245 cycles (laaaame!) ;)
-
- : A dbra is 3 times slower than a register move so that's 25.0 peak MIPS.
- uhm... yep, 6 cycles dbra on 020, too. exactly enough for the 6 free cycles
- of a chipmem store.
- so on your card rather 8333333 dbra's /sec ;)
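The cycle counts in this exchange are just clock rate divided by operations per second (or time per operation multiplied by clock rate). A quick sketch of that arithmetic, using the 50 MHz clock and the rates quoted above; the function names are made up for illustration:

```python
# Rough check of the cycle arithmetic in this thread.
# Assumption: the card runs at exactly 50 MHz, as stated in the quoted text.
CLOCK_HZ = 50_000_000


def cycles_per_op(clock_hz, ops_per_sec):
    """Average CPU cycles spent on one operation."""
    return clock_hz / ops_per_sec


def cycles_from_ns(clock_hz, ns_per_op):
    """Convert a measured time per operation (in ns) into CPU cycles."""
    return ns_per_op * clock_hz / 1e9


# 24,400,000 register moves per second
reg_move_cycles = cycles_per_op(CLOCK_HZ, 24_400_000)

# dbra figure exactly as posted, and with two zeros appended
dbra_cycles_as_posted = cycles_per_op(CLOCK_HZ, 203_300)
dbra_cycles_two_more_zeros = cycles_per_op(CLOCK_HZ, 20_330_000)

# bustest reported 40.6 ns per register move
bustest_reg_move_cycles = cycles_from_ns(CLOCK_HZ, 40.6)

# a 6-cycle dbra at 50 MHz
dbra_per_sec_at_6_cycles = CLOCK_HZ / 6

print(f"register move: {reg_move_cycles:.2f} cycles")
print(f"dbra as posted: {dbra_cycles_as_posted:.1f} cycles")
print(f"dbra with two more zeros: {dbra_cycles_two_more_zeros:.2f} cycles")
print(f"bustest register move: {bustest_reg_move_cycles:.2f} cycles")
print(f"6-cycle dbra: {dbra_per_sec_at_6_cycles:.0f} per second")
```

This only restates the numbers already posted; it says nothing about which of the two dbra figures was actually measured.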
-
- : 1.000.000.000 ns / 25.009.900 = 39.98 ns
- what ?
-
- : Check your numbers, it's correct!
-
- : >|> memtype op cycle bandwidth
- : >|> fast readw 109.1ns 18.3MByte/s
- : >|> fast readl 137.6ns 29.1MByte/s
- : >|> fast readm 167.7ns 23.8MByte/s
-
- : >readm slower ? hmhmhmhm. nooo.
- : Ohhh-yes.. Just look at the copy results.
- mhm how many cycles is move.b (an),dn ?
- I hope this improved (other possibility: movem load worse vs. 020)
- ??
-
- : >maybe you can write in 10 14-mhz cycle parts, i.e 5.672mb/sec theoretic
- : >if no dma at all.
- : Yes, write 5.6 MB/s to ChipRam with no DMA.
- hehe, but with another routine than my theory referred to ;)
- the 5.6 mb/sec you get with 8 cycle stores + fastmem loads.
- my theory said 10 cycles for move.l (an)+,(an)+, but that one is slower...
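The 5.672 and 7 mb/sec figures come from how many 14 MHz cycles one longword store to chipmem takes. A small sketch of that conversion, assuming a 14.18 MHz clock (the value that reproduces the 5.672 figure above) and 1 MB = 10^6 bytes; both are assumptions, and the function name is made up:

```python
# Theoretical chipmem store bandwidth from cycles per longword.
# Assumptions (not stated in the post): 14.18 MHz clock, 1 MB = 1,000,000 bytes.
CLOCK_HZ = 14_180_000
BYTES_PER_LONGWORD = 4


def store_bandwidth_mb(cycles_per_longword, clock_hz=CLOCK_HZ):
    """MB/s if every longword store costs the given number of cycles."""
    stores_per_sec = clock_hz / cycles_per_longword
    return stores_per_sec * BYTES_PER_LONGWORD / 1e6


ten_cycle_rate = store_bandwidth_mb(10)    # the 10-cycle theory
eight_cycle_rate = store_bandwidth_mb(8)   # 8-cycle stores, the "7mb/sec" limit

print(f"10 cycles per longword: {ten_cycle_rate:.3f} MB/s")
print(f" 8 cycles per longword: {eight_cycle_rate:.3f} MB/s")
```

These are upper bounds for the store part only, with no DMA; a real copy loop also has to pay for the fastmem loads.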
-
- : We support both 2xN sprite-dithering (1 pass) and normal 2xN (2 pass).
-
- : If your render-routine is 25 fps or slower, using the 2 pass version
- : doesn't matter at all in speed and framerate. You only get a better-looking
- : display.
- yep. no need for ghost-look doing doom, descent etc.
- just intuiscreen, supporting all monitors, or genlock or whatever :)
-
- : Can you explain about that monitor side-effects stuff you are talking about.
- : Is this something new?
- my 1084 shows 2x2 ghost screens well, same goes for my TV over both
- SCART and composite video.
-
- my friend's tv would show some color stripes, i.e. disturbed display.
- (could still play xtreme racing :)
-
- i.e 1084 -> most users no problemo. ghost-look looks even better
- over composite video vs. rgb-port, because of smear (hardware-dither ;)
-
- : Ludde - Amiga Demo Coder
- : Virtual Reality & Official Be developer
- : ludvigp@ifi.uio.no
-
-