- Path: sparky!uunet!olivea!bunker!nuconvex!starpt!doiron
- From: doiron@starpt.UUCP (Glenn Doiron)
- Newsgroups: comp.sys.amiga.programmer
- Subject: Re: Chunky Pixels vs. Bitplanes (was: Chunky Chip Set...)
- Message-ID: <doiron.0kil@starpt.UUCP>
- Date: 3 Jan 93 00:16:36 GMT
- References: <Karsten_Weiss.0n2o@ibase.stgt.sub.org> <1hbngoINNglt@uwm.edu> <1993Jan1.141207.20262@mpifr-bonn.mpg.de>
- Organization: 68K Software Development
- Lines: 247
- X-NewsSoftware: GRn 1.16f (10.17.92) by Mike Schwartz & Michael B. Smith
-
- In article <1993Jan1.141207.20262@mpifr-bonn.mpg.de> mlelstv@speckled.mpifr-bonn.mpg.de (Michael van Elst) writes:
- > In <doiron.0kd5@starpt.UUCP> doiron@starpt.UUCP (Glenn Doiron) writes:
- > >>
- > >> Never said that you get an advantage in this situation. Here it is a tie.
- >
- > >It's a tie if you're using special hardware, with a 32-point wide vertical
- > >line which happens to lie entirely within the span of a single longword.
- > >Wow, it was a tie. Now, how about some more realistic examples?
- >
- > We were talking about blitting a rectangle.
-
- ... and you still haven't convinced me that chunky isn't faster. As was
- said before, the best you can hope for is break-even, and far more often
- chunky will be faster, although the bigger the blit, the smaller the
- difference.
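A rough way to put numbers on the rectangle case is to count the 16-bit words a blit touches in each layout. The sketch below is only an illustration under stated assumptions: the function names are made up, memory is assumed to be 16 bits wide, chunky pixels are assumed to be packed back to back, and masking or read-modify-write costs on edge words are ignored.

```c
/* Hypothetical helper: number of 16-bit words touched by a span that
   starts at bit offset first_bit within a scanline and is nbits long. */
static long words_touched(long first_bit, long nbits)
{
    long last_bit = first_bit + nbits - 1;
    return last_bit / 16 - first_bit / 16 + 1;
}

/* Planar: `depth` separate planes, each holding 1 bit per pixel. */
long planar_words(int x, int w, int h, int depth)
{
    return words_touched(x, w) * h * depth;
}

/* Chunky: a single bitmap holding `depth` bits per pixel. */
long chunky_words(int x, int w, int h, int depth)
{
    return words_touched((long)x * depth, (long)w * depth) * h;
}
```

With these assumptions an aligned 32x32 blit at 8 bits deep costs the same in both layouts, the same blit shifted to x=5 costs more in planar, and a one-pixel-wide vertical line costs 8 times as many words in planar.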
-
- > >> >> Simple. The display hardware to _show_ bitplanes needs additional
- > >> >> shiftregisters and buffer which makes the hardware more expensive.
- > >>
- > >> >Nope. RAMDAC's show the display, buddy. Not the TI chip. And getting a
- > >> >planar display out of VRAM is a trick I'd like to see... well it could be
- > >> >done, if you want to store one plane in each VRAM, then muck about with the
- > >> >hardware, majorly.
- > >>
- > >> Sure, RAMDAC's generate the analog signal. RAMDAC's however use chunky pixels
- > >> (they need all bits of a pixel to generate the analog signal for the pixel).
- >
- > >So what, planar systems still need all bits of a pixel to generate the
- > >pixel, too. Your point?
- >
- > Read again. I said one would need additional shift registers and buffers
- > for the display hardware. You denied that, saying the RAMDAC would do it.
- > I pointed out that all RAMDACs expect chunky pixels, which can't be delivered
- > directly by a plane-oriented memory system. So you need _shift registers_
- > and _buffers_.
-
- OK. Point granted, I took that statement the wrong way.
-
- > >> So if you have a plane oriented memory (which is no real problem with VRAMs
- > >> although you may get some alignment restrictions) then you have to convert
- > >> it in hardware. This needs _shiftregisters_ and _buffers_. An example for
- > >> such a hardware is the Denise and the Lisa chip (both also have a priority
- > >> logic and Denise includes the RAMDAC).
- >
- > >... which use DRAM's, not VRAMS. No amount of 'alignment restrictions' is
- > >going to cut it for VRAM planar, unless, as I said, you do funky
- > >address/data schemes and put one plane in each VRAM, then mess further with
- > >your RAMDAC feeding setup. Using VRAMS like DRAMS will result in a loss of
- > >all the performance gains VRAMS were supposed to give you in the first
- > >place.
- >
- > Rubbish. VRAMs simply give you a second I/O port for sequential output.
- > The advantage is that the display refresh accesses do not stall
- > the first port used by the rendering processor (CPU, blitter, what else).
- > This gives you twice the bandwidth at 130% the cost for this special
- > application.
-
- As I said, you lose all the performance gains VRAMS were supposed to give
- you in the first place, since you can't use the serial registers for video
- fetch in planar setups.
-
- > >> >> 32x32 pixels (well-aligned) with 8 bitplanes need 2x32x8 fetches and stores.
- > >> >> (2 words per line, 32 lines, 8 planes). In chunky you have 16x32 fetches
- > >> >> and stores (16 words per line, 32 lines).
- > >>
- > >> >Once again, you are considering the best, break-even case, where all your
- > >> >data is perfectly sized and aligned. If you're going to be doing just
- > >> >that, why bother with bitmap displays at all, when a character based
- > >> >display will do just as well?
- > >>
- > >> Because we are not talking about _text_ displays. You are right that we get
- > >> some inefficiencies with non-aligned data. But the same is true of chunky
- > >> pixels if your memory isn't pixel oriented (e.g. if the memory is 16 bits
- > >> wide you "waste" bandwidth on odd addresses).
- > >>
- > >> Thus bitplane devices are usually same speed as chunky pixel devices. But
- > >> you can scale bitplane devices easily with the obvious performance advantages.
- > >> You can never scale chunky pixel devices.
- >
- > >"Scale"?? Remember, scaling is at least (in this example) 8x slower in
- > >planar than chunky.
- >
- > I'm talking about "scaling" the display to different display depths.
-
- Chunky has this property too, although only in powers of two. This,
- however, is not nearly good enough justification for all the penalties you
- get in planar doing everyday operations. Why use fewer colors
- when you can use more for the same, if not better, performance?
-
- > And as I also pointed out, bitmap scaling (that's what you thought of)
- > is about same speed for chunky and bitplanes.
-
- No. It is not. You are doing N times the amount of work in planar, where N
- is the number of planes.
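To show where the factor of N comes from, here is a minimal sketch of a naive nearest-neighbour scaler that works plane by plane. The names are hypothetical, bits are assumed to be stored most-significant-bit first within each byte, and this is the straightforward bit-at-a-time approach, not a table-driven one.

```c
#include <stdint.h>

/* Read one bit of one plane (MSB-first bit order assumed). */
static int get_bit(const uint8_t *plane, int stride, int x, int y)
{
    return (plane[y * stride + (x >> 3)] >> (7 - (x & 7))) & 1;
}

/* Set or clear one bit of one plane: a read-modify-write. */
static void put_bit(uint8_t *plane, int stride, int x, int y, int bit)
{
    uint8_t *p = &plane[y * stride + (x >> 3)];
    uint8_t mask = (uint8_t)(0x80u >> (x & 7));
    *p = bit ? (uint8_t)(*p | mask) : (uint8_t)(*p & ~mask);
}

/* Scale sw x sh to dw x dh, one plane at a time.
   Returns the number of bit transfers, which is dw * dh * depth. */
long scale_planar(const uint8_t *const src[], int sw, int sh, int sstride,
                  uint8_t *const dst[], int dw, int dh, int dstride,
                  int depth)
{
    long ops = 0;
    for (int p = 0; p < depth; p++)
        for (int y = 0; y < dh; y++) {
            int sy = y * sh / dh;          /* nearest source row */
            for (int x = 0; x < dw; x++) {
                int sx = x * sw / dw;      /* nearest source column */
                put_bit(dst[p], dstride, x, y,
                        get_bit(src[p], sstride, sx, sy));
                ops++;
            }
        }
    return ops;
}
```

The returned count grows linearly with the number of planes, while a byte-per-pixel layout would move each destination pixel with one read and one write.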
-
- > >> >Oh, obviously. If you can make this magical piece of hardware, I'm sure
- > >> >Commodore has a job position open for you. Or any other number of chip
- > >> >designers.
- > >>
- > >> I could probably make that from discrete parts or FPGAs but I have no
- > >> experience in doing ASICs or even full custom designs.
- >
- > >Why hasn't someone done it already? Mustn't be very cost-effective.
- >
- > Correct. Bitplanes have the additional burden to feed a RAMDAC still
- > with chunky pixels. Read what I said above. If someone came out with
- > a RAMDAC that includes the necessary logic the price difference would
- > become much smaller.
-
- Best of luck to you. You are talking really expensive hardware now. Not
- something you'd tack onto an Amiga, or anything, since chunky is here now,
- costs less, and doesn't have any of the deficiencies that your
- special-purpose hardware would address.
-
- > >Rubbish. Show me a piece of code that can do that. More special-case
- > >optimizations? (sigh...)
- >
- > That's not really "special case". It is _optimization_, you need to use
- > more complicated algorithms with a planar system. I always said: chunky
- > is _simpler_ to use.
-
- Nope. "Special case". You can't use the same algorithm to stretch that
- 32x32 image into, say, 64x57 or shrink it to 17x13. You might be able to
- do 64x32 or 32x64. But that's still a special case optimization.
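For comparison, a general nearest-neighbour scaler for 8-bit chunky data needs no special cases for the target size. This is a minimal sketch with a hypothetical function name, assuming one byte per pixel and tightly packed rows; it handles 32x32 to 64x57 or 17x13 with the same loop.

```c
#include <stdint.h>

/* Scale an sw x sh image of 8-bit pixels to dw x dh.
   Every destination pixel is one byte read and one byte write. */
void scale_chunky(const uint8_t *src, int sw, int sh,
                  uint8_t *dst, int dw, int dh)
{
    for (int y = 0; y < dh; y++) {
        /* nearest source row for this destination row */
        const uint8_t *row = src + (y * sh / dh) * sw;
        for (int x = 0; x < dw; x++)
            dst[y * dw + x] = row[x * sw / dw];
    }
}
```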
-
- > >But you're saying chunky->bitplane conversion is a small fraction of the
- > >whole operation. It's not. That ridiculous amount of masking/shifting
- > >might even take longer than the matrix multiplications. For instance, on a
- > >68000, each shift takes *2* additional clock cycles! (yuck!)
- >
- > Then do not use shifts on 68000. Seems that you talk about advantages
- > and disadvantages when using a 68000 processor. Don't forget that an
- > 68000 is too slow in any case. That's why the Amiga has a blitter.
-
- Shifting is too slow on almost any general purpose processor, not just the
- 68000. Regardless, you need to do a lot of shifting/masking, which isn't
- going to be free on --any-- processor, although custom chips can go a long
- way towards hiding this via pipelining/superpipelining.
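To make the amount of shifting and masking concrete, here is a naive sketch of a chunky-to-planar conversion for one group of 8 pixels. The name c2p8 and the MSB-first bit order are assumptions for illustration; practical converters use cleverer bit-merging tricks, but they still have to shuffle every bit.

```c
#include <stdint.h>

/* Convert 8 chunky pixels (8 bits each) into one byte for each of
   8 bitplanes. Pixel 0 lands in the most significant bit.
   That is 64 shift-and-mask steps for just 8 pixels. */
void c2p8(const uint8_t px[8], uint8_t planes[8])
{
    for (int p = 0; p < 8; p++) {
        uint8_t out = 0;
        for (int i = 0; i < 8; i++)
            out |= (uint8_t)(((px[i] >> p) & 1u) << (7 - i));
        planes[p] = out;
    }
}
```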
-
- > >> bitmap scaling can be done on a per-plane basis.
- >
- > >As I said, this is 8x the amount of work chunky has to do. (You're doing
- > >it a bit at a time instead of a byte at a time.)
- >
- > YOU may do it that way. I simply use a table that maps several pixels
- > to their scaled counterparts.
-
- Can you use that table to do another transformation, as noted above? No.
- It's a special case optimization.
-
- > >You gave enough examples where special vaporware or special-case
- > >optimizations might make them break even with chunky's results without
- > >any optimization whatsoever.
- >
- > My optimizations (which are possible for nearly any operation) show that
- > bitplanes break even with chunky's results. That's what I was trying to
- > show you. Both are about _same_ speed when handling the same amount of
- > data. But bitplanes can be scaled in depth easily while chunky pixels
- > cannot. This happens to be a slight advantage for chunky when you use
- > 8bit, 16bit or 32bit displays.
-
- I'm not going to argue with you about depth scaling. Given the same depth,
- chunky will be faster for all of the above operations, with exception of a
- few break-even cases. Sometimes it will even be faster on chunky *even
- with a reduced scale for planar*, since chunky doesn't need 'bit
- thrashing'.
-
- > >Oh, yeah. Right. Most graphics operations consist of 32-bit wide
- > >vertical lines, coincidentally placed on a long-word boundary. Most
- > >operations consist of exactly doubling the size of horizontal/vertical
- > >dimension. Most operations consist of blitting data that's perfectly sized
- > >and aligned to a perfectly aligned place on the screen. Yep, that's all of
- > >them. I can't think of any other graphics operations you can do.
- >
- > Most time consuming graphics operations consist of modifying larger areas,
- > drawing diagonal lines or lots of number crunching to generate an image.
- >
- > In the first case bitplanes have an advantage when the image depth is
- > different from the memory architecture and a slight disadvantage when
- > the image directly maps to the memory architecture (e.g. 8 bits/pixel).
- >
- > In the second case you have a major advantage for chunky pixels. But
- > the fact that lines cover less memory than filled areas and some
- > optimizations possible with non-vertical lines make this advantage less
- > important.
- >
- > In the third case the rendering speed isn't important as the calculations
- > outweigh any efficiency differences.
-
- In all cases, you've admitted that chunky is faster than planar (w/
- exception of your special-case optimization where it breaks even). I rest
- my case.
-
- > >Suppose you have hundreds, thousands of blits? In general the speed of
- > >blits isn't that critical for most productivity software, but
- > >graphically-oriented software will benefit greatly from chunky
- > >architecture.
- >
- > Not at all.
-
- Do a histogram of a chunky and a planar display. Tell me how many pixels
- of each color are on the screen. Yet another situation where chunky wins,
- bigtime.
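The histogram comparison can be sketched as follows. Both function names are hypothetical, the caller is assumed to zero the hist array first, and the planar version assumes MSB-first bit order and a straightforward per-pixel gather.

```c
#include <stddef.h>
#include <stdint.h>

/* Chunky: one byte read and one increment per pixel. */
void histogram_chunky(const uint8_t *pix, size_t n, unsigned long hist[256])
{
    for (size_t i = 0; i < n; i++)
        hist[pix[i]]++;
}

/* Planar: each pixel value has to be reassembled from `depth` planes
   with a shift and a mask per plane before it can be counted. */
void histogram_planar(const uint8_t *const planes[], int depth,
                      int stride, int w, int h, unsigned long hist[256])
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            unsigned v = 0;
            for (int p = 0; p < depth; p++) {
                unsigned bit =
                    (planes[p][y * stride + (x >> 3)] >> (7 - (x & 7))) & 1u;
                v |= bit << p;
            }
            hist[v]++;
        }
}
```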
-
- > >On WHAT??? Where you live? If you wear underoos? WHAT???!?? <foaming at
- > >the mouth>
- >
- > Think of rendering a 2 color horizontal line (the base case for chunky).
-
- Would help if you didn't delete too much from the above references.
- Plotting an alternating color horizontal line will still be (w/ 8-bit
- color) 8x faster on chunky than planar. Plotting every other pixel a
- different color isn't going to make planar faster.
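As a sketch of the per-pixel cost being argued about, the two routines below plot an alternating-colour horizontal line one pixel at a time and return how many memory updates they made. The names are hypothetical and MSB-first bit order is assumed. Note that this is the naive per-pixel approach only: a planar routine that first builds a repeating bit pattern for each plane and writes whole bytes would touch fewer bytes than the count shown here.

```c
#include <stdint.h>

/* Chunky, 8 bits per pixel: one byte write per pixel. */
long hline_chunky(uint8_t *row, int x0, int x1, uint8_t c0, uint8_t c1)
{
    long writes = 0;
    for (int x = x0; x <= x1; x++) {
        row[x] = ((x - x0) & 1) ? c1 : c0;
        writes++;
    }
    return writes;
}

/* Planar, naive: one read-modify-write per pixel in every plane.
   rows[p] points at the start of the scanline in plane p. */
long hline_planar(uint8_t *const rows[], int depth,
                  int x0, int x1, uint8_t c0, uint8_t c1)
{
    long rmw = 0;
    for (int x = x0; x <= x1; x++) {
        uint8_t c = ((x - x0) & 1) ? c1 : c0;
        uint8_t mask = (uint8_t)(0x80u >> (x & 7));
        for (int p = 0; p < depth; p++) {
            uint8_t *b = &rows[p][x >> 3];
            *b = ((c >> p) & 1) ? (uint8_t)(*b | mask)
                                : (uint8_t)(*b & ~mask);
            rmw++;
        }
    }
    return rmw;
}
```

Plotted this way, the planar count is the chunky count multiplied by the number of planes.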
-
- > >Oh, I see. More special case optimizations. So you SAY it's a 256-color
- > >display- BUT you guarantee the top 2 bitplanes are already clear?
- >
- > I don't say it is a 256 color display. That's the major point of bitplanes.
- > It can be a 2,4,8,16,32,64,128 or 256 color display and the layering software
- > will take into account what you need. Clearing the unused bitplanes (which
- > happens if you have several depths in a single viewport) is done _once_
- > for each newly created layer.
-
- :) Look, if you want a 128-color display, I won't forbid you from using a
- planar architecture. However, drawing a line will be far faster on a
- 256-color chunky than a 128-color planar display. If you need a
- 4,16,256,65536, or true-color display, however, your losses using planar
- will be even more dramatic as the number of colors goes up, since you are
- wasting 2x, 4x, 8x, 16x, or 32x the amount of time accessing irrelevant
- data (still not counting shifting/masking).
-
- [another fragment of agreement - color expansion]
-
- > Regards,
- > --
- > Michael van Elst
- > UUCP: universe!local-cluster!milky-way!sol!earth!uunet!unido!mpirbn!p554mve
- > Internet: p554mve@mpirbn.mpifr-bonn.mpg.de
- > "A potential Snark may lurk in every tree."
-
- Glenn Doiron
- [who thinks planar is fine for dual-playfields in limited video bandwidth,
- but would rather have separated dual-chunky playfields instead.]
- --
- Amiga UUCP+
- Origin: uunet.uu.net!starpt.UUCP!doiron (Organization:68K Software Development)
- BIX: gdoiron
- ** Not enough memory to perform requested operation. Add 4 megs and retry.
-