NetNews Usenet Archive 1992 #31

Internet Message Format | 1993-01-02 | 12.2 KB

Path: sparky!uunet!olivea!bunker!nuconvex!starpt!doiron
From: doiron@starpt.UUCP (Glenn Doiron)
Newsgroups: comp.sys.amiga.programmer
Subject: Re: Chunky Pixels vs. Bitplanes (was: Chunky Chip Set...)
Message-ID: <doiron.0kil@starpt.UUCP>
Date: 3 Jan 93 00:16:36 GMT
References: <Karsten_Weiss.0n2o@ibase.stgt.sub.org> <1hbngoINNglt@uwm.edu> <1993Jan1.141207.20262@mpifr-bonn.mpg.de>
Organization: 68K Software Development
Lines: 247
X-NewsSoftware: GRn 1.16f (10.17.92) by Mike Schwartz & Michael B. Smith

In article <1993Jan1.141207.20262@mpifr-bonn.mpg.de> mlelstv@speckled.mpifr-bonn.mpg.de (Michael van Elst) writes:
> In <doiron.0kd5@starpt.UUCP> doiron@starpt.UUCP (Glenn Doiron) writes:
>
> >>
> >> Never said that you get an advantage in this situation. Here it is a tie.
>
> >It's a tie if you're using special hardware, with a 32-point wide vertical
> >line which happens to lie entirely within the span of a single longword.
> >Wow, it was a tie. Now, how about some more realistic examples?
>
> We were talking about blitting a rectangle.

... and you still haven't convinced me that chunky isn't faster. As was
said before, the best you can hope for is break-even, and far more often
chunky will be faster - the bigger the blit, the smaller the difference.

> >> >> Simple. The display hardware to _show_ bitplanes needs additional
> >> >> shift registers and buffers which makes the hardware more expensive.
> >>
> >> >Nope. RAMDACs show the display, buddy. Not the TI chip. And getting a
> >> >planar display out of VRAM is a trick I'd like to see... well it could be
> >> >done, if you want to store one plane in each VRAM, then muck about with the
> >> >hardware, majorly.
> >>
> >> Sure, RAMDACs generate the analog signal. RAMDACs however use chunky pixels
> >> (they need all bits of a pixel to generate the analog signal for the pixel).
>
> >So what, planar systems still need all bits of a pixel to generate the
> >pixel, too. Your point?
>
> Read again.
> I said one would need additional shift registers and buffers
> for the display hardware. You denied that, saying the RAMDAC would do it.
> I pointed out that all RAMDACs expect chunky pixels which can't be delivered
> directly by a plane oriented memory system. So you need _shift registers_
> and _buffers_.

OK. Point granted, I took that statement the wrong way.

> >> So if you have a plane oriented memory (which is no real problem with VRAMs
> >> although you may get some alignment restrictions) then you have to convert
> >> it in hardware. This needs _shift registers_ and _buffers_. An example for
> >> such a hardware is the Denise and the Lisa chip (both also have a priority
> >> logic and Denise includes the RAMDAC).
>
> >... which use DRAMs, not VRAMs. No amount of 'alignment restrictions' is
> >going to cut it for VRAM planar, unless, as I said, you do funky
> >address/data schemes and put one plane in each VRAM, then mess further with
> >your RAMDAC feeding setup. Using VRAMs like DRAMs will result in a loss of
> >all the performance gains VRAMs were supposed to give you in the first
> >place.
>
> Rubbish. VRAMs simply give you a second I/O port for sequential output.
> The advantage is that all the display refresh accesses do not stall
> the first port used by the rendering processor (CPU, blitter, what else).
> This gives you twice the bandwidth at 130% the cost for this special
> application.

As I said, you lose all the performance gains VRAMs were supposed to give
you in the first place, since you can't use the serial registers for video
fetch in planar setups.

> >> >> 32x32 pixels (well-aligned) with 8 bitplanes need 2x32x8 fetches and stores.
> >> >> (2 words per line, 32 lines, 8 planes). In chunky you have 16x32 fetches
> >> >> and stores (16 words per line, 32 lines).
> >>
> >> >Once again, you are considering the best, break-even case, where all your
> >> >data is perfectly sized and aligned.
> >> >If you're going to be doing just
> >> >that, why bother with bitmap displays at all, when a character based
> >> >display will do just as well?
> >>
> >> Because we do not talk about _text_ displays. You are right that we get
> >> some inefficiencies with non-aligned data. But so with chunky pixels
> >> if your memory isn't pixel oriented (f.e. if the memory is 16bit wide
> >> you "waste" bandwidth on odd addresses).
> >>
> >> Thus bitplane devices are usually same speed as chunky pixel devices. But
> >> you can scale bitplane devices easily with the obvious performance advantages.
> >> You can never scale chunky pixel devices.
>
> >"Scale"?? Remember, scaling is at least (in this example) 8x slower in
> >planar than chunky.
>
> I'm talking about "scaling" the display to different display depths.

Chunky has this property too, although only to a power of two. This,
however, is not nearly good enough justification for all the penalties
you get in planar doing everyday operations. Why use fewer colors when
you can use more for the same, if not better, performance?

> And as I also pointed out, bitmap scaling (that's what you thought of)
> is about same speed for chunky and bitplanes.

No. It is not. You are doing N times the amount of work in planar, where
N is the number of planes.

> >> >Oh, obviously. If you can make this magical piece of hardware, I'm sure
> >> >Commodore has a job position open for you. Or any other number of chip
> >> >designers.
> >>
> >> I could probably make that from discrete parts or FPGAs but I have no
> >> experience in doing ASICs or even full custom designs.
>
> >Why hasn't someone done it already? Mustn't be very cost-effective.
>
> Correct. Bitplanes have the additional burden to feed a RAMDAC still
> with chunky pixels. Read what I said above. If someone came out with
> a RAMDAC that includes the necessary logic the price difference would
> become much smaller.

Best of luck to you.
You are talking really expensive hardware now. Not something you'd tack
onto an Amiga, or anything, since chunky is here now, costs less, and
doesn't have any of the deficiencies your special-purpose hardware would
address.

> >Rubbish. Show me a piece of code that can do that. More special-case
> >optimizations? (sigh...)
>
> That's not really "special case". It is _optimization_, you need to use
> more complicated algorithms with a planar system. I always said: chunky
> is _simpler_ to use.

Nope. "Special case". You can't use the same algorithm to stretch that
32x32 image into, say, 64x57 or shrink it to 17x13. You might be able to
do 64x32 or 32x64. But that's still a special case optimization.

> >But you're saying chunky->bitplane conversion is a small fraction of the
> >whole operation. It's not. That ridiculous amount of masking/shifting
> >might even take longer than the matrix multiplications. For instance, on a
> >68000, each shift takes *2* additional clock cycles! (yuck!)
>
> Then do not use shifts on 68000. Seems that you talk about advantages
> and disadvantages when using a 68000 processor. Don't forget that a
> 68000 is too slow in any case. That's why the Amiga has a blitter.

Shifting is too slow on almost any general purpose processor, not just
the 68000. Regardless, you need to do a lot of shifting/masking, which
isn't going to be free on *any* processor, although custom chips can go a
long way towards hiding this via pipelining/superpipelining.

> >> bitmap scaling can be done on a per-plane basis.
>
> >As I said, this is 8x the amount of work chunky has to do. (You're doing
> >it a bit at a time instead of a byte at a time.)
>
> YOU may do it that way. I simply use a table that maps several pixels
> to their scaled counterparts.

Can you use that table to do another transformation, as noted above? No.
It's a special case optimization.
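
To make the "a bit at a time instead of a byte at a time" point concrete,
here is a minimal sketch of storing a single pixel in each layout. It
assumes an 8-bit chunky buffer, byte-wide bitplanes with the leftmost
pixel in the high bit, and a width that is a multiple of 8. The function
names are made up for illustration and are not from any Amiga library.

```python
def put_pixel_chunky(buf, width, x, y, color):
    """Store one 8-bit pixel; returns the number of byte writes (always 1)."""
    buf[y * width + x] = color & 0xFF
    return 1


def put_pixel_planar(planes, width, x, y, color):
    """Store one pixel across all bitplanes; returns the number of
    read-modify-write operations (one per plane)."""
    offset = y * (width // 8) + (x >> 3)   # byte holding this pixel's bit
    mask = 0x80 >> (x & 7)                 # leftmost pixel is the high bit
    for plane_index, plane in enumerate(planes):
        if (color >> plane_index) & 1:
            plane[offset] |= mask          # set this plane's bit
        else:
            plane[offset] &= ~mask & 0xFF  # clear this plane's bit
    return len(planes)


def get_pixel_planar(planes, width, x, y):
    """Gather one bit from each plane back into a color number."""
    offset = y * (width // 8) + (x >> 3)
    shift = 7 - (x & 7)
    color = 0
    for plane_index, plane in enumerate(planes):
        color |= ((plane[offset] >> shift) & 1) << plane_index
    return color


# Usage: a 16x4 display, 8 bits deep, in both layouts.
WIDTH, HEIGHT, DEPTH = 16, 4, 8
chunky = bytearray(WIDTH * HEIGHT)
planes = [bytearray((WIDTH // 8) * HEIGHT) for _ in range(DEPTH)]
chunky_writes = put_pixel_chunky(chunky, WIDTH, 5, 3, 0xA5)
planar_writes = put_pixel_planar(planes, WIDTH, 5, 3, 0xA5)
```

In this sketch the chunky store is one write, while the planar store is
one masked read-modify-write per plane (8 here). Real hardware or a
blitter can hide part of that cost, so read it as an operation count,
not a timing.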
> >You gave enough examples where special vaporware or special-case
> >optimizations might make them break even with chunky's results without
> >any optimization whatsoever.
>
> My optimizations (which are possible for nearly any operation) show that
> bitplanes break even with chunky's results. That's what I was trying to
> show you. Both are about _same_ speed when handling the same amount of
> data. But bitplanes can be scaled in depth easily while chunky pixels
> cannot. This happens to be a slight advantage for chunky when you use
> 8bit, 16bit or 32bit displays.

I'm not going to argue with you about depth scaling. Given the same
depth, chunky will be faster for all of the above operations, with the
exception of a few break-even cases. Sometimes chunky will be faster
*even with a reduced depth for planar*, since chunky doesn't need
'bit thrashing'.

> >Oh, yeah. Right. Most graphics operations consist of 32-bit wide
> >vertical lines, coincidentally placed on a long-word boundary. Most
> >operations consist of exactly doubling the size of horizontal/vertical
> >dimension. Most operations consist of blitting data that's perfectly sized
> >and aligned to a perfectly aligned place on the screen. Yep, that's all of
> >them. I can't think of any other graphics operations you can do.
>
> Most time consuming graphics operations consist of modifying larger areas,
> drawing diagonal lines or lots of number crunching to generate an image.
>
> In the first case bitplanes have an advantage when the image depth is
> different from the memory architecture and a slight disadvantage when
> the image directly maps to the memory architecture (f.e. 8bits/pixel).
>
> In the second case you have a major advantage for chunky pixels. But
> the fact that lines cover less memory than filled areas and some
> optimizations possible with non-vertical lines make this advantage less
> important.
>
> In the third case the rendering speed isn't important as the calculations
> outweigh any efficiency differences.

In all cases, you've admitted that chunky is faster than planar (w/
exception of your special case optimization where it breaks even). I
rest my case.

> >Suppose you have hundreds, thousands of blits? In general the speed of
> >blits isn't that critical for most productivity software, but
> >graphically-oriented software will benefit greatly from chunky
> >architecture.
>
> Not at all.

Do a histogram of a chunky and a planar display. Tell me how many pixels
of each color are on the screen. Yet another situation where chunky
wins, bigtime.

> >On WHAT??? Where you live? If you wear underoos? WHAT???!?? <foaming at
> >the mouth>
>
> Think of rendering a 2 color horizontal line (the base case for chunky).

Would help if you didn't delete too much from the above references.
Plotting an alternating color horizontal line will still be (w/ 8-bit
color) 8x faster on chunky than planar. Plotting every other pixel a
different color isn't going to make planar faster.

> >Oh, I see. More special case optimizations. So you SAY it's a 256-color
> >display- BUT you guarantee the top 2 bitplanes are already clear?
>
> I don't say it is a 256 color display. That's the major point of bitplanes.
> It can be a 2,4,8,16,32,64,128 or 256 color display and the layering software
> will take into account what you need. Clearing the unused bitplanes (which
> happens if you have several depths in a single viewport) is done _once_
> for each newly created layer.

:) Look, if you want a 128-color display, I won't forbid you from using a
planar architecture. However, drawing a line will be far faster on a
256-color chunky than a 128-color planar display.
If you need a 4,16,256,65536, or true-color display, however, your losses
using planar will be even more dramatic as the number of colors goes up,
since you are wasting 2x, 4x, 8x, 16x, or 32x the amount of time
accessing irrelevant data (still not counting shifting/masking).

[another fragment of agreement - color expansion]

> Regards,
> --
> Michael van Elst
> UUCP: universe!local-cluster!milky-way!sol!earth!uunet!unido!mpirbn!p554mve
> Internet: p554mve@mpirbn.mpifr-bonn.mpg.de
> "A potential Snark may lurk in every tree."

Glenn Doiron
[who thinks planar is fine for dual-playfields in limited video
bandwidth, but would rather have separated dual-chunky playfields
instead.]
--
Amiga UUCP+
Origin: uunet.uu.net!starpt.UUCP!doiron (Organization:68K Software Development)
BIX: gdoiron
** Not enough memory to perform requested operation. Add 4 megs and retry.
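
For anyone checking the fetch counts argued over in this thread, here is
a rough sketch that counts how many memory words a rectangle touches in
each layout. It assumes 16-bit words, rows starting on a word boundary,
and a chunky depth that divides the word size evenly. The function name
and its parameters are hypothetical. It counts words only and ignores
the masking and shifting cost.

```python
def words_touched(x, width, height, depth, chunky, word_bits=16):
    """Count memory words covered by a width x height rectangle at column x.

    Chunky: each pixel occupies `depth` consecutive bits in one buffer.
    Planar: each pixel occupies 1 bit in each of `depth` separate planes,
    so the same span of words is visited once per plane.
    """
    bits_per_pixel = depth if chunky else 1
    passes = 1 if chunky else depth
    first_word = (x * bits_per_pixel) // word_bits
    last_word = ((x + width) * bits_per_pixel - 1) // word_bits
    words_per_row = last_word - first_word + 1
    return words_per_row * height * passes


# The aligned 32x32, 8-bit case from the thread: 2x32x8 versus 16x32.
aligned_planar = words_touched(0, 32, 32, 8, chunky=False)   # 512
aligned_chunky = words_touched(0, 32, 32, 8, chunky=True)    # 512

# The same rectangle shifted 5 pixels to the right.
shifted_planar = words_touched(5, 32, 32, 8, chunky=False)   # 768
shifted_chunky = words_touched(5, 32, 32, 8, chunky=True)    # 544

# A 1-pixel-wide vertical line, 32 pixels high.
line_planar = words_touched(5, 1, 32, 8, chunky=False)       # 256
line_chunky = words_touched(5, 1, 32, 8, chunky=True)        # 32
```

Under these assumptions the aligned blit is the tie both sides agree on,
the unaligned blit costs planar an extra word per row per plane, and the
vertical line shows the 8x gap.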