- Path: sparky!uunet!olivea!bunker!nuconvex!starpt!doiron
- From: doiron@starpt.UUCP (Glenn Doiron)
- Newsgroups: comp.sys.amiga.programmer
- Subject: Re: Chunky Pixels vs. Bitplanes (was: Chunky Chip Set...)
- Message-ID: <doiron.0ka3@starpt.UUCP>
- Date: 30 Dec 92 22:05:30 GMT
- References: <Karsten_Weiss.0n2o@ibase.stgt.sub.org> <1hbngoINNglt@uwm.edu> <1992Dec30.115759.22097@mpifr-bonn.mpg.de>
- Organization: 68K Software Development
- Lines: 203
- X-NewsSoftware: GRn 1.16f (10.17.92) by Mike Schwartz & Michael B. Smith
-
- In article <1992Dec30.115759.22097@mpifr-bonn.mpg.de> mlelstv@speckled.mpifr-bonn.mpg.de (Michael van Elst) writes:
- > In <doiron.0k7v@starpt.UUCP> doiron@starpt.UUCP (Glenn Doiron) writes:
- > >> In <doiron.0k4a@starpt.UUCP> doiron@starpt.UUCP (Glenn Doiron) writes:
- > >> > 32-point line,
- > >> > 2 pixels thick: 512 reads 512 writes 64 writes
- > >> > (Why are bitplanes better for this? Seems like an additional 960 memory
- > >> > accesses to me.)
- > >>
- > >> It would be still 256 reads 256 writes unless the line spans over a word
- > >> boundary. Moreover, shallow lines will need fewer reads and writes.
- >
- > >the same word boundary", you are talking special hardware. BTW you are
- > >still confusing the issue; you are comparing a 32-bit cpu access planar system
- > >to an 8-bit cpu access chunky system.
- >
- > A 32 point line with 8 bitplanes needs 32 * 8 reads and writes _worst_case_.
- > A two pixel wide line will need the same if all pixels are in one memory word
- > (bytes, words,longwords whatever).
-
- Once again, you are talking about special hardware. Even then, chunky can
- have the same optimization. Where is this awesome planar advantage?
- Chunky is still going to take 32 writes, max.
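To make the access counts in this exchange concrete, here is a rough sketch in C of a single-pixel plot on an 8-bitplane display versus an 8-bit chunky display, with counters for display-memory reads and writes. The screen geometry, the byte-wide accesses, and every name in it (`plot_planar`, `plot_chunky`, `vram_reads`, `vram_writes`) are assumptions made for illustration. It models a plain CPU loop with no blitter or special hardware.

```c
#include <stddef.h>
#include <stdint.h>

enum { WIDTH = 320, HEIGHT = 200, DEPTH = 8 };      /* assumed screen geometry */

static uint8_t planar[DEPTH][(WIDTH / 8) * HEIGHT]; /* one 1-bit bitmap per plane */
static uint8_t chunky[WIDTH * HEIGHT];              /* one byte per pixel */
static unsigned long vram_reads, vram_writes;       /* display-memory access counters */

static void reset_counters(void) { vram_reads = vram_writes = 0; }

/* Planar plot: read-modify-write one byte in each of the DEPTH planes. */
static void plot_planar(int x, int y, uint8_t color)
{
    size_t  off  = (size_t)y * (WIDTH / 8) + (size_t)(x >> 3);
    uint8_t mask = (uint8_t)(0x80u >> (x & 7));     /* leftmost pixel is the MSB */

    for (int p = 0; p < DEPTH; p++) {
        uint8_t v = planar[p][off];
        vram_reads++;
        if ((color >> p) & 1u)
            v |= mask;                              /* set this pixel's bit */
        else
            v &= (uint8_t)~mask;                    /* clear it, keep the neighbors */
        planar[p][off] = v;
        vram_writes++;
    }
}

/* Chunky plot: the pixel is a whole byte, so one write and no read. */
static void plot_chunky(int x, int y, uint8_t color)
{
    chunky[(size_t)y * WIDTH + (size_t)x] = color;
    vram_writes++;
}

/* Reassemble a pixel from the planes (not counted; only for checking). */
static uint8_t read_planar(int x, int y)
{
    size_t  off = (size_t)y * (WIDTH / 8) + (size_t)(x >> 3);
    uint8_t c   = 0;

    for (int p = 0; p < DEPTH; p++)
        if (planar[p][off] & (0x80u >> (x & 7)))
            c |= (uint8_t)(1u << p);
    return c;
}
```

Plotting 32 points along a diagonal with these routines costs 32 x 8 reads and 32 x 8 writes in the planar case, and 32 writes in the chunky case, which are the worst-case figures quoted above. A shallow line that keeps several pixels in one byte could batch the planar accesses, but that takes extra logic this sketch does not attempt.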
-
- > Where do I refer to a 32bit CPU ??? Where do I even _talk_ about a CPU ?
-
- Well, I was referring to your double-pixel line timings. Once again, I
- wasn't taking into account your special hardware.
-
- > >> However, your line drawing routine will need cycles itself. So unless
- > >> your rendering is done in hardware the difference gets smaller.
- >
- > >TMS340x0's have Bresenham's line-drawing algorithm embedded in hardware.
- > >They also support chunky-pixel displays. They don't support planar
- > >displays. I wonder why. Oops, I strayed from the topic too, sorry...
- >
- > Simple. The display hardware to _show_ bitplanes needs additional
- > shiftregisters and buffer which makes the hardware more expensive.
-
- Nope. RAMDACs show the display, buddy. Not the TI chip. And getting a
- planar display out of VRAM is a trick I'd like to see... well it could be
- done, if you want to store one plane in each VRAM, then muck about with the
- hardware, majorly.
-
- > >> > 32x32 blit: 1536 reads 512 writes 1024 reads, up to 1024 writes
- > >> > or up to 2304 reads 768 writes
- > >> > (second case if either source or dest. non-aligned)
- > >> > (including transparency/masking)
- > >>
- > >> 32x32 blit is 512 reads/writes. more for non-aligned objects. chunky
- > >> needs 512 reads/writes and a little more for non-aligned objects.
- > >(I was using the Amiga's blitter as a reference here)
- > >> (Both do word accesses).
- >
- > >Here you go again. First you're talking about 16-bit accesses, now 32-bit
- > >accesses. Please compare planar 32-bit vs. chunky 32-bit, not planar
- > >32-bit vs. chunky 8-bit, as you seem oblivious to the differences. In any
- > >case, the planar system will almost always lose for two reasons:
- >
- > 32x32 pixels (well-aligned) with 8 bitplanes need 2x32x8 fetches and stores.
- > (2 words per line, 32 lines, 8 planes). In chunky you have 16x32 fetches
- > and stores (16 words per line, 32 lines).
-
- Once again, you are considering the best, break-even case, where all your
- data is perfectly sized and aligned. If you're going to be doing just
- that, why bother with bitmap displays at all, when a character based
- display will do just as well?
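The aligned versus non-aligned blit figures can be checked with a little arithmetic. The sketch below counts how many 16-bit words a 32x32 blit touches, given how many pixels fit in one word (16 per plane word in planar, 2 in 8-bit chunky). The helper `blit_words` and its parameters are hypothetical. It counts each destination word once, so a real blit that also reads a source and a mask would multiply these numbers.

```c
/* Number of memory words covered by a blit that is w pixels wide and
 * `rows` lines high, starting at pixel column x.
 * pixels_per_word: 16 for a 16-bit plane word, 2 for 8-bit chunky pixels.
 * planes: 8 for an 8-bitplane display, 1 for chunky. */
static unsigned blit_words(unsigned x, unsigned w, unsigned rows,
                           unsigned pixels_per_word, unsigned planes)
{
    unsigned first = x / pixels_per_word;            /* word holding the left edge  */
    unsigned last  = (x + w - 1) / pixels_per_word;  /* word holding the right edge */

    return (last - first + 1) * rows * planes;
}
```

With x = 0 both layouts come out at 512 words (2 x 32 x 8 planar, 16 x 32 chunky), the break-even case. Shift the blit one pixel to the right and planar spills into a third word on every line of every plane (3 x 32 x 8 = 768), while chunky picks up one extra word per line (17 x 32 = 544).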
-
- > I didn't ever say anything about 32bit accesses nor do I refer to 8bit
- > chunky accesses. Both used 16bit transfers.
- >
- > > 1 Chunky doesn't need to do a write if this is a transparent pixel.
- > > 2 Chunky doesn't lose time if the data is not perfectly/wholly aligned.
- >
- > Chunky _needs_ to write transparent pixels and _needs_ to care about
- > alignment _if_ you do memory access larger than a pixel. You either
- > have to use _8bit_ accesses or use the complete buswidth with the
- > same constraints as in bitplanes.
-
- Do you know what data port sizes are? Even if that was the case, worst
- case chunky could be blowing, say, 3/4 of its time because only one pixel
- was needed. Planar will be wasting 31/32 of its time worst case.
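The 3/4 and 31/32 figures fall out of how many pixels share one memory word. A minimal sketch, assuming a 32-bit data port and 8-bit chunky pixels (the port width is an assumption; the post only says "say, 3/4"); `wasted_on_one_pixel` and `struct fraction` are illustrative names.

```c
struct fraction { unsigned num, den; };

/* Fraction of a full-width access spent on pixels that were not wanted
 * when exactly one pixel in the word needs changing.
 * bits_per_pixel_in_word: 8 for 8-bit chunky, 1 for a single bitplane. */
static struct fraction wasted_on_one_pixel(unsigned port_bits,
                                           unsigned bits_per_pixel_in_word)
{
    struct fraction f;

    f.den = port_bits / bits_per_pixel_in_word;  /* pixels sharing the word */
    f.num = f.den - 1;                           /* all but the one we want */
    return f;
}
```

For a 32-bit port this gives 3/4 for chunky (4 pixels per word) and 31/32 for planar (32 pixels per plane word), and the planar cost is paid once per bitplane.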
-
- > >Here's the point: <<EACH SYSTEM HAS TO MOVE THE SAME AMOUNT OF DATA>>,
- > >but <<PLANAR MUST AFFECT ANY ADJACENT PIXELS IN THE BOUNDARY WHILE CHUNKY
- > >DOES NOT>>. (once again ignoring the shifting/masking aspects which you
- > >seem to patently ignore as if they were of no consequence at all)
- >
- > Shifting/Masking is a no-time operation in hardware. If you talk about
- > general purpose CPUs then please account for instruction fetches as well.
-
- Oh, obviously. If you can make this magical piece of hardware, I'm sure
- Commodore has a job position open for you. Or any other number of chip
- designers.
-
- > >So how is doing 8x the number of mathematical transformations possibly
- > >going to be faster, not to mention the masking/bit shifting? And 'table
- > >look-up' isn't anywhere close to general purpose. Once again, you're
- > >talking about a special-case optimisation, one that can also be done with
- > >chunky pixels just as well.
- >
- > You were referring to the _special_ case of expanding a bitmap by the factor
- > of 2. I gave an example.
- >
- > You also do not use 8x the number of mathematical transforms.
-
- You are doing the same transformations 8 times, once for each bitplane.
- This is not 8x the number of transformations? You deny facts?
-
- > There is either the application of having numerical data to transform
- > and display (remember: bitplanes are just for _display_ operations).
- > There you need the chunky->bitplane conversion but which is only
- > a small fraction of the _whole_ operation.
-
- With this highly magical vaporhardware you keep falling back on,
- chunky->bitplane conversion should take no time at all!
-
- > And when operations in the bitplane domain are feasible (i.e. when the
- > numerical _value_ of a pixel is of no concern)
-
- huh? How about an example to un-confuse me?
-
- > then you can operate
- > on individual bitplanes and get _multiple_ pixels per memory fetch
- > which _exactly_ outweighs the factor of having _multiple_ bitplanes.
- > As I said, in the end the memory bandwidth determines the speed of
- > your operation and bitplanes do not use more memory than chunky pixels.
-
- I don't deny that bitplanes don't use more memory than chunky pixels, it's
- just that 90% of all operations are going to be faster on chunky pixels.
-
- > >Planar's losses go up directly as your blit gets larger. The overhead for
- > >the mathematical xforms is the same, but with what you're saying, planar
- > >will take N times longer, where N is the number of bitplanes.
- >
- > Which is complete garbage. In bitplanes you write your transforms to handle
- > several pixel bits at _once_.
-
- Some magical software to go with that magical hardware? Or more
- special-case optimizations?
-
- > >Bitplanes will be faster by 12.5% whenever you have data that is perfectly
- > >aligned, perfectly sized, and special hardware to do those shifting and
- > >masking operations which you seemingly dismiss as taking no cpu-time
- > >whatsoever {evil grin}. Unfortunately, this is seldom the case, and even
- > >in 7-bit color chunky will win in most of the operations being discussed.
- >
- > Why do you deny the facts ?
-
- I don't. However, you seem oblivious to the drawbacks of pixel-based
- operations in a planar display. Even blit operations will almost always
- take more time on planar displays.
-
- > >Now, if you are talking about copying entire screens of data, like ANIM,
- > >yes, planar can take that 12.5% speedup. But most graphic operations will
- > >do much better on planar than chunky.
- >
- > Correct :) Most graphic operations do much better on planar than chunky :)
-
- Tip of the slounge :)
-
- > >Tell you what. Code a routine that
- > >will plot a line from (X1,Y1) to (X2,Y2) in color C on an 8-bitplane 32-bit
- > >access system. Now do the same thing for an 8-bit chunky 32-bit access
- > >system. Look at the code. Tell me which one is doing (a *hell of a lot*)
- > >more calculations. Which one accessing video memory (a *hell of a lot*)
- > >more often. Now, tell me which one is going to run faster on *any*
- > >machine?
- >
- > As I said, drawing a single pixel wide line wins on chunky.
-
- Wow... agreement.. (even if slightly incorrect)
- However, drawing an n-pixel wide line also wins on chunky (where n>0 :)
-
- > Now render a 64 color image to your 256 color display :)
-
- See the above mentioned case of blits. The best you can hope for is a tie,
- since with planar you have to clear the top two bitplanes anyways.
-
- > Or render
- > a black/white text in your window on a 256 color desktop.
-
- See above. You still have to clear the other bitplanes. And some chunky
- hardware supports color expansion, too, where you take a single-bit image
- and can expand all 0's to one color and 1's to another. Of course, a
- similar thing could be done for planar, but you still have the
- masking/shifting/bandwidth on unaffected pixels problems. (Say it 3 times
- real fast.)
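Color expansion as described here is easy to sketch in software. The routine below takes one row of a 1-bit image and writes a foreground color for every 1 bit and a background color for every 0 bit into an 8-bit chunky row. The function name, the MSB-first bit order, and the byte-per-pixel destination are assumptions for illustration; real hardware that offers this feature has its own register interface, which is not modeled.

```c
#include <stddef.h>
#include <stdint.h>

/* Expand `width` pixels of a 1-bit-per-pixel row into 8-bit chunky pixels.
 * src: packed bits, leftmost pixel in the most significant bit.
 * dst: one byte per pixel, must have room for `width` bytes. */
static void expand_mono_row(const uint8_t *src, uint8_t *dst, size_t width,
                            uint8_t fg, uint8_t bg)
{
    for (size_t x = 0; x < width; x++) {
        unsigned bit = (src[x >> 3] >> (7 - (x & 7))) & 1u;
        dst[x] = bit ? fg : bg;   /* plain store, destination is never read */
    }
}
```

The destination is only ever written, never read back or masked, and pixels outside the expanded span are untouched. Doing the same on an 8-bitplane display means setting or clearing the corresponding bits in all eight planes.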
-
- > Regards,
- > --
- > Michael van Elst
- > UUCP: universe!local-cluster!milky-way!sol!earth!uunet!unido!mpirbn!p554mve
- > Internet: p554mve@mpirbn.mpifr-bonn.mpg.de
- > "A potential Snark may lurk in every tree."
-
- Glenn Doiron
- (yes, C.A., it's fun, because he's on my own turf, and I WILL fight with
- unarmed people)
- --
- Amiga UUCP+
- Origin: uunet.uu.net!starpt.UUCP!doiron (Organization:68K Software Development)
- BIX: gdoiron
- ** Not enough memory to perform requested operation. Add 4 megs and retry.
-