- Xref: sparky comp.windows.x.pex:509 comp.graphics:9279 comp.sys.sgi:13058
- Newsgroups: comp.windows.x.pex,comp.graphics,comp.sys.sgi
- Path: sparky!uunet!nntp1.radiomail.net!fernwood!shograf!jimb
- From: jimb@shograf.com (Jim battle)
- Subject: PEX data reformatting issue (was: Future development of PEX/PHIGS)
- Reply-To: jimb@shograf.UUCP (Jim battle)
- Organization: SHOgraphics
- Date: Sun, 30 Aug 1992 11:30:52 GMT
- Message-ID: <1992Aug30.113052.9729@shograf.com>
- Keywords: PEX, OpenGL
- References: <5818@m1.cs.man.ac.uk> <or5vc3k@fido.asd.sgi.com> <1992Aug24.190727.19006@shograf.com> <p1f880s@fido.asd.sgi.com>
- Sender: Jim Battle
- Lines: 99
-
- Concerning the OpenGL vs PEX data reformatting issue, I'd like to reply
- to some of Allen Akin's responses to an earlier posting I'd done, which
- was in response to a posting of his ...
-
- I (jimb@shograf.com) said:
- | If you spend
- | 2us per vertex (to make up a juicy number) calculating forces, spending
- | another 25ns to perform an extra write to memory to build the data
- | structure that PEX wants and another 25ns to read the data item,
- | is insignificant (4%).
- ...
- | Looking at the wide variety of prices, performances, and
- | price/performances on the market, I don't think that this issue
- | will affect things enough to make or break anyone.
-
- Allen Akin (akin@sgi.com) said:
- | Your question implies that you want to move to a more realistic level
- | of detail for the tetrahedral mesh example. To do so, I'd want to know
- | a few more pieces of information. 25ns for a write all the way through
- | to main memory seems low to me; are you assuming RAMBUS or synchronous
- | DRAM technology? What are their access cycles like for random and
- | sequential addresses? How about overhead cycles to acquire and release
- | the main memory bus, and to fill the other words in a cache line?
- ...
- (sample program to test real system bus copy bandwidth)
- ...
- | When I run this on my R3000 Indigo, it reports "14.0M B/s".
-
- I took your program and modified it to get rid of the timer stuff, since
- as a terminal we don't have UNIX-style timer commands. I also had to
- increase the NITERATIONS constant to 500 to get a better timing. Anyway,
- our machine (i860 based) measured a copy bandwidth of 52.6 MB/s (a gross
- bandwidth of about 105MB/s, counting both the reads and the writes).
- Theoretically, the i860XR's bus bandwidth is
- 160MB/s at 40MHz. We have no external cache, only the internal 8K data
- and 4K instruction caches.
-
- I realize that what you are talking about is a client problem, and as
- a terminal maker we can't do anything to address that, but I'm sure
- that if we lost our minds and put out a major software effort we could
- just as well be a UNIX box (or you could strip the disk drive and all the
- UNIX out of an Indigo), so I'll put forth my comment anyway.
-
- I feel that the 3.75x system bus bandwidth difference between a
- SHOgraphics PEXStation and an SGI Indigo would make a bigger difference
- in system performance than having a funky user-accessible DMA capability.
- I agree with you that for some cases the OpenGL approach is a significant
- win for a specific machine, but I remain unconvinced that it matters that
- much in general.
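
The figures above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and it assumes that a memory-to-memory copy reads and writes each byte exactly once, so the gross bus traffic is twice the copy rate.

```c
#include <assert.h>

/* Gross bus traffic for a memory-to-memory copy, in the same units as
 * the input. Assumption: every copied byte is read once and written
 * once, and no other traffic (e.g. cache-line fills) is counted. */
double gross_bandwidth(double copy_mb_per_s)
{
    assert(copy_mb_per_s >= 0.0);
    return 2.0 * copy_mb_per_s;
}

/* Ratio of two measured copy bandwidths (both in the same units). */
double bandwidth_ratio(double a_mb_per_s, double b_mb_per_s)
{
    assert(b_mb_per_s > 0.0);
    return a_mb_per_s / b_mb_per_s;
}
```

With the numbers quoted in this thread, gross_bandwidth(52.6) comes to about 105MB/s, and bandwidth_ratio(52.6, 14.0) to a little over 3.75.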
-
-
- As to your specific comments and questions:
-
- | 25ns for a write all the way through to main memory seems low to me
-
- In our case, the 105MB/s rate would imply that it takes about 40ns
- to write a long (32-bit) value. This is true if all you are doing is
- writing longs out to memory. If you are actually doing some calculation
- to produce your vertex information, the processor can post the write in
- 25ns and get on with life. (If conditions are right, a single instruction
- can push 16 bytes out to the memory interface in 25ns concurrent with
- another internal instruction.) With luck, the write has been processed
- by the time the next vertex is ready to be written out. Likewise, reads
- on an i860 bus are pipelined and more than one request can be outstanding.
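
As a rough check on those two timings, here is a small sketch. The helper names are hypothetical, and it assumes 1 MB = 10^6 bytes. The 25ns figure matches one clock period at 40MHz, the clock rate quoted for the i860XR elsewhere in this post.

```c
#include <assert.h>

/* Nanoseconds needed to move `bytes` at a sustained rate of `mb_per_s`.
 * Assumption: 1 MB = 10^6 bytes, so bytes / (MB/s) is in microseconds
 * and multiplying by 1000 converts it to nanoseconds. */
double ns_per_transfer(double bytes, double mb_per_s)
{
    assert(mb_per_s > 0.0);
    return bytes * 1000.0 / mb_per_s;
}

/* Length of one clock cycle in nanoseconds for a clock given in MHz. */
double clock_period_ns(double mhz)
{
    assert(mhz > 0.0);
    return 1000.0 / mhz;
}
```

ns_per_transfer(4.0, 105.0) gives roughly 38ns per 32-bit write, which is the "about 40ns" figure, and clock_period_ns(40.0) gives 25ns.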
-
- | are you assuming RAMBUS or synchronous DRAM technology?
-
- No, we just use the 64-bit i860 interface to ordinary DRAM.
-
- Note that Intel makes a 50MHz i860XP processor; not only is the basic
- bus cycle time faster by 25%, but it supports a burst-mode interface,
- allowing 8-byte transfers to occur on every clock cycle (instead of
- every other cycle on the i860XR). The theoretical bus bandwidth of a
- 50MHz i860XP is 400MB/s.
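
The theoretical bandwidth numbers follow directly from the bus width, the clock rate, and how many clocks each transfer takes. A minimal sketch, with a hypothetical function name and assuming 1 MB = 10^6 bytes:

```c
#include <assert.h>

/* Peak bus bandwidth in MB/s (1 MB = 10^6 bytes), assuming one transfer
 * of `bytes_per_transfer` completes every `clocks_per_transfer` cycles
 * of a `clock_mhz` clock. */
long peak_bus_mb_per_s(long clock_mhz, long bytes_per_transfer,
                       long clocks_per_transfer)
{
    assert(clock_mhz > 0);
    assert(bytes_per_transfer > 0);
    assert(clocks_per_transfer > 0);
    return clock_mhz * bytes_per_transfer / clocks_per_transfer;
}
```

peak_bus_mb_per_s(40, 8, 2) gives 160 for the 40MHz i860XR (8 bytes every other cycle), and peak_bus_mb_per_s(50, 8, 1) gives 400 for the 50MHz i860XP in burst mode.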
-
-
- Allen Akin (akin@sgi.com) said in the same article:
- | Even after we settle on these details, it may still turn out that, for
- | the given example on a given machine, data reformatting is a small
- | percentage of total cycles. That's true for some cases, and past
- | experience indicates that it has been false in others. (See comments
- | above.) Since this uncertainty exists, why not choose a 3D graphics
- | interface that allows you to reduce bus bandwidth requirements in the
- | cases where that results in performance or price improvement?
-
- I guess political reasons. As you know, when DEC started the process of
- creating PEX, there was no such thing as OpenGL. If OpenGL had actually
- existed a year ago I'd bet that we'd be shipping OpenGL terminals today.
- But OpenGL didn't exist until DEC/E&S/IBM/Sun/HP/Convex and a handful
- of other companies decided to make PEX the open graphics standard. I
- don't think it was a coincidence that SGI announced OpenGL when it did.
-
- Another reason is that the sample implementation PEX server source code
- requires only ftp access. To get (admittedly much more complete, yet
- still probably not commercially viable) OpenGL server source code, you
- need to license it from SGI for $100K.
-
-
- Thanks for your response; sorry if I've mangled the content of your
- arguments by excerpting it this way.
-