- Newsgroups: comp.arch
- Path: sparky!uunet!zaphod.mps.ohio-state.edu!uwm.edu!cs.utexas.edu!wotan.compaq.com!twisto.eng.hou.compaq.com!croatia.eng.hou.compaq.com!leigh
- From: leigh@croatia.eng.hou.compaq.com (Kevin Leigh)
- Subject: COMPAQ PROPOSED SCALABLE I/O ARCHITECTURE
- Message-ID: <1992Dec10.025609.5164@twisto.eng.hou.compaq.com>
- Summary: Cost effective, higher performance, processor-independent I/O scheme
- Keywords: I/O, Point-to-Point, High Performance, Low Cost, Bus
- Sender: news@twisto.eng.hou.compaq.com (Netnews Account)
- Organization: Compaq Computer Corp.
- Date: Thu, 10 Dec 1992 02:56:09 GMT
- Lines: 342
-
- ************************************************************
- * *
- * COMPAQ COMPUTER CORPORATION *
- * HOUSTON, TX *
- * *
- * STRATEGIC TECHNOLOGY DEVELOPMENT GROUP *
- * *
- * We're tired of beating our heads against the wall *
- * trying to expand the performance of I/O buses. Let's *
- * face it, wide buses are just not a long-term, cost- *
- * effective solution for high-performance computer *
- * systems. So here is an alternative... *
- * *
- ************************************************************
- * *
- * Compaq presented a proposal at the PCMCIA sub-committee *
- * (CardBus) meeting in Deerfield Beach, Florida, on *
- * 12/7/92 for possible adoption as the CardBus standard. *
- * *
- ************************************************************
-
- Our proposed I/O solution is
- - hierarchical
- - point-to-point
- - a channel-based I/O architecture
- - scalable
- - high performance
- - processor- and endian-independent
- - low cost
-
- In short, NOT yet-another shared-wide-bus!!!
-
-
- CPU
- |
- |---memory
- MIOC
- / \
- IOC dev3
- / \
- dev1 dev2
-
- Figure-1
-
- The basic I/O subsystem consists of a "main" I/O concentrator
- (MIOC) which interfaces the I/O subsystem to any CPU/Memory
- architecture [figure-1]. One or more point-to-point
- channels propagate in a hierarchical manner from the MIOC to
- the devices. One such device is an I/O Concentrator (IOC)
- which allows the bandwidth of a single point-to-point
- channel to be shared amongst several devices. Other devices
- might include network interfaces, graphics, SCSI, etc. A
- few very low bandwidth devices may optionally share the
- same channel in a controlled manner.
-
- A channel is an interconnect between two ports, one residing
- in a (M)IOC and the other in a device. A port closer to
- the CPU-memory complex is referred to as upstream, and a
- port further from it as downstream. The physical channel
- consists of 12 signals:
- - a 50 MHz synchronizing clock
- - two handshake signals
- - eight data signals, and
- - a parity signal
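- As a sketch of how the parity signal might be generated for each
- byte-wide transfer (the proposal does not specify odd or even
- parity, so even parity is assumed here):

```python
def parity_bit(data_byte):
    """Even-parity bit for one byte-wide transfer (assumption: the
    proposal does not state whether parity is odd or even)."""
    p = 0
    for i in range(8):
        p ^= (data_byte >> i) & 1  # XOR of all eight data bits
    return p

# 0b10110010 has four 1-bits, so the even-parity bit is 0
print(parity_bit(0b10110010))
```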
-
- The small number of signals on a channel reduces the pin
- requirements for interface ICs and the physical size of
- add-in boards and connectors, reducing the cost of
- implementation. Because all data transfers are byte-wide
- and sequenced in increasing address order, there is no
- little- or big-endianness in the data. Also note that the
- channel pin-out is NOT a derivative of any particular CPU
- chip's memory interface signals, so the solution is not
- tied to any specific CPU architecture.
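- The endian independence can be illustrated with a small sketch: a
- write packet carries the bytes at addresses A, A+1, ... in that
- order, so a little-endian and a big-endian host emit the identical
- byte stream for the same memory image (the function name here is
- illustrative, not part of the proposal):

```python
def channel_stream(memory, addr, size):
    """Byte-wide transfers in increasing address order: the channel
    carries memory[addr], memory[addr+1], ... one byte at a time,
    regardless of how the host CPU orders multi-byte words."""
    return bytes(memory[addr:addr + size])

# Contents of addresses 0..3; the wire order matches address order
image = bytes([0x12, 0x34, 0x56, 0x78])
print(channel_stream(image, 0, 4).hex())  # 12345678
```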
-
- The proposed solution minimizes "out-of-band" signals. All
- operations within the I/O subsystem are carried out via
- packets. Even the synchronizing channel clock is propagated
- from the MIOC to the devices through IOCs. In other words,
- the channel clock is redistributed on every IOC's downstream
- port. The signal timing of each channel is independent of
- all other channels. This virtually eliminates the signal skew
- problems, especially for the clock, common in today's shared
- bus systems.
-
- The standard data transfer rate on a channel is 50
- MBytes/sec (MBPS) and the upper-bound will be limited by the
- physical environment of the channel. For example, a current
- high performance implementation with GTL (Gunning Transistor
- Logic) drivers and receivers could yield a transfer rate of
- 200 MBPS.
-
- Two ports on a channel "agree" on a transfer rate during
- initialization. Each port multiplies the 50 MHz channel
- clock to generate internal clocks that operate at the agreed
- transfer rate. Each IOC port can operate at a different
- transfer rate. A device designed to support high transfer
- rates also works at lower transfer rates, which means a
- device can be used at any level of the I/O hierarchy. The
- I/O "tree" should be carefully configured to optimize the
- bandwidth distribution according to the system and device
- requirements.
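- The rate agreement can be sketched as follows, assuming the simple
- rule that the two ports settle on the highest rate both support
- (the proposal only says the ports "agree" during initialization):

```python
BASE_CLOCK_MHZ = 50  # the synchronizing channel clock

def negotiate_rate(upstream_max_mbps, downstream_max_mbps):
    """Settle on the highest transfer rate both ports support, and
    derive the internal clock multiplier from the 50 MHz channel
    clock (this negotiation rule is an assumption, not from the
    proposal)."""
    rate = min(upstream_max_mbps, downstream_max_mbps)
    multiplier = rate // BASE_CLOCK_MHZ
    return rate, multiplier

# A 200 MBPS-capable GTL device paired with a 100 MBPS IOC port
print(negotiate_rate(100, 200))  # -> (100, 2)
```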
-
-
- Operations occur using packet-based commands that flow
- through the channels. Each packet generally contains four
- fields:
- - command (Read, Write, Error, Extensions/Interrupts, etc.)
- - size (the amount of data to be transferred)
- - address
- - data
-
- Only the command field is mandatory. All ports are
- responsible for some encoding and decoding of packets. Note
- that only one packet at a time can be transferred on a
- channel. This means there will be times when
- - a device cannot send or receive because an IOC downstream port
- does not have space or data, respectively, and
- - both ports want to send at the same time.
- The channel protocol supports handshaking for data transfers
- as well as for conflict resolution.
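- The packet format above can be sketched as a data structure (field
- widths and encodings are not given in the proposal, so none are
- assumed here):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    """One channel packet. Only the command field is mandatory;
    size, address, and data appear as the command requires."""
    command: str                     # e.g. "Read", "Write", "Error"
    size: Optional[int] = None       # bytes to transfer
    address: Optional[int] = None
    data: Optional[bytes] = None

read_req = Packet(command="Read", size=4, address=0x1000)
print(read_req.command, read_req.size)
```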
-
- The commands are designed to allow optimal use of the
- distributed I/O bandwidth and to let devices maintain high
- bandwidths even when latencies are long. For example, the
- read operations are split transactions, where each read
- packet is later answered by a corresponding read-response
- packet.
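- The split-transaction behavior can be sketched as follows; matching
- a read-response to its read by a tag is an assumption here (the
- proposal only says each read packet is later answered by a
- corresponding read-response packet):

```python
class SplitReads:
    """Track outstanding split-transaction reads. The channel is
    free for other packets between a read and its read-response."""
    def __init__(self):
        self.outstanding = {}  # tag -> (address, size)

    def issue_read(self, tag, address, size):
        self.outstanding[tag] = (address, size)

    def complete(self, tag, data):
        address, size = self.outstanding.pop(tag)
        assert len(data) == size, "read-response size mismatch"
        return address, data

ch = SplitReads()
ch.issue_read(1, 0x1000, 4)
ch.issue_read(2, 0x2000, 2)         # a second read overlaps the first
print(ch.complete(2, b'\xab\xcd'))  # responses may return out of order
```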
-
- The proposed solution is designed to allow the CPU to
- directly read and write registers of MIOCs, IOCs, and
- devices in a programmed I/O manner. However, the
- architecture particularly lends itself to Command List
- Processing Master devices. These entities run device-
- specific command contexts placed in system memory that define
- their operation. This potentially frees the main CPU to
- perform other operations.
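- Command-list processing can be illustrated with a toy sketch; the
- command names and fields below are entirely hypothetical, since the
- proposal defines no context format:

```python
# A bus-master device walks a list of device-specific commands placed
# in system memory, freeing the CPU; "op", "address", and "size" are
# hypothetical illustrations, not part of the proposal.
command_list = [
    {"op": "transfer", "address": 0x8000, "size": 512},
    {"op": "transfer", "address": 0x9000, "size": 256},
    {"op": "stop"},
]

def run_command_list(cmds):
    moved = 0
    for cmd in cmds:
        if cmd["op"] == "stop":
            break
        moved += cmd["size"]  # the device performs the transfer itself
    return moved

print(run_command_list(command_list))  # total bytes moved: 768
```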
-
-
- ************************************************************
- * *
- * So, what do you think? We are interested in making *
- * this proposal an open Industry Standard. As such, you *
- * may request further information, give us your comments, *
- * or ask questions. *
- * *
- * *
- * Contact: *
- * *
- * David Wooten, Manager, *
- * Strategic Technology Development *
- * Compaq Computer Corporation, *
- * 20555, SH 249, *
- * Houston, TX *
- * *
- * Email: davidw@twisto.compaq.com *
- * Phone: (713)378-7231 *
- * Fax: (713)374-2580 *
- * *
- * *
- * *
- * CONTINUE READING IF YOU'RE INTERESTED IN WHY WE SPENT *
- * TIME DEVELOPING THIS CONCEPT!!! *
- * *
- ************************************************************
-
- The question we asked ourselves was how can one build an I/O
- subsystem that is high performance, easy to interface, easy
- to expand, and inexpensive?
-
- A shared wide-bus is not the solution due to its inherent
- limitations, such as distributed capacitance, one-at-a-time
- transfers, and high pin counts on the motherboard and add-in
- cards.
-
- In the past, shared bus architectures made sense because of
- TTL technology and the high cost and low integration of
- silicon. The PC architecture that started almost a decade
- ago with a single, shared, 8-bit memory-I/O bus (known as
- "ISA") has evolved into many variants due to the ever-
- escalating performance requirements of x86 based CPU
- systems. This phenomenon is also seen in other system
- architectures based on non-x86 CPUs.
-
- One of the first changes in PC evolution was to move the CPU
- and the memory to a faster proprietary local bus when the
- ISA bus became a bottleneck. The I/O data bus was widened
- to 16-bit and then to 32-bit to provide higher bandwidth
- for some devices. There were two PC "wide-bus" solutions:
- EISA, a super-set of ISA for backward compatibility, and
- Micro Channel (MCA).
-
- An EISA system is more flexible than an ISA (or MCA) system
- because it can accept either EISA or ISA cards. Also, EISA
- cards in EISA systems can provide higher performance than
- ISA cards in ISA (or EISA) systems. EISA has served quite
- well for several high-performance devices (e.g., Compaq's
- high performance QVision graphics card). As a tradeoff
- for the higher bandwidth, the system and the add-in cards
- for the wider I/O buses are more expensive than comparable
- ISA solutions. Because ISA can provide adequate bandwidth
- for the majority of devices, several board vendors continue
- to build more boards for ISA because of the larger available
- market.
-
- More recently, color graphics and video applications have
- become popular for PCs. In addition, high pixel
- resolution and more bits-per-pixel (for color) displays
- have become more affordable. Consequently, the bandwidth
- demands for the graphics/video devices have become too
- large for even EISA and MCA. To correct this deficiency,
- several system OEMs have moved graphics onto higher
- bandwidth proprietary local buses. As the practice of
- moving faster devices onto the local bus became common,
- there was a need for a standard. Currently, there are
- two local bus standard proposals, namely, PCI (by Intel)
- and VL-Bus (by VESA).
-
- Existing processors do not have a PCI bus and future
- processors will not directly support the VL-Bus. Besides,
- PCI and VL have limited plug-in (card) support.
- Consequently, PC systems utilizing PCI or VL may have
- multiple levels of buses (CPU bus, local bus, standard I/O
- bus) and IC bridges to interface between different bus
- protocols. All of this leads to higher system and add-in
- board costs. Chip and board vendors also need to choose
- between many different interfaces and form-factors to
- support a wide range of system types.
-
- Worse yet, both PCI and VL buses will very likely try to
- satisfy future needs, such as higher bandwidth and 3.3V
- technology, by CHANGING the physical layer. Examples
- might include: a wider (64-bit) bus once 32-bits runs out of
- steam, and a different driver/receiver technology for higher
- frequencies. When a 32-bit bus is expanded to 64-bit, a
- high-bandwidth (e.g., video) device will not gain
- performance on the 64-bit bus unless it is redesigned for
- 64-bit transfers. Protocols might also have to change and
- many of these changes will render today's PCI and VL devices
- incompatible (PCI does, however, define a compatible
- transition to 64 bits). The additional drivers/receivers and
- new designs will make the new devices more expensive. The
- trick is to
- allow the aggregate system bandwidth to go up without having
- to redesign everything.
-
- Neither PCI nor VL supports ease-of-use features (hot-
- insertion and plug-and-play). Depending on the
- implementation, PCMCIA may not support hot-swap and may not
- offer enough performance for some applications. A PCMCIA sub-committee
- was created to come up with a PCMCIA super-set called
- CardBus. As mentioned earlier, board vendors made more ISA
- cards than EISA cards for cost reasons and larger market share.
- Similarly, we are guessing that board vendors will build
- more PCMCIA cards than CardBus cards if CardBus is a
- backward-compatible super-set of PCMCIA, i.e., a wider and
- more expensive 32-bit bus.
-
- To summarize: the PC industry has tried several times to
- meet ever-higher I/O performance requirements by
- utilizing a multitude of wide, shared buses. Most solutions
- did not last long because they were not scalable and they
- were EXPENSIVE. As a result most PC users continue to buy
- cheap, reasonable-performance solutions. What the
- computer industry needs is a solution that:
- (a) solves today's problems (performance, expansion and
- ease-of-use) in a cost effective manner AND
- (b) offers a migration path for those who want it AND
- (c) is processor independent AND
- (d) will provide longevity.
-
- Our proposed solution can offer ALL of these properties!
-
-
- ************************************************************
- * *
- * To summarize the major Features: *
- * *
- ************************************************************
-
- Our proposed solution is scalable in multiple dimensions:
- performance, expandability and cost. Specifically:
-
- FOR PERFORMANCE:
- (a) the CPU-memory bandwidth is decoupled from the slower
- device latencies,
- (b) point-to-point minimizes the physical constraints and
- GTL enables higher signal rates than TTL,
-
- FOR EXPANDABILITY:
- (c) a virtually unlimited number of devices can be connected
- on-board or via connectors,
- (d) expansion can also be made in external boxes via a small
- pin-count robust protocol (e.g., a fast serial link)
- (e) hot plug-and-play is supported,
-
- FOR COST:
- (f) small packages/connectors/boards and high integration
- can be achieved because of the small number of pins to
- interface,
- (g) the number of PCB layers can be minimized because of
- "clean" layout and no long traces (even in the case of
- busing, the fanouts are fairly short),
- (h) development cost can be reduced by utilizing common
- functional blocks (e.g., state machines, FIFOs),
- (i) time-to-market can be reduced by integrating common
- parts,
- (j) common parts for a wide range of systems enable large
- volume, fewer inventory part types and consistent
- test/manufacturing techniques,
-
- FOR COMPATIBILITY:
- (k) standard I/Os such as ROM and keyboards can be tapped
- off of an MIOC or IOC,
- (l) existing standard buses (e.g., ISA, PCMCIA) can also be
- implemented off MIOC or IOC,
- (m) Existing application software and operating systems will
- be compatible with some driver updates.
-
- FOR LONGEVITY:
- (n) the same devices can be reused in future systems because
- they are processor-neutral,
- (o) the same I/O subsystem or architecture can be used in a
- family of products (portables, desktops, workstations,
- servers),
- (p) the same device can be used for different transfer rates
- without changing the physical interface, and
- (q) changing a device interface in the future on a channel
- does not affect the rest of the devices [we call this
- "damage control"].
-
- ************************************************************
- * *
- * Thanks for reviewing our thoughts! Feel free to *
- * contact us for more information, we're here to *
- * help!!! *
- * *
- * Happy Holidays from (Strat. Tech. Dev.) *
- * David Wooten, Kevin Leigh, Reynold Starnes, *
- * Thanh Tran, Chris Simonich, Brett Costly, *
- * David Murray, Craig Miller, Roger Tipley *
- * *
- ************************************************************
-