InfoMagic Standards 1993 July

Text File | 1992-07-19 | 24.7 KB | 532 lines

Date: Mon, 15 Jun 1992 00:24 -0800 (PST)
From: DBG@SLACVM.SLAC.Stanford.EDU
Subject: Block read 'simplification' is dangerous
To: p1394@Sun.COM

peter.bartlett@ftcollins.ncr.com said on Thu, 11 Jun 92 10:53:57 MDT :

>Subject: Block read requests
>I'd like to propose a minor change to the P1394 transaction layer to make
>implementations a bit easier.
>
>Currently, when a P1394 device sends a block read request packet, the data
>length encoded in the packet cannot exceed the maximum supported by the
>responder. For example, if my device supports 128 bytes of data and I'm
>talking to a device which only supports 33, I have to break up my request
>for 128 bytes into, say, three 33-byte requests and a 29-byte request.
>This makes implementation in hardware or software slower and more complex.
>
>It would be nice if I could just request all 128 bytes at once, and accept
>the partial responses as they arrive. Not only does this make the process
>of requesting data much simpler, it reduces bus traffic by eliminating some
>read request packets altogether.
>
>...
>All of this makes long block reads much easier, but it doesn't do anything
>for block writes. A device still has to know the maximum block length of
>the device it is writing to. If a way around this could be found as well,
>life would become almost too simple.

This is one of those things that seems like a good idea at first blush,
but has hidden pitfalls that will haunt you later.

It breaks the request/response pairing that is assumed in present
protocols. At first it seems like one could fix that by making the last
transfer of a series comprising a block special, so one could recognize
that one as the end, and pair that one against the request for timeout
purposes.

However, SerialBus is supposed to be a 1212-compliant bus, easily
interfaceable to other 1212-compliant buses. One implication of this
general interconnection is that the order of packet delivery is not
guaranteed.
Thus the 'final' response piece might arrive before an intermediate
piece, so an accounting system is required to keep track of the pieces.
A new response sub-sequence number will be needed at a minimum, with
hardware to account for the pieces as they arrive.

Fixing the out-of-order delivery problem is not a reasonable solution, as
it places very costly constraints on the packet delivery system.

If there is still a strong desire for this feature after thinking it over
again, we will have to go into the problems in some detail and add the
necessary support to the packet formats and protocols. It will have to
lower the interface costs a lot to justify the effort that will be
required.

Dave Gustavson

======================================================================

From: pete bartlett <sybase!nike!bartlett@Eng.Sun.COM>
Subject: More dangerous commentary
To: p1394 reflector <p1394@Sun.COM>
Date: Tue, 16 Jun 92 9:02:01 MDT

I'm not convinced that ensuring packet delivery order across
1212-compliant buses is as complex as has been suggested.

Consider a P1394 bus attached via a bridge to another 1212-type bus. As
currently defined, each P1394 device must know the maximum packet size
which can be accepted by every device it might wish to talk to on the
other bus. It could determine these following each bus reset by polling
all devices. This seems like a rather lengthy process, especially when a
large number of buses are interconnected.

Moreover, the requesting device must be aware of packet size limitations
of the bridge device itself. If we have devices on either side of the
bridge which can handle 512-byte packets, but the bridge only supports
128 bytes, the requester can only ask for 128 bytes at a time.

In this scheme, ordering is ensured by attaching a 6-bit transaction
label to each request/response packet pair. Thus, responses can be easily
matched to the corresponding request.
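
The label matching described above can be sketched as follows. This is
only an illustration, not part of the original message: the class name,
method names, and the lowest-free-label allocation policy are all
hypothetical, and only the 6-bit label width comes from the text. A real
node would presumably also have to distinguish responders by node
address, which this sketch ignores.

```python
# Hypothetical sketch: names and allocation policy are illustrative,
# not taken from the P1394 draft. Only the 6-bit width is from the text.
LABEL_BITS = 6
MAX_LABELS = 1 << LABEL_BITS  # 64 distinct transaction labels


class OutstandingRequests:
    """Track requests awaiting a response, keyed by transaction label."""

    def __init__(self):
        self._pending = {}  # label -> description of the request

    def issue(self, request):
        """Assign the lowest free label to a request and return the label."""
        for label in range(MAX_LABELS):
            if label not in self._pending:
                self._pending[label] = request
                return label
        raise RuntimeError("all 64 transaction labels are in use")

    def complete(self, label):
        """Match a response to its request; the label becomes free again."""
        if label not in self._pending:
            raise KeyError(f"no outstanding request with label {label}")
        return self._pending.pop(label)


table = OutstandingRequests()
first = table.issue("read 128 bytes at 0x1000")
second = table.issue("read 128 bytes at 0x1080")

# Responses may come back in either order; the label identifies the pair.
matched_second = table.complete(second)
matched_first = table.complete(first)
```

Because each response carries the label of its request, the requester
does not depend on arrival order to pair them, which is the property the
paragraph above relies on.
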
As an alternative, the bridge device itself could perform the packet size
translation. In the situation described above, the requester would ask
for 512 (or more) bytes of data. This would be split by the bridge into
128-byte maximum requests on the other bus. Ordering is ensured by the
fact that these requests are made one at a time, and therefore get
returned as partial responses to the requester in order. All this uses a
simple scheme similar to that described in my last, much-maligned EMAIL.

This method eliminates the requirement that P1394 devices have knowledge
of the packet size hardware constraints of any device they read from.

In any case, I'm not insisting that we do this. My real purpose was
simply to see whether anyone out there could solve the analogous problem
with the write case. Unless this is accomplished, the read solution only
adds complexity.

The only possibility I can see is to rework the way writes are done. The
block write would become a "no data" packet, and the responder would come
back with something that looks much like a read request. Since the
requester has the data for the write ready, the responder would get a
unified transaction. Thus the requester would not need to know how many
bytes it could send the responder in a single packet, since the responder
would control the data flow.

Again, this probably creates more problems than it solves, and I'm
withdrawing my request to change the P1394 transaction layer. If anyone
would like to continue working on this, let me know.

Pete Bartlett
NCR Corp.

======================================================================

Date: 16 Jun 92 09:44:51 U
From: "Michael Teener" <michael_teener@gateway.qm.apple.com>
Subject: Re: Multi-bus P1394 systems
To: p1394@scsi.Eng.Sun.COM

Message Subject: RE>Multi-bus P1394 systems

About Ken's questions:

>OK, lets see if I understand Mike Teener's and Dave James's assertion that
>arbitration does not cross bus boundaries.
>This would mean that each bus goes
>through its own local arbitration sequence independently of other busses.
>Somebody wins arbitration and sends out a packet with an address--containing
>the bus number, among other things. Somewhere there is a bridge that connects
>the two busses. The bridge watches for addresses on one bus that belong to
>another bus and forwards packets accordingly.
>Question: does the bridge become a responder on the originating bus, or is
>the ultimate target the only responder?

The ultimate target is the only responder *at the transaction layer*. At
the link layer, all intermediate bridges respond with a "pending"
acknowledge (or a link-layer error such as "busy").

>Question-2: how do the nodes on one bridge learn about the existence of
>devices on the other bus? Should the bridge announce the existence of all
>the nodes on the other bus at one time(arbitrate once), or should it announce
>them one node per arbitration?

This *is* the problem in multi-bus systems: management. There is supposed
to be a study group started up some time this year to work on IEEE
1212-based bus bridging mechanisms. Are you volunteering? Dan O'Connor
(doconnor@apple.com) was at one time interested in leading the effort,
but now he's running off to grad school to add more initials to his name.

>Question-3: Should the busses be bridged together in daisy-chain fashion, or
>should every bus have a bridge to every other bus?

See above. As a personal interest, I would like to ensure that bus
bridges are really defined as "half-bridges" that can be connected
together using an arbitrary communication mechanism, and the bridging
work be scalable to a variety of latencies from LAN-like down to direct
hardware full bridges.
>Ken Stewart

Mike Teener

======================================================================

From sybase!ncrcom!nike.ftcollins.ncr.com!bartlett@Eng.Sun.COM Fri Jun 26 13:51:59 1992
Status: RO

As discussed in the Minneapolis meeting, SGS/Inmos' DS-link signalling
mechanism provides several advantages over a simple data/clock scheme.

Instead of clock and data signals (each a differential pair in P1394),
DS-links have "Data" and "Strobe" signals. When transmitting unencoded
data, the D signal contains the same information as the old scheme. The S
signal, however, only changes when the D does not. Therefore one and only
one signal is changing every bit time (as opposed to one or two with the
data/clock.) Also, signal changes are a full bit time apart, where in the
old scheme the data changed half a bit time before and after changes on
the clock.

As I said, DS-links have several advantages over the data/clock scheme.
Two of these are particularly relevant to P1394.

First, the skew tolerance on the bus is doubled. Instead of having a
half-clock of data setup and a half-clock of hold, we have two signals
which change a full clock apart. They can still be decoded correctly even
if they skew more than half a clock toward or away from each other.

Second, the physical interface chips can have an internal clock running
at only half the old rate. This is a direct result of the fact that
signals on the bus need only change every bit time, rather than twice per
bit time. I know some of the working group members couldn't care less
about this (since it's a chip issue) but the fact is it will allow us to
create devices which implement the 200 and 400 Mbit/sec rates at a much
earlier date.

Other advantages (which I'm not quite as concerned with) include fully
autobaud operation, and dynamic skew adjustment.

As a chip designer, I can confidently state that the cost of devices
using DS-links signalling will be no greater than if they used the
data/clock method.
The gate-level changes are quite minor. The only cost issue remaining is
the SGS licensing fee.

I think it has been reasonably well established that, for unencoded data,
the DS-links signalling represents a significant improvement over the old
data/clock method. This is comparing the proverbial apples to other
proverbial apples. I think there was some confusion at the meeting
because some members were instead comparing apples (unencoded DS-links)
to oranges (encoded clock/data), and therefore finding disadvantages in
the DS-links scheme.

In order to compare oranges to oranges, we must examine if the data on
the DS-links can be encoded to provide the same advantages as on the old
method. These advantages are DC balance and embedded clocking.

I'm no expert on data encoding; in fact I'm not familiar with the
specifics of IBM's 8B/10B code, so I'll try to write in very general
terms.

A code such as 8B/10B takes logical data (8 bits per byte) and encodes it
into some other form (usually a greater number of bits) in order to
achieve some advantage when transferring or storing the data. The
advantages we're looking for in P1394 are DC balance and embedded
clocking, as well as some increased reliability through detection of
"bad" codes. I believe an RLL code (as I think 8B/10B is) also reduces
ISI by limiting the range of frequencies on the data transmission
signals.

In order to provide DC balance on a simple data/clock bus, the number of
physical '1' and '0' symbols must be equalized over time. This is usually
done by monitoring the disparity, and switching to alternate codes for
some logical symbols. 8B/10B does so on a sub-byte basis, since it is
actually a 5B/6B and 3B/4B code combined.

With DS-links, we have two signal pairs which need DC balance. The key to
providing this is to slightly shift the way we look at the D and S
signals. As I have said, the D signal carries the data, and the S changes
whenever the data does not.
An alternate way to view the D signal is that it changes whenever the
data changes. The way to achieve DC balance is therefore to equalize the
number of changes and "no-changes" in the physical data stream.

I haven't proven it yet, but I think a DS-links code can be directly
derived from any DC-balanced data/clock code. The chip logic to do so is
very simple, or even non-existent. That is, for the DS-links it's a
natural. My purpose in describing this here is to start a discussion
amongst those more adept at encoding than I.

Consider a 10-bit encoding of "1010101010" which is innately DC-balanced.
As raw data on DS links, D would carry "1010101010" while S would see
"0000000000". Obviously, this is not DC-balanced. However, by converting
each '1' to a "change" and each '0' to a "no change" the D signal becomes
"1100110011" and the S is "0110011001" which are both DC-balanced.

If this principle carries over to all the codes in, say, 8B/10B, the
output of the existing encoder could be attached directly to a DS-links
transmitter which interprets a '1' as "change the D signal" and a '0' as
"change the S signal".

As far as embedded clocking goes, each of the D and S signals carries all
the data. The other is only needed to avoid requiring a PLL. If the S
signal is removed, the original signal can easily be recovered from the D
signal alone. A string of alternating '1' and '0' can be used to lock the
PLL at the beginning of transmission.

One concern at this point is the effect of such a translation on run
length. If we stick with 8B/10B (which has some limitations on the number
of '1' or '0' symbols in a row so the PLL can stay locked) we must be
sure these limits are not violated by the new code. At this point, I
don't think we should rule out new (and patentable!) codes. Whatever we
do, however, it must be done *soon*.

In any case, I hope this gives the encoding experts out there something
to start with. I'll be looking at it myself in more detail over the next
few weeks.
Please send any findings to the reflector. If we can pull this off, I
think we will significantly extend the life of 1394 as a standard in
addition to increasing early acceptance through the availability of
higher-speed devices.

Pete Bartlett
NCR Corp.
(719) 596-5795

======================================================================

Date: 22 Jun 1992 10:57:43 U
From: "David James" <David_James@bbcomm.apple.com>
Subject: Re: Comments on Snively and
To: "Reflector SerialBus" <P1394@Sun.COM>

SUBJECT: RE>Comments on Snively and Gus...

Ralph (and other observers),

** In response to your notes, my responding comments are following the **.

As an employee of the DoD I am very concerned about SECURITY. The
military has to segregate systems so that no task can access data for
which it is not explicitly permitted. The military is also required to
segregate tasks so that one task CANNOT interfere with another task.

** Within one workstation, I would have a hard time seeing how this
** is a major concern within the commercial environment. However,
** if SCI and/or FC is used to connect workstations, I firmly agree.
** I don't want others accessing and/or modifying data other than
** the address ranges allocated to them. I would expect that this
** would also be a concern of high-end systems providers, who
** use SerialBus to connect to their peripherals.

I am very concerned that system interconnect standards provide a means by
which one processor or I/O device, or better yet one task, can protect
himself from all others. I am somewhat alarmed at Dave's statement that
"the chance of a random hit is small". DoD security requirements demand
better protection than "chances of random hits". I urge all to very
seriously consider ways of limiting access to not only lessen the chances
of failure but to make things bulletproof.

** SCI and SerialBus interfaces are unlikely to provide remote systems with complete
** access to workstation memory. Rather, some form of per-page address protection is
** expected.
** Thus, the OS within one workstation could selectively allow access to
** certain pages while excluding accesses to others.
**
** If desired, only encrypted data could be transferred to these pages, and a
** decoding processor could be interrupted to decrypt and/or authenticate the data
** before it's used. Thus, the authentication procedures could behave similarly to
** network protocols. Having a memory-mapped interface to transfer the bulk of data
** without processor-network-software interference (on a per-packet basis) improves
** the efficiency of the data-transfer, while allowing software to perform coding
** and authentication checks before the data is actually used.
**
** Hope to see you in Minneapolis, where we hope to have more of these useful
** FC/SerialBus/SCI discussions. I think that SerialBus and SCI are forcing changes in
** our thought processes, since they allow us to extend the use of "backplane"
** protocols beyond their normal (<1 meter) physical distance limitations.
**
** Dave James
** Apple Computer
** dvj@apple.com

Thanks
Ralph Lachenmaier
Naval Air Warfare Center
Warminster, Pa

======================================================================

The following is to conclude my action item from the last SerialBus
meeting. We modified the priority protocols proposal and confirmed the
forward-progress proposal. The proposed resolution of these follows.

I modified the priority proposal based on Ed Gardner's inputs. They are
simpler, but have fewer priority levels.

I believe there was no need to change the basics of the
fairness-protocols presentation. However, I have included the minor
changes resulting from Jun92 meeting discussions.

===========================================================
Title: Proposal for SerialBus priority protocols.
Author: Dave James, Apple Computer
Status: Modified June 92 meeting.

When a transaction is generated, the requester initializes a 1-bit
priority field (let's call this bit pri) in the request-packet header.
The pri field remains unchanged as the request packet is routed to the
responder. The response packet also has a 1-bit pri field, which is
initialized by the responder. The value of the pri field in the response
and the request are expected to be the same.

For the requester and the intermediate bridges, this field determines the
priority of the request packet. For the responder and the returning
bridges, this field determines the priority of the response packet. In
both cases, the sending node (as opposed to the acknowledging) is (for
the sake of this discussion) called the producer.

When sending a packet, the producer uses the pri field to determine when
the packet may be sent, in terms of the number of fair and priority
packets that have already been sent in the fairness interval. A fair
packet is one in which pri is 0; a priority packet is one in which pri
is 1.

A producer may send only one fair packet in each interval. A producer may
send more than one priority packet in each interval, with the constraint
that the number of priority packets (Np) is related to the number of
previously sent fair packets in the same interval (Nf), as follows:

    Np <= 3 + 3*(Nf)

This ensures that priority packets are able to get most (3/4) of the
bandwidth.

If a prioritized packet passes through a bridge, and the bridge supports
priority, the bridge would behave similarly when forwarding the
transaction to an adjacent bus. Bridges are encouraged, not required, to
support priority arbitration.

Note that setting of the pri bit within packets is done based on task,
not node priority. For example, the priority is expected to be set by the
I/O driver when a DMA command chain is generated, and could be different
for each DMA chain which is processed. Since the location of the pri bit
(within a CSR or DMA command-chain) is I/O-unit dependent, it need not be
visible in any of the node's CSRs.

This has been significantly simplified since the last proposal, based on
inputs from Ed Gardner.
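
The send rule above can be sketched as a per-producer counter. This is
an illustration only: the class and method names are hypothetical, and
it assumes that Np counts the packet about to be sent and that both
counters are cleared when a new fairness interval begins. Neither
detail is spelled out in the proposal.

```python
# Hypothetical sketch of the per-interval send rule Np <= 3 + 3*Nf.
# Names and the counting convention are assumptions, not from the proposal.
PRIORITY_BASE = 3          # priority packets allowed before any fair packet
PRIORITY_PER_FAIR = 3      # extra priority packets allowed per fair packet sent
MAX_FAIR_PER_INTERVAL = 1  # "only one fair packet in each interval"


class ProducerInterval:
    """Counts packets a single producer has sent in the current interval."""

    def __init__(self):
        self.nf = 0  # fair packets (pri == 0) sent this interval
        self.np = 0  # priority packets (pri == 1) sent this interval

    def may_send(self, pri):
        """Return True if a packet with this pri bit may be sent now."""
        if pri == 0:
            return self.nf < MAX_FAIR_PER_INTERVAL
        # Assumption: Np includes the packet that is about to be sent.
        return self.np + 1 <= PRIORITY_BASE + PRIORITY_PER_FAIR * self.nf

    def send(self, pri):
        """Record a send if the rule allows it; return whether it was sent."""
        if not self.may_send(pri):
            return False
        if pri == 0:
            self.nf += 1
        else:
            self.np += 1
        return True

    def new_interval(self):
        """Clear both counters at the start of a fairness interval."""
        self.nf = 0
        self.np = 0


producer = ProducerInterval()
before_fair = [producer.send(1) for _ in range(4)]  # fourth is refused
fair_sent = producer.send(0)                        # the one fair packet
after_fair = [producer.send(1) for _ in range(4)]   # three more, then refused
```

Under these assumptions a producer gets three priority packets before
its fair packet and three more after it in the same interval.
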
In particular, the multiple priority bits were reduced to one (keep it
simple), and the dynamically-changing bit (changes after one packet is
sent) was eliminated.

===========================================================
Title: Forward Progress Protocols.
Author: Dave James, Apple Computer
Status: Action item from May 92 meeting.

The fair-acceptance protocols are critical for memory-mapped resources
that are shared and may become heavily loaded. An example is a semaphore
location, which has one processor spinning on it (waiting for the
semaphore release) and another processor trying to release the semaphore.
In the absence of fair-acceptance protocols, the releasing processor
might always be busied when its requests are retried (due to conflicts
with the other polling processors).

Fair-acceptance protocols are also critical for split-response buses,
where the detection of transmission errors is based on not receiving the
response within a pre-specified interval (as specified by the
1212-defined SPLIT_TIMEOUT register). Without fairness, any timeout value
may be exceeded under transient heavy-load conditions.

Random retry protocols do not eliminate these false timeouts, they only
reduce the frequency at which they occur. On a multiple-bus system, these
false timeouts are hard to differentiate from real "hardware-error"
timeouts, making recovery from such timeouts a non-trivial task.

The typical fair-acceptance solution proposed by the hardware designers
(punt the problem to the I/O driver) doesn't work well in many
environments and is not appreciated by the I/O driver writers (who were,
after all, promised a reliable memory-mapped programming model). Since
I/O driver software has now become the limiting factor in many
I/O-product integration schedules, their needs are becoming increasingly
important.

However, forward progress acceptance protocols are not always needed.
Forward progress is not a concern when using a simple (normally
slave-only) SerialBus power-supply monitor, since that's not likely to be
actively shared by multiple processors.

Forward progress is needed on memories and general-purpose bridges, since
that bounds the value that should be placed in the SPLIT_TIMEOUT
register. However, we don't really expect to have many SerialBus memories
and many of the initial bridges are likely to connect to higher-bandwidth
vendor-specific buses (which may never need to assert busy).

Thus, there are likely to be applications that choose to not support
forward progress protocols. To avoid compatibility conflicts, the
forward-progress protocols should support the use of non-supportive
producers and consumers.

As background, the forward-progress protocols on SCI (from which the
SerialBus protocols were derived) can be bypassed. This supports the
sending of new packets (which have not been previously busied) while the
oldest previously-busied packet (which has obtained an acceptance
reservation in its corresponding consumer node) is being retried. The
same kind of "bypass" protocols can be used on SerialBus, by nodes which
choose to not support the fair-acceptance protocols.

Thus, I would propose that producers support four subaction-send phases.
The NOTRY phase would always be used by producers that never support
fair-acceptance and/or priority producers that (before being busied)
temporarily bypass the fair-acceptance protocols. This and the other
producer phases are listed below:

    NOTRY
    DOTRY
    RETRY_A
    RETRY_B

On supportive producers, the NOTRY or DOTRY phase may be used on the
first packet transmission. Thereafter, the retry-phase would be the same
as that returned in the acknowledge. Supportive nodes would be required
to retry each RETRY_A or RETRY_B transmission in every fairness interval.
Similarly, the consumer of these subactions returns the following status
codes:

    DOTRY
    RETRY_A
    RETRY_B

Where the phase indicates the desired producer-phase to be used in the
next retransmission. An unsupportive consumer would always return a DOTRY
phase. The phase returned by a supportive consumer is defined by the
four-state diagram previously presented (SERVE_A, SERVE_NA, etc.).

An unsupportive producer always sends its subactions using the NOTRY
phase. An unsupportive consumer always acknowledges with a DOTRY phase
status.

Another point that should be clarified is that any requester/responder
producer should keep two fairness bits (fb), one for requests and one for
responses, to avoid deadlock. Thus, these nodes may be transmitting
and/or retransmitting two packets (one request and one response) during
each fairness interval.

For each active reservation, the previously-busied packet shall be
retried once and only once during each transaction interval. This applies
to prioritized as well as fair transactions.

Another question was "how long should a node continue to retry, after
being busied". My response was that the answer should be based on
wall-clock time (for 10 mSec, for example) rather than a count of
retries. That seemed to be OK, except it was different than the first
planned implementations.

Ed Gardner noted that two times might be needed; a SPLIT_TIMEOUT
indicates how long to wait for responses and a BUSY_TIMEOUT would
indicate how long to continue retrying. One option would be to default
the BUSY_TIMEOUT to the SPLIT_TIMEOUT value, if the BUSY_TIMEOUT register
were not provided. We agreed to continue this discussion on the
reflector.
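
The producer-side phase selection and the timeout default described
above can be sketched as follows. This is an illustration only: the
function names are hypothetical, the supportive consumer's four-state
diagram is not part of this text and is not modeled, and timeout values
are treated as plain numbers in unspecified units.

```python
from enum import Enum


class Phase(Enum):
    NOTRY = "NOTRY"
    DOTRY = "DOTRY"
    RETRY_A = "RETRY_A"
    RETRY_B = "RETRY_B"


# Phases allowed on a first transmission, and phases a consumer may return.
FIRST_SEND_PHASES = (Phase.NOTRY, Phase.DOTRY)
ACK_PHASES = (Phase.DOTRY, Phase.RETRY_A, Phase.RETRY_B)


def next_send_phase(supportive, ack_phase=None, first_phase=Phase.DOTRY):
    """Phase a producer uses for its next transmission of one packet.

    ack_phase is None on the first transmission; afterwards it is the
    phase the consumer returned when it busied the packet.
    """
    if not supportive:
        return Phase.NOTRY  # unsupportive producers always send NOTRY
    if ack_phase is None:
        if first_phase not in FIRST_SEND_PHASES:
            raise ValueError("first transmission must use NOTRY or DOTRY")
        return first_phase
    if ack_phase not in ACK_PHASES:
        raise ValueError("consumers return only DOTRY, RETRY_A or RETRY_B")
    return ack_phase  # retry with the phase returned in the acknowledge


def unsupportive_consumer_ack():
    """An unsupportive consumer always acknowledges with DOTRY."""
    return Phase.DOTRY


def effective_busy_timeout(split_timeout, busy_timeout=None):
    """Fall back to SPLIT_TIMEOUT when no BUSY_TIMEOUT register exists."""
    return split_timeout if busy_timeout is None else busy_timeout


first = next_send_phase(supportive=True)
retry = next_send_phase(supportive=True, ack_phase=Phase.RETRY_A)
legacy = next_send_phase(supportive=False, ack_phase=Phase.RETRY_B)
```

The fallback in effective_busy_timeout mirrors only the option floated
above; the message leaves the final choice open for further discussion.
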