- +------------------------------------+
- + C1 Protocol Specifications +
- +------------------------------------+
-
- Inception
- ---------
- During the summer of 1981, when I first got the idea of putting up a BBS,
- I started work on a simple protocol for transferring programs to and from the
- BBS. This protocol was similar in structure to XMODEM, and had about the
- same reliability. Under good line conditions, it would give error-free
- transfers (this was to be expected). Under moderate noise conditions, the
- protocol would hold up, and would still give error-free transmissions. It
- was under poor line conditions that it, and XMODEM, would fall apart.
-
- In the summer of 1984, I started work on a very ambitious project: to
- produce a protocol that was both fast and extremely reliable, even under the
- worst of line conditions. From this work came the "C1" protocol, which is
- not a simple block/checksum affair, but a complete communication system for
- the computer.
-
- Be warned, therefore, that understanding the ins and outs of "C1" will
- not be easy, but with enough patience, there's no reason why even the least
- skilled programmer cannot be comfortable with it.
-
-
- Concepts
- --------
- The concept behind the "C1" protocol was simple: to allow two computers
- to "talk" with one another (while transferring data) in such a way that
- nothing short of a complete distortion of the transmission line could result
- in a misunderstanding. If this concept could be realized, then files could be
- transferred between computers without fear that line noise would cause a
- breakdown in the protocol, or that the received data would differ, in any
- way, from that which was sent.
-
- Nothing is perfect though, and I don't, for a minute, claim that "C1" is
- completely infallible, but I can say, with reasonable comfort, that "C1" can
- deliver bad line accuracy not found in any other microcomputer transfer
- protocol. For this accuracy though, there is a price to pay, and it is
- complexity; the protocol is extremely difficult to duplicate without a
- complete and utter understanding of the intricate workings of "C1". This
- document will attempt to give you that required understanding.
-
-
- A Simple Conversation
- ---------------------
- In first deciding how the protocol would function, I thought of how two
- people could carry on a conversation under high noise conditions, where
- misunderstanding would be the norm. The scenario I'm going to give differs
- from the protocol in that people talking have no way of verifying the
- accuracy of what they believe they have heard. What it is meant to
- demonstrate is how the two computers "talk" with one another, and discuss the
- necessary repetition, or non-repetition, of each block of data (the
- cornerstone of a checksum based transfer protocol).
-
- Ken and John are attempting to assemble a machine in the middle of a
- very noisy machine shop. John reads the instructions to Ken, who carries them
- out. Even at close proximity, the two have difficulty hearing one another,
- so they adopt a form of banter which allows each instruction to be verified
- and acknowledged. Here is how the conversation might go:
-
- John: Put part "A" in hole "D".
-
- Ken: Understood, putting part "A" in hole "D".
-
- John: Acknowledged, let me know when you are ready for the
- next instruction.
-
- Ken: Go ahead, what do I do next?
-
- John: Put screw "E" through slot "T".
-
- Ken: I didn't understand that, could you please repeat?
-
- John: Oh ok, tell me when you're ready for that instruction
- again.
-
- Ken: Ready now.
-
- The conversation continues on in this fashion, guaranteeing that both
- John and Ken are fully aware of what the other is doing. In real life,
- people wouldn't have the patience to keep up that sort of banter, but that's
- why they make more mistakes than a computer.
-
- It is just this sort of "conversation" that the two computers have
- with each other, only the language is different; the instruction is
- replaced by the block of data, and all other statements by special codes.
-
-
- Communication Codes
- -------------------
- One of the areas where simple protocols fall apart is in the
- transmission of "handshaking codes". It's called handshaking because it
- implies that the two computers are having a dialogue, rather than a monologue.
- These other protocols rely on single byte (8 bit) words for their
- communication codes, and that could spell trouble, since the likelihood of
- any one 8 bit code being transposed into another is greater than for multiple
- byte codes. For this reason, "C1" uses 3 byte (24 bit) codes which are
- sufficiently different that the likelihood of a transposition is extremely
- low. Not only that, but as you will soon learn, the method of receiving 3
- byte codes is designed around the fact that line noise strong enough to make
- the necessary transpositions would most likely send extra characters as
- well; "C1" can detect this situation and reject the code.
-
- Five distinct codes are used in the protocol: "GOO", "BAD", "ACK",
- "S/B", and "SYN". Each has its own meaning, just like any English word, and
- all are used in a specific sequence such that synchronization difficulties
- would be automatically identified and corrected.
-
-
- Checksums
- ---------
- When a block of data is sent, we must have a way of determining if it is
- correctly received or not. This is accomplished by using what is known as a
- checksum. Quite simply, a checksum is a number which is mathematically
- derived from all the bytes within the block. The receiving computer
- recalculates the sum and compares it with the sum it received along with the
- block. Theoretically, any fault in the transmitted data will result in the
- two checksums not matching; but that's theory. In reality, the accuracy of
- the checksum is based on the type of mathematical operation used to calculate
- it, and what kind of noise it encounters.
-
- The simplest way to create a checksum is to add up all the ASCII values
- of the bytes contained in the block. This is fine for many types of errors,
- but not the type which inverts a particular bit. Should the same bit be
- inverted in two bytes, turning it on in one and off in the other, the sum
- will remain the same. For example, take the following two bytes:
-
-              11010011 = 211
-         Plus 01101101 = 109
-              --------   ---
-                         320
-
- Now assume that the fourth bit from the right of both of these bytes
- becomes inverted by the line noise:
-
-              11011011 = 219
-         Plus 01100101 = 101
-              --------   ---
-                         320
-
- As you can see, the sum remains 320, even though line noise has made
- obvious changes to the bytes. A better system is one called "Cyclic
- Redundancy", which works on a somewhat different principle. The checksum is
- 16 bits long, and is created in the following fashion: each byte from the
- block is Exclusive OR'ed with the low order part of the checksum. The
- checksum is then ROTATED one bit to the left, and the procedure repeated with
- the next byte.
-
- Even this highly superior method can be tripped up, so I have combined
- BOTH an additive checksum and a Cyclic Redundancy checksum to create one very
- hard to beat 32 bit "super" checksum.
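-
- As a rough illustration of the two sums described above, here is a minimal
- Python sketch. The function names are illustrative only, and it assumes
- that "rotated" means a 16 bit circular rotate (bit 15 wraps around to bit
- 0). It follows the rotate-and-XOR scheme as described in this section, not
- a polynomial CRC.
-
```python
def additive_checksum(data: bytes) -> int:
    """16 bit sum of all byte values in the block."""
    return sum(data) & 0xFFFF


def rotating_xor_checksum(data: bytes) -> int:
    """XOR each byte into the low order part, then rotate left one bit."""
    check = 0
    for byte in data:
        check ^= byte                                     # XOR into low byte
        check = ((check << 1) | (check >> 15)) & 0xFFFF   # 16 bit rotate left
    return check


def super_checksum(data: bytes):
    """Both 16 bit sums together: the 32 bit combined check."""
    return additive_checksum(data), rotating_xor_checksum(data)


# The two byte pairs from the worked example above.
clean = bytes([0b11010011, 0b01101101])
noisy = bytes([0b11011011, 0b01100101])

print(additive_checksum(clean), additive_checksum(noisy))
print(hex(rotating_xor_checksum(clean)), hex(rotating_xor_checksum(noisy)))
```
-
- For these two pairs the additive sums agree (both 320) while the rotated
- sums differ, so the combined check catches an error that addition alone
- would miss.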
-
-
- Listening For Code Words
- ------------------------
- Although 3 byte code words are more reliable than 1 byte code words,
- nothing is perfect. It was once said that if you let an infinite number of
- monkeys bash away at typewriters for an infinite amount of time, one of them
- would eventually type "To be or not to be, that is the question". Although
- this stretches statistical probability to its limit, this kind of thing can
- easily happen on a smaller scale; the letters "GOO" could quite conceivably be
- produced by purely random line noise.
-
- To try and eliminate ALL possible errors isn't feasible, but "C1" makes
- an attempt at eliminating as many as possible. One reasonably
- safe assumption is that any noise capable of randomly producing "GOO" would
- not stop there; more likely, it would produce a string of characters,
- something like "HGOOEK". Were we to allow the protocol to listen exclusively
- for three letter combinations, it would most assuredly pick out the "GOO" in
- that string.
-
- My specifications for "C1" call for a code recognition routine which
- will ONLY make code word comparisons on the LAST 3 RECEIVED bytes. This is
- accomplished in my coding by going back and testing for further characters
- after I have identified a three byte code word. Should another byte be
- present, the identified code word is thrown away, and the search will
- continue.
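-
- The rule can be sketched in a few lines of Python. This is only an
- illustration: the name "recognize" is hypothetical, and it assumes that
- "received" holds every byte that arrived during one listening period.
-
```python
CODE_WORDS = (b"GOO", b"BAD", b"ACK", b"S/B", b"SYN")


def recognize(received: bytes):
    """Return the code word formed by the LAST three bytes, or None.

    Only the tail is compared, so a code word followed by any further
    byte is thrown away rather than picked out of the noise.
    """
    tail = received[-3:]
    return tail if tail in CODE_WORDS else None


print(recognize(b"GOO"))      # a clean code word
print(recognize(b"HGOOEK"))   # "GOO" buried in noise is rejected
```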
-
-
- Statement And Listen Loops
- -------------------------
- One immediate drawback to the system described above is that a REAL code
- word, masked within some random noise, would be rejected by the receiving
- computer. This would also be true of a code word simply damaged by noise
- (like "GOE"). For a protocol to be impervious to this sort of corruption, it
- must be capable of restating code words over and over until the receiving
- computer can understand, yet it must also have a way of knowing whether the
- receiving computer got the code word or not. This was a fact that eluded me
- when I wrote the original protocol.
-
- When we talk to other people, the cornerstone of understanding is
- recognition. If we ask "What do you think?", yet get no reply, we ask again.
- Only when we receive a reply from the person to whom we are talking do we
- continue on with our next statement. It would be pointless wasting our breath
- on someone who isn't listening.
-
- Within "C1", communication between computers is handled through a
- similar system which I call the "Statement and Listen Loop". It's quite
- simple really; when one computer has to "say" something to the other, it does
- so, then waits for a predetermined time for a known response. Should it fail
- to receive a response within that period of time the code word is said again,
- and the computer listens for the reply. This continues until the required
- response is heard.
-
- The system is further enhanced by the fact that both computers are ALWAYS
- engaged in a "Statement and Listen Loop".
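-
- A minimal Python sketch of one such loop follows. The "send" and "listen"
- callables and the DeafThenAlert test peer are hypothetical stand-ins for a
- real modem line; "listen" is assumed to return None when nothing arrives
- within the wait time.
-
```python
def statement_and_listen(send, listen, statement, expected, wait=0.5):
    """Say `statement` until one of the `expected` replies is heard.

    Returns (reply, attempts).
    """
    attempts = 0
    while True:
        send(statement)
        attempts += 1
        reply = listen(wait)
        if reply in expected:
            return reply, attempts


class DeafThenAlert:
    """Hypothetical peer that misses the first `deaf_for` statements."""

    def __init__(self, deaf_for):
        self.deaf_for = deaf_for
        self.heard = []
        self._pending = None

    def send(self, code):
        self.heard.append(code)
        answered = len(self.heard) > self.deaf_for
        self._pending = b"ACK" if answered else None

    def listen(self, wait):
        reply, self._pending = self._pending, None
        return reply          # None stands for "nothing arrived in time"


peer = DeafThenAlert(deaf_for=2)
reply, attempts = statement_and_listen(peer.send, peer.listen,
                                       b"GOO", {b"ACK"})
print(reply, attempts)
```
-
- Here the code word has to be said three times before the reply comes back.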
-
-
- Synchronization Lock
- --------------------
- That rather ominous sounding title is actually rather simple; it refers
- to a condition whereby the "Statement and Listen Loops" of each computer
- become locked together. This is analogous to two people speaking at the same
- time, over and over, such that no effective communication takes place. In
- order to guarantee that the two computers never get into this state, the wait
- times of the loops are altered slightly.
-
- Assume that the fixed wait loop time was 0.5 seconds; this is called a
- "Short" wait. We also have a "Long" wait, which would be slightly longer, say
- 0.6 seconds (actually, the delay within a "Statement and Listen Loop" is not
- particularly critical, but should be somewhere in the neighborhood of one half
- second). Each time the computer goes through a "Statement and Listen Loop"
- (SLL), a counter would determine which type of wait to use: Long or Short.
- The sequence is broken into three: the transmitting computer will use
- Long-Long-Short, while the receiving computer will use Short-Short-Long.
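-
- The two wait patterns can be sketched in Python as follows. The names are
- illustrative, and the 0.5 and 0.6 second figures are just the example
- values given above.
-
```python
from itertools import cycle, islice

SHORT, LONG = 0.5, 0.6   # seconds; example values only


def wait_times(role):
    """Endless sequence of waits for one side's loops."""
    if role == "transmit":
        pattern = (LONG, LONG, SHORT)
    else:
        pattern = (SHORT, SHORT, LONG)
    return cycle(pattern)


tx = list(islice(wait_times("transmit"), 6))
rx = list(islice(wait_times("receive"), 6))
print(tx)
print(rx)
```
-
- If both sides happen to start their counters together, they never choose
- the same wait on the same pass, so the two loops drift apart rather than
- staying locked.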
-
-
- Block Structure
- ---------------
- Each block of data contains somewhat more than just a collection of
- characters taken from a disk; it also contains a "header". The header is 7
- bytes long, and contains the following information:
-
- Byte-1: Low part of ADDITIVE checksum
- Byte-2: High part of ADDITIVE checksum
- Byte-3: Low part of CLC checksum
- Byte-4: High part of CLC checksum
- Byte-5: Size of NEXT block
- Byte-6: Low part of Block Number
- Byte-7: High part of Block Number
-
- As you remember from the section on "checksums", there are two
- distinctly different, 16 bit (2 byte) checksums. One is an additive checksum,
- composed of the mathematical sum of the CBMASCII values of all the DATA bytes
- (and bytes 5 through 7 of the header). The other checksum is calculated using
- Cyclic (CLC) Redundancy (on the same bytes). These 32 checksum bits are
- placed in the first 4 bytes of the header.
-
- The 5th byte is the length of the NEXT block. This may seem odd to
- some, but consider the difficulties in sending the size of the current block
- in that self same block. You need to know the block size to calculate the
- checksum, but you can't know for sure that the block size is correct unless
- you have verified the checksum. We call this a Catch-22. By sending the size
- of any given block in the PREVIOUS block, the size is known for a fact BEFORE
- the checksum is calculated.
-
- The 6th and 7th bytes hold the block number. This was added quite early
- on in the development of "C1" under the assumption that it would be
- necessary (as it is in XMODEM). As it turned out, "C1" uses a method of
- handshaking which makes this unnecessary. Nonetheless, my specifications
- call for its inclusion, as certain uses of the block number could be made.
- Also, the high order part of the block number (byte 7 of the header) is used
- to flag the last block.
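-
- To make the layout concrete, here is a hedged Python sketch that packs and
- checks a block. The function names are hypothetical. It assumes the
- checksums are taken over header bytes 5 through 7 followed by the data, in
- the order they are sent; the text does not state the order, so treat that
- as an assumption.
-
```python
import struct

# additive, CLC, size of next block, block number (low byte first)
HEADER = struct.Struct("<HHBH")


def checksums(covered: bytes):
    """Additive and rotate-XOR sums over the covered bytes."""
    add, clc = 0, 0
    for byte in covered:
        add = (add + byte) & 0xFFFF
        clc ^= byte
        clc = ((clc << 1) | (clc >> 15)) & 0xFFFF
    return add, clc


def build_block(data: bytes, next_size: int, block_number: int) -> bytes:
    """7 byte header followed by the data bytes."""
    covered = struct.pack("<BH", next_size, block_number) + data
    return struct.pack("<HH", *checksums(covered)) + covered


def parse_block(block: bytes):
    """Return (next_size, block_number, data), or None on a bad checksum."""
    add, clc, next_size, block_number = HEADER.unpack_from(block)
    if (add, clc) != checksums(block[4:]):
        return None
    return next_size, block_number, block[HEADER.size:]


block = build_block(b"HELLO", next_size=47, block_number=1)
damaged = block[:-1] + b"X"          # simulate one byte hit by noise
print(len(block), parse_block(block), parse_block(damaged))
```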
-
-
- Varying Block Size
- ------------------
- The reason that block size was included in the header was originally to
- allow the last block only to vary in size (one could never guarantee that the
- amount of data to be sent will divide nicely into a preset block size). It
- quickly dawned on me that "C1" was set up in such a way that ANY block size
- could be used for ANY block in the transmission.
-
- Varying block size has its advantages; under reasonably clean line
- conditions, large blocks transmit the most data with the least handshaking
- (which is mildly time consuming). Smaller blocks are superior under bad noise
- conditions, since smaller blocks run a higher chance of making it through the
- noise unscathed; and should one still fail to make it, less time is required
- to repeat a smaller block.
-
- My current implementation of "C1" allows a user to pick a fixed
- block size between 40 and 255 bytes, but in other implementations, there is no
- reason why block size couldn't be varied DURING transmission to adapt to
- CHANGING line conditions.
-
- One final thing concerning block structure is how would one presume to
- know the size of the FIRST BLOCK if that was revealed only in the block that
- came before it (quite a paradox). "C1" requires that the first block contain
- ONLY a header, which would make that block 7 bytes long. This header would
- do little more than supply the receiving computer with the size of the first REAL
- block. Accuracy of this first "dummy" block is guaranteed since it still
- must pass the checksum tests. You must make the block number for this dummy
- block "0".
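-
- The way each block announces the size of the one after it can be sketched
- as below. This is an illustration under stated assumptions: the block size
- is taken to include the 7 header bytes, real blocks are numbered from 1,
- and the final block reports a next size of 0. None of these three points
- is spelled out in the text.
-
```python
HEADER_SIZE = 7


def plan_blocks(payload_len: int, block_size: int):
    """List of (block_number, size_of_this_block, size_of_next_block)."""
    per_block = block_size - HEADER_SIZE      # data bytes in a full block
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(per_block, remaining)
        sizes.append(HEADER_SIZE + chunk)
        remaining -= chunk

    # Dummy block: header only, block number 0, announces the first size.
    plan = [(0, HEADER_SIZE, sizes[0] if sizes else 0)]
    for i, size in enumerate(sizes):
        # 0 after the last block is an assumption, not part of the spec.
        next_size = sizes[i + 1] if i + 1 < len(sizes) else 0
        plan.append((i + 1, size, next_size))
    return plan


for entry in plan_blocks(payload_len=100, block_size=47):
    print(entry)
```
-
- With 100 bytes of data and a block size of 47, the dummy block announces
- 47, and the second full block announces the shorter 27 byte block that
- ends the file.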
-
-
- Communication Syntax
- --------------------
- Now that you understand block structure, handshaking methods, and code
- word vocabulary, it comes time to find out how this all comes together.
-
- Most protocols have very simple handshaking between blocks which is easy
- to trip up, given sufficiently noisy conditions. Usually, the transmitting
- computer sends the block, then waits for a response from the receiving
- computer; either "good" or "bad". The transmitting computer then proceeds to
- send the next block (if "good") or resend the last block (if "bad"). This
- system falls apart the moment the transmitting computer receives a false
- indication of "good" or "bad" and goes on to transmit the wrong block (and
- whether the receiving computer likes it or not, it has to deal with another
- block). Should things get out of sync, and the transmitting computer sends
- the next block when it should have sent the last one again, XMODEM attempts
- to make corrections by use of the block number encoded within each block.
-
- "C1" does nothing so crude; its very communication syntax guarantees that
- neither computer will get out of phase with the other. Whereas XMODEM uses a
- single statement monolog between each block, "C1" uses a multiple part dialog.
- This makes "C1" about 3% slower than XMODEM, but this small trade-off in
- speed for accuracy will be well worth it the first time you run into trouble
- with XMODEM.
-
- XMODEM communications would look something like this:
-
- Xmit: Transmits Block
-
- Rec : "Good"
-
- Xmit: Transmits Next Block
-
- Rec : "Bad"
-
- Xmit: Transmits Same Block Again
-
- In "C1", the transmission would look something like this:
-
- Xmit: Transmits Block
-
- Rec : "Good"
-
- Xmit: Good block acknowledged
-
- Rec : Send next block for me
-
- Xmit: Transmits Next Block
-
- Rec : "Bad"
-
- Xmit: Bad block acknowledged
-
- Rec : Send that block again
-
- Xmit: Transmits Same Block Again
-
- In this type of transmission dialog, neither computer can get out of
- sync, since should it receive the opposite response to the one it expects, it
- goes back to give the correct code word for the response it DID RECEIVE, thus
- regaining proper synchronization. Couple this with the "Statement and Listen
- Loops", and you can readily see that communication would be hard to break
- down.
-
-
- Syntax Description
- ------------------
- The following diagram should give you an understanding of the flow of
- information between blocks:
-
- For a Good Block:
-
-    Xmit: [Block]         "ACK"         [Next Block]
-
-    Rec :         "GOO"         "S/B"
-
- For a Bad Block:
-
-    Xmit: [Block]         "ACK"         [Same Block]
-
-    Rec :         "BAD"         "S/B"
-
- Actually, the two are identical; the only difference is the substitution
- of either "GOO" or "BAD" as the response to the received block.
-
- Immediately after receiving the block, the receiving computer
- recalculates the checksum to determine validity of the data. In the
- meantime, the transmitting computer starts to wait for a "GOO" or "BAD"
- signal. Since it can "say" nothing until it receives one of these codes, it
- merely waits. That may sound suspiciously like a good place to "hang up" the
- protocol, but the receiving end is eventually going to finish receiving the
- block, either because it timed out waiting, or it finished collecting the
- correct number of bytes from the transmitting computer.
-
- At that time, the receiving computer sends the appropriate code word
- ("GOO" or "BAD") and begins to wait for an acknowledgement ("ACK"). If it
- doesn't receive the "ACK" in about one half second, it sends the "GOO"
- or "BAD" code word once again. Meanwhile, the transmitting computer has been
- patiently awaiting the reception of the "GOO" or "BAD" code. Once it
- receives it, it transmits an "ACK" and starts to wait for a "send block"
- signal ("S/B"). If it doesn't get the "S/B" within about one half second, it
- sends "ACK" again.
-
- Back at the receiving computer, the "ACK" it has been waiting for arrives;
- it then sends the "S/B" signal and begins to wait for the block.
- Should it receive an "ACK" while waiting for the block, or receive nothing
- at all for 5 seconds, it assumes that the transmitting computer hasn't heard
- the "S/B" and transmits it again. In the meantime, the transmitting computer
- is waiting for the "S/B", and upon reception, starts sending the block. The
- process has now started all over again.
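-
- One way to picture the transmitting side is as a small reply rule driven
- only by the code it has just heard. The Python sketch below is a guess at
- such a rule, not the author's code; "transmitter_reply" and the "state"
- dictionary are hypothetical names.
-
```python
def transmitter_reply(heard, state):
    """Pick the transmitter's next statement from the code just heard.

    `state` holds 'index' (the block most recently sent) and 'verdict'
    (the last "GOO"/"BAD" heard for that block, or None).
    """
    if heard in (b"GOO", b"BAD"):
        state["verdict"] = heard
        return b"ACK"
    if heard == b"S/B":
        if state["verdict"] == b"GOO":
            state["index"] += 1          # advance only after a good block
        state["verdict"] = None
        return ("BLOCK", state["index"])
    return None                          # anything else: keep listening


state = {"index": 0, "verdict": None}
# A repeated "GOO" and a repeated "S/B" stand in for lost replies.
heard_sequence = [b"GOO", b"GOO", b"S/B", b"BAD", b"S/B", b"S/B"]
replies = [transmitter_reply(code, state) for code in heard_sequence]
print(replies)
```
-
- Because the block index moves only when an "S/B" follows a "GOO", a
- repeated code word causes a repeated reply but never a skipped block.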
-
- A quick analysis of this system will reveal that it's damned near
- impossible to get any type of noise which could possibly mimic the code
- sequences required. Also, no noise could stop the eventual completion of the
- above sequence, since each computer is always "sending and waiting". If two
- people keep repeating their sentences over and over, and continue to listen
- to the other person, even a noisy room couldn't stop them from hearing one
- another EVENTUALLY.
-
- Of course, some line noise is just so horrendous, that even this method
- of communication could fail. Then again, this type of noise would make it
- damned near impossible for the user to be online in the first place, so it
- can be considered an unlikely event. But, should one of the computers go
- off-line for any reason, we wouldn't want the other computer to keep looping
- and looping until it dies of old age. Although I haven't built such
- protection into the terminal program I distribute in the public domain, my
- BBS program does have abort code. Should the protocol on the BBS have to
- go through the "Statement and Listen Loop" more than 12 times in a row (which
- is highly unlikely if the other computer is still online), it will abort the
- transfer. Similar code should be used in your implementation.
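-
- A loop with such a limit might look like the following Python sketch. The
- names are hypothetical, and the "dead line" used to exercise it is simply
- a listener that never hears anything.
-
```python
class TransferAborted(Exception):
    """Raised when the peer stays silent for too many loops in a row."""


def statement_and_listen(send, listen, statement, expected,
                         wait=0.5, max_loops=12):
    """Say `statement` up to `max_loops` times, then give up."""
    for _ in range(max_loops):
        send(statement)
        reply = listen(wait)
        if reply in expected:
            return reply
    raise TransferAborted(
        f"no reply to {statement!r} after {max_loops} loops")


sent = []
try:
    statement_and_listen(sent.append, lambda wait: None,
                         b"GOO", {b"ACK"})
    aborted = False
except TransferAborted:
    aborted = True
print(aborted, len(sent))
```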
-
-
- The End-Off Situation
- ---------------------
- When the final block is transmitted, the high order part of the block
- number should be made HEX "FF" (255 decimal). This will inform the receiving
- computer that this is the last block of data, and to expect no more. The
- question now arises; how can both computers be 100% sure that the other is
- fully aware of the file completion? A fair question, but not one with a
- simple answer.
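-
- Setting and testing the last block flag is a matter of a few bit
- operations, sketched here in Python. The function names are illustrative,
- and keeping the low order part of the block number unchanged is an
- assumption.
-
```python
LAST_BLOCK_FLAG = 0xFF


def mark_last(block_number: int) -> int:
    """Force the high order part of a 16 bit block number to hex FF."""
    return (LAST_BLOCK_FLAG << 8) | (block_number & 0xFF)


def is_last(block_number: int) -> bool:
    """True when the high order part of the block number is hex FF."""
    return ((block_number >> 8) & 0xFF) == LAST_BLOCK_FLAG


final = mark_last(5)
print(hex(final), is_last(final), is_last(5))
```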
-
- When the transmitting computer receives the "GOO" for the last block, it
- can be fairly certain that the receiving computer has received the final
- block, but it must inform the receiving computer that it knows this. It does
- so by sending an "ACK", but cannot be sure the receiving computer has
- received the "ACK" unless it gets the "S/B" signal back. Now, the
- transmitting computer must acknowledge the reception of the "S/B", but under
- the normal communications syntax, it would now have to send a block.
-
- This is where the "End-Off" syntax comes into play; after receiving the
- "S/B", the transmitting computer sends back a "SYN" signal. In response to
- that, the receiving computer sends its own "SYN" signal, then waits for the
- final "S/B" from the transmitting computer. Since it will not be responding
- to this code, it simply goes into a wait cycle for approximately 5 seconds.
- If it does get the "S/B" within that 5 seconds, it ends immediately, but
- otherwise doesn't really care if it receives the code or not since at this
- stage, there is 100% assurance of both computers knowing things are OK.
-
- The transmitting computer need only send copies of the "S/B" code at
- this point, since, as stated above, there is full assurance that both
- computers are finished. NOTE that the codes chosen for the End-Off situation
- are not necessarily related to their apparent function.
-
-
- Transferring File Type
- ----------------------
- When transferring files from one computer to another it is often
- necessary to also transfer the file type, but this must be known BEFORE the
- file is opened, and, therefore, before the protocol begins. "C1" does not
- impose any strict rules on what sort of information you transfer about the
- files, if any, but when writing a terminal program to communicate with one of
- my bulletin boards, the following should be done:
-
- Using a full implementation of the "C1" protocol (first dummy block,
- data block, and End-Off), transmit a single byte of data corresponding to the
- following file types:
-
- 1 = Program File
- 2 = SEQ file
- 3 = WordPro File
-
- Transmitting this single piece of data would require that TWO blocks be
- sent; the initial dummy block to set up the size of the first data block (of
- which there will be only one, size 8), and the data block itself, consisting
- of 7 header bytes and the single file type byte.
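-
- The two blocks of this small transmission could be built as in the Python
- sketch below. Several details are assumptions made for illustration: the
- checksums cover header bytes 5 through 7 followed by the data, the single
- data block is numbered 1 with hex FF in its high order part, and it
- reports a next size of 0.
-
```python
import struct

FILE_TYPES = {"program": 1, "seq": 2, "wordpro": 3}


def build_block(data: bytes, next_size: int, block_number: int) -> bytes:
    """7 byte header (two checksums, next size, block number) plus data."""
    covered = struct.pack("<BH", next_size, block_number) + data
    add, clc = 0, 0
    for byte in covered:
        add = (add + byte) & 0xFFFF
        clc ^= byte
        clc = ((clc << 1) | (clc >> 15)) & 0xFFFF
    return struct.pack("<HH", add, clc) + covered


def file_type_transmission(kind: str):
    """Return the dummy block and the single data block."""
    data = bytes([FILE_TYPES[kind]])
    dummy = build_block(b"", next_size=7 + len(data), block_number=0)
    only = build_block(data, next_size=0, block_number=0xFF01)
    return dummy, only


dummy, only = file_type_transmission("seq")
print(len(dummy), dummy[4], len(only), only[7])
```
-
- The dummy block is 7 bytes long and announces a size of 8; the data block
- is 8 bytes long and ends with the file type byte.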
-
- For other applications, one could conceivably transfer much more
- information, including file name, file type, computer type, etc. It could
- even be possible to transfer multiple files, specifying the number and name
- of each file in this first transmission. Alternately, no one said you HAVE to
- use this first separate transmission; if no information other than the file
- needs to be transmitted, you just send the file and nothing more.
-
- -----------------------------------------------------------------
- NOTE:
- For more information contact:
-
- Steve Punter
- 442 Forest Fire Lane
- Mississauga, Ontario
- Canada L4W 3P4
-
- Telephone (416) 896-1446
-
- Business Hours; voice only:
- 12AM - 5PM, Monday to Friday
-
- Bulletin Board Hours; modem only:
- 6PM - 10AM, Monday to Friday
- 24 hours Saturday and Sunday
-
- -----------------------------------------------------------------
- Steve Punter has released the source code to the Public Domain!
-
- George Murton - Sysop
- The KEYSTONE C-64 BBS (c)
- PunterNet Node #26
- (215) 770-0774
- 24 hours everyday
- 300/1200 BPS
- =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=