- ==Phrack Inc.==
-
- Volume Three, Issue 29, File #3 of 12
-
- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
- <> <>
- <> Introduction to the Internet Protocols <>
- <> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <>
- <> Chapter Nine Of The Future Transcendent Saga <>
- <> <>
- <> Part Two of Two Files <>
- <> <>
- <> Presented by Knight Lightning <>
- <> September 27, 1989 <>
- <> <>
- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
-
-
- Prologue - Part Two
- ~~~~~~~~
- A great deal of the material in this file comes from "Introduction to the
- Internet Protocols" by Charles L. Hedrick of Rutgers University. That material
- is copyrighted and is used in this file by permission. The passage of time
- and changes in the wide area networks have made it necessary for some details
- of the file to be updated, and in some cases reworded, for better
- understanding by our readers. Also, Unix is a trademark of AT&T Technologies,
- Inc. -- Again, just thought I'd let you know.
-
- Table of Contents - Part Two
- ~~~~~~~~~~~~~~~~~
- * Introduction - Part Two
- * Well-Known Sockets And The Applications Layer
- * Protocols Other Than TCP: UDP and ICMP
- * Keeping Track Of Names And Information: The Domain System
- * Routing
- * Details About Internet Addresses: Subnets And Broadcasting
- * Datagram Fragmentation And Reassembly
- * Ethernet Encapsulation: ARP
- * Getting More Information
-
-
- Introduction - Part Two
- ~~~~~~~~~~~~
- This article is a brief introduction to TCP/IP, followed by suggestions on
- what to read for more information. This is not intended to be a complete
- description, but it can give you a reasonable idea of the capabilities of the
- protocols. However, if you need to know any details of the technology, you
- will want to read the standards yourself.
-
- Throughout this file, you will find references to the standards, in the form of
- "RFC" (Request For Comments) or "IEN" (Internet Experiment Notes) numbers --
- these are document numbers. The final section (Getting More Information)
- explains how you can get copies of those standards.
-
-
- Well-Known Sockets And The Applications Layer
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- In part one of this series, I described how a stream of data is broken up into
- datagrams, sent to another computer, and put back together. However something
- more is needed in order to accomplish anything useful. There has to be a way
- for you to open a connection to a specified computer, log into it, tell it what
- file you want, and control the transmission of the file. (If you have a
- different application in mind, e.g. computer mail, some analogous protocol is
- needed.) This is done by "application protocols." The application protocols
- run "on top" of TCP/IP. That is, when they want to send a message, they give
- the message to TCP. TCP makes sure it gets delivered to the other end.
- Because TCP and IP take care of all the networking details, the applications
- protocols can treat a network connection as if it were a simple byte stream,
- like a terminal or phone line.
-
- Before going into more details about applications programs, we have to describe
- how you find an application. Suppose you want to send a file to a computer
- whose Internet address is 128.6.4.7. To start the process, you need more than
- just the Internet address. You have to connect to the FTP server at the other
- end. In general, network programs are specialized for a specific set of tasks.
- Most systems have separate programs to handle file transfers, remote terminal
- logins, mail, etc. When you connect to 128.6.4.7, you have to specify that you
- want to talk to the FTP server. This is done by having "well-known sockets"
- for each server. Recall that TCP uses port numbers to keep track of individual
- conversations. User programs normally use more or less random port numbers.
- However specific port numbers are assigned to the programs that sit waiting for
- requests. For example, if you want to send a file, you will start a program
- called "ftp." It will open a connection using some random number, say 1234,
- for the port number on its end. However it will specify port number 21 for the
- other end. This is the official port number for the FTP server. Note that
- there are two different programs involved. You run ftp on your side. This is
- a program designed to accept commands from your terminal and pass them on to
- the other end. The program that you talk to on the other machine is the FTP
- server. It is designed to accept commands from the network connection, rather
- than an interactive terminal. There is no need for your program to use a
- well-known socket number for itself. Nobody is trying to find it. However the
- servers have to have well-known numbers, so that people can open connections to
- them and start sending them commands. The official port numbers for each
- program are given in "Assigned Numbers."
-
- Note that a connection is actually described by a set of 4 numbers: The
- Internet address at each end, and the TCP port number at each end. Every
- datagram has all four of those numbers in it. (The Internet addresses are in
- the IP header, and the TCP port numbers are in the TCP header.) In order to
- keep things straight, no two connections can have the same set of numbers.
- However it is enough for any one number to be different. For example, it is
- perfectly possible for two different users on a machine to be sending files to
- the same other machine. This could result in connections with the following
- parameters:
-
-                 Internet addresses          TCP ports
- connection 1    128.6.4.194, 128.6.4.7      1234, 21
- connection 2    128.6.4.194, 128.6.4.7      1235, 21
-
- Since the same machines are involved, the Internet addresses are the same.
- Since they are both doing file transfers, one end of the connection involves
- the well-known port number for FTP. The only thing that differs is the port
- number for the program that the users are running. That's enough of a
- difference. Generally, at least one end of the connection asks the network
- software to assign it a port number that is guaranteed to be unique. Normally,
- it's the user's end, since the server has to use a well-known number.
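-
- As a minimal sketch (not part of the original text), the table above can be
- modeled in Python as a lookup keyed by all four numbers. The names Connection,
- active, and demultiplex are made up for illustration; real TCP software keeps
- far more state for each connection.
-
```python
from typing import NamedTuple, Optional


class Connection(NamedTuple):
    """The four numbers that identify one TCP connection."""
    local_addr: str
    remote_addr: str
    local_port: int
    remote_port: int


# The two connections from the table above: only the user's port differs.
conn1 = Connection("128.6.4.194", "128.6.4.7", 1234, 21)
conn2 = Connection("128.6.4.194", "128.6.4.7", 1235, 21)

# Hypothetical table of active conversations, keyed by the full 4-tuple.
active = {conn1: "user A's transfer", conn2: "user B's transfer"}


def demultiplex(local_addr: str, remote_addr: str,
                local_port: int, remote_port: int) -> Optional[str]:
    """Find the conversation that an arriving datagram belongs to."""
    key = Connection(local_addr, remote_addr, local_port, remote_port)
    return active.get(key)
```
-
- Because the key is the whole set of four numbers, one differing port is enough
- to keep the two transfers apart.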
-
- Now that we know how to open connections, let's get back to the applications
- programs. As mentioned earlier, once TCP has opened a connection, we have
- something that might as well be a simple wire. All the hard parts are handled
- by TCP and IP. However we still need some agreement as to what we send over
- this connection. In effect this is simply an agreement on what set of commands
- the application will understand, and the format in which they are to be sent.
- Generally, what is sent is a combination of commands and data. They use
- context to differentiate. For example, the mail protocol works like this:
- Your mail program opens a connection to the mail server at the other end. Your
- program gives it your machine's name, the sender of the message, and the
- recipients you want it sent to. It then sends a command saying that it is
- starting the message. At that point, the other end stops treating what it sees
- as commands, and starts accepting the message. Your end then starts sending
- the text of the message. At the end of the message, a special mark is sent (a
- dot in the first column). After that, both ends understand that your program
- is again sending commands. This is the simplest way to do things, and the one
- that most applications use.
-
- File transfer is somewhat more complex. The file transfer protocol involves
- two different connections. It starts out just like mail. The user's program
- sends commands like "log me in as this user," "here is my password," "send me
- the file with this name." However once the command to send data is sent, a
- second connection is opened for the data itself. It would certainly be
- possible to send the data on the same connection, as mail does. However file
- transfers often take a long time. The designers of the file transfer protocol
- wanted to allow the user to continue issuing commands while the transfer is
- going on. For example, the user might make an inquiry, or he might abort the
- transfer. Thus the designers felt it was best to use a separate connection for
- the data and leave the original command connection for commands. (It is also
- possible to open command connections to two different computers, and tell them
- to send a file from one to the other. In that case, the data couldn't go over
- the command connection.)
-
- Remote terminal connections use another mechanism still. For remote logins,
- there is just one connection. It normally sends data. When it is necessary to
- send a command (e.g. to set the terminal type or to change some mode), a
- special character is used to indicate that the next character is a command. If
- the user happens to type that special character as data, two of them are sent.
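-
- In Telnet (RFC 854) the special character is the byte 255, known as IAC
- ("interpret as command"). The doubling rule can be sketched as follows. The
- function names are hypothetical, and unescape_data assumes the stream holds
- only data, with no real commands mixed in.
-
```python
IAC = 0xFF  # Telnet "interpret as command" byte (RFC 854)


def escape_data(data: bytes) -> bytes:
    """Double every IAC byte so the receiver treats it as data."""
    return data.replace(bytes([IAC]), bytes([IAC, IAC]))


def unescape_data(data: bytes) -> bytes:
    """Undo escape_data. Assumes the input contains no Telnet commands."""
    return data.replace(bytes([IAC, IAC]), bytes([IAC]))
```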
-
- I am not going to describe the application protocols in detail in this file.
- It is better to read the RFCs yourself. However there are a couple of common
- conventions used by applications that will be described here. First, the
- common network representation: TCP/IP is intended to be usable on any
- computer. Unfortunately, not all computers agree on how data is represented.
-
- There are differences in character codes (ASCII vs. EBCDIC), in end of line
- conventions (carriage return, line feed, or a representation using counts), and
- in whether terminals expect characters to be sent individually or a line at a
- time. In order to allow computers of different kinds to communicate, each
- applications protocol defines a standard representation. Note that TCP and IP
- do not care about the representation. TCP simply sends octets. However the
- programs at both ends have to agree on how the octets are to be interpreted.
-
- The RFC for each application specifies the standard representation for that
- application. Normally it is "net ASCII." This uses ASCII characters, with end
- of line denoted by a carriage return followed by a line feed. For remote
- login, there is also a definition of a "standard terminal," which turns out to
- be a half-duplex terminal with echoing happening on the local machine. Most
- applications also make provisions for the two computers to agree on other
- representations that they may find more convenient. For example, PDP-10's have
- 36-bit words. There is a way that two PDP-10's can agree to send a 36-bit
- binary file. Similarly, two systems that prefer full-duplex terminal
- conversations can agree on that. However each application has a standard
- representation, which every machine must support.
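-
- A rough Python sketch of the "net ASCII" line convention is shown here. The
- helper names are invented for this example, and it covers only the end of line
- rule, not the other details a real implementation has to handle.
-
```python
def to_net_ascii(text: str) -> bytes:
    """Encode text as ASCII with every line ended by carriage return, line feed."""
    lines = text.splitlines()  # accepts local "\n", "\r\n", or "\r" endings
    return b"".join(line.encode("ascii") + b"\r\n" for line in lines)


def from_net_ascii(data: bytes, newline: str = "\n") -> str:
    """Decode net ASCII back into text using the local end of line convention."""
    return data.decode("ascii").replace("\r\n", newline)
```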
-
- So that you might get a better idea of what is involved in the application
- protocols, here is an imaginary example of SMTP (the Simple Mail Transfer
- Protocol). Assume that a computer called FTS.PHRACK.EDU wants to send the
- following message.
-
- Date: Fri, 17 Nov 89 15:42:06 EDT
- From: knight@fts.phrack.edu
- To: taran@msp.phrack.edu
- Subject: Anniversary
-
- Four years is quite a long time to be around. Happy Anniversary!
-
- Note that the format of the message itself is described by an Internet standard
- (RFC 822). The standard specifies the fact that the message must be
- transmitted as net ASCII (i.e. it must be ASCII, with carriage return/linefeed
- to delimit lines). It also describes the general structure, as a group of
- header lines, then a blank line, and then the body of the message. Finally, it
- describes the syntax of the header lines in detail. Generally they consist of
- a keyword and then a value.
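-
- The structure just described (header lines, a blank line, then the body) can
- be sketched in a few lines of Python. This is a simplified illustration:
- parse_message is a made-up name, and the sketch ignores the continuation
- ("folded") header lines that RFC 822 also allows.
-
```python
def parse_message(raw: str):
    """Split a message into a dict of header fields and the body text."""
    head, _, body = raw.partition("\r\n\r\n")  # first blank line ends the headers
    headers = {}
    for line in head.split("\r\n"):
        keyword, _, value = line.partition(":")  # split at the first colon only
        headers[keyword.strip()] = value.strip()
    return headers, body


raw_message = (
    "Date: Fri, 17 Nov 89 15:42:06 EDT\r\n"
    "From: knight@fts.phrack.edu\r\n"
    "To: taran@msp.phrack.edu\r\n"
    "Subject: Anniversary\r\n"
    "\r\n"
    "Four years is quite a long time to be around.  Happy Anniversary!\r\n"
)
headers, body = parse_message(raw_message)
```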
-
- Note that the addressee is indicated as TARAN@MSP.PHRACK.EDU. Initially,
- addresses were simply "person at machine." Today's standards are much more
- flexible. There are now provisions for systems to handle other systems' mail.
- This can allow automatic forwarding on behalf of computers not connected to the
- Internet. It can be used to direct mail for a number of systems to one central
- mail server. Indeed there is no requirement that an actual computer by the
- name of MSP.PHRACK.EDU even exist (and it doesn't). The name servers could be
- set up so that you mail to department names, and each department's mail is
- routed automatically to an appropriate computer. It is also possible that the
- part before the @ is something other than a user name. It is possible for
- programs to be set up to process mail. There are also provisions to handle
- mailing lists, and generic names such as "postmaster" or "operator."
-
- The way the message is to be sent to another system is described by RFCs 821
- and 974. The program that is going to be doing the sending asks the name
- server several queries to determine where to route the message. The first
- query is to find out which machines handle mail for the name MSP.PHRACK.EDU.
- In this case, the server replies that MSP.PHRACK.EDU handles its own mail. The
- program then asks for the address of MSP.PHRACK.EDU, which for the sake of this
- example is 269.517.724.5. Then the mail program opens a TCP connection
- to port 25 on 269.517.724.5. Port 25 is the well-known socket used for
- receiving mail. Once this connection is established, the mail program starts
- sending commands. Here is a typical conversation. Each line is labelled as to
- whether it is from FTS or MSP. Note that FTS initiated the connection:
-
- MSP 220 MSP.PHRACK.EDU SMTP Service at 17 Nov 89 09:35:24 EDT
- FTS HELO fts.phrack.edu
- MSP 250 MSP.PHRACK.EDU - Hello, FTS.PHRACK.EDU
- FTS MAIL From:<knight@fts.phrack.edu>
- MSP 250 MAIL accepted
- FTS RCPT To:<taran@msp.phrack.edu>
- MSP 250 Recipient accepted
- FTS DATA
- MSP 354 Start mail input; end with <CRLF>.<CRLF>
- FTS Date: Fri, 17 Nov 89 15:42:06 EDT
- FTS From: knight@fts.phrack.edu
- FTS To: taran@msp.phrack.edu
- FTS Subject: Anniversary
- FTS
- FTS Four years is quite a long time to be around. Happy Anniversary!
- FTS .
- MSP 250 OK
- FTS QUIT
- MSP 221 MSP.PHRACK.EDU Service closing transmission channel
-
- The commands all use normal text. This is typical of the Internet standards.
- Many of the protocols use standard ASCII commands. This makes it easy to watch
- what is going on and to diagnose problems. The mail program keeps a log of
- each conversation so if something goes wrong, the log file can simply be mailed
- to the postmaster. Since it is normal text, he can see what was going on. It
- also allows a human to interact directly with the mail server, for testing.
-
- The responses all begin with numbers. This is also typical of Internet
- protocols. The allowable responses are defined in the protocol. The numbers
- allow the user program to respond unambiguously. The rest of the response is
- text, which is normally for use by any human who may be watching or looking at
- a log. It has no effect on the operation of the programs. The commands
- themselves simply allow the mail program on one end to tell the mail server the
- information it needs to know in order to deliver the message. In this case,
- the mail server could get the information by looking at the message itself.
-
- Every session must begin with a HELO, which gives the name of the system that
- initiated the connection. Then the sender and recipients are specified. There
- can be more than one RCPT command, if there are several recipients. Finally
- the data itself is sent. Note that the text of the message is terminated by a
- line containing just a period, but if such a line appears in the message, the
- period is doubled. After the message is accepted, the sender can send another
- message, or terminate the session as in the example above.
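-
- The period-doubling rule can be sketched as below. RFC 821 states it slightly
- more generally: the sender adds an extra period to any line that begins with
- one, which also covers a line holding only a period. The function name
- dot_stuff is an assumption made for this example.
-
```python
def dot_stuff(body_lines):
    """Return the text sent after DATA: stuffed lines plus the final "." line."""
    out = []
    for line in body_lines:
        if line.startswith("."):
            line = "." + line  # extra period so the line cannot end the message
        out.append(line)
    out.append(".")  # the real end-of-message mark
    return "\r\n".join(out) + "\r\n"
```
-
- The receiving server removes the extra period again, so the recipient sees
- the message exactly as it was written.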
-
- Generally, there is a pattern to the response numbers. The protocol defines
- the specific set of responses that can be sent as answers to any given command.
- However programs that don't want to analyze them in detail can just look at the
- first digit. In general, responses that begin with a 2 indicate success.
- Those that begin with 3 indicate that some further action is needed, as shown
- above. 4 and 5 indicate errors. 4 is a "temporary" error, such as a disk
- filling. The message should be saved, and tried again later. 5 is a permanent
- error, such as a non-existent recipient. The message should be returned to the
- sender with an error message.
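-
- A program that only looks at the first digit might do something like the
- following. This is a sketch: classify_reply and the category strings are
- invented here, and the 451 and 550 replies in the usage lines are illustrative
- examples rather than lines from the conversation above.
-
```python
REPLY_CLASSES = {
    "2": "success",
    "3": "further action needed",
    "4": "temporary error",   # save the message and try again later
    "5": "permanent error",   # return the message to the sender
}


def classify_reply(line: str) -> str:
    """Classify a server reply by the first digit of its number."""
    return REPLY_CLASSES.get(line[:1], "unknown")


results = [classify_reply(r) for r in (
    "250 OK",
    "354 Start mail input; end with <CRLF>.<CRLF>",
    "451 Local error in processing",
    "550 No such user here",
)]
```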
-
- For more details about the protocols mentioned in this section, see RFCs
- 821/822 for mail, RFC 959 for file transfer, and RFCs 854/855 for remote
- logins. For the well-known port numbers, see the current edition of Assigned
- Numbers, and possibly RFC 814.
-
-
- Protocols Other Than TCP: UDP and ICMP
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Thus far only connections that use TCP have been described. Remember that TCP
- is responsible for breaking up messages into datagrams, and reassembling them
- properly. However in many applications, there are messages that will always
- fit in a single datagram. An example is name lookup. When a user attempts to
- make a connection to another system, he will generally specify the system by
- name, rather than Internet address. His system has to translate that name to
- an address before it can do anything. Generally, only a few systems have the
- database used to translate names to addresses. So the user's system will want
- to send a query to one of the systems that has the database.
-
- This query is going to be very short. It will certainly fit in one datagram.
- So will the answer. Thus it seems silly to use TCP. Of course TCP does more
- than just break things up into datagrams. It also makes sure that the data
- arrives, resending datagrams where necessary. But for a question that fits in
- a single datagram, all of the complexity of TCP is not needed. If there is not
- an answer after a few seconds, you can just ask again. For applications like
- this, there are alternatives to TCP.
-
- The most common alternative is UDP ("user datagram protocol"). UDP is designed
- for applications where you don't need to put sequences of datagrams together.
- It fits into the system much like TCP. There is a UDP header. The network
- software puts the UDP header on the front of your data, just as it would put a
- TCP header on the front of your data. Then UDP sends the data to IP, which
- adds the IP header, putting UDP's protocol number in the protocol field instead
- of TCP's protocol number.
-
- UDP doesn't do as much as TCP does. It does not split data into multiple
- datagrams and it does not keep track of what it has sent so it can resend if
- necessary. About all that UDP provides is port numbers so that several
- programs can use UDP at once. UDP port numbers are used just like TCP port
- numbers. There are well-known port numbers for servers that use UDP.
-
- The UDP header is shorter than a TCP header. It still has source and
- destination port numbers, and a checksum, but that's about it. UDP is used by
- the protocols that handle name lookups (see IEN 116, RFC 882, and RFC 883) and
- a number of similar protocols.
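-
- To show how little there is to it, here is a sketch that builds and reads a
- UDP header as laid out in RFC 768: four 16-bit fields holding the source port,
- destination port, length, and checksum. The function names and the port
- numbers in the usage lines are chosen for illustration, and the checksum is
- left at zero, which RFC 768 defines to mean that none was computed.
-
```python
import struct

UDP_HEADER = struct.Struct("!HHHH")  # network byte order, four 16-bit fields


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes,
                       checksum: int = 0) -> bytes:
    """Put a UDP header on the front of the payload."""
    length = UDP_HEADER.size + len(payload)  # header (8 octets) plus data
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload


def parse_udp_datagram(datagram: bytes):
    """Split a UDP datagram into its header fields and payload."""
    src, dst, length, checksum = UDP_HEADER.unpack(datagram[:UDP_HEADER.size])
    payload = datagram[UDP_HEADER.size:length]
    return {"src": src, "dst": dst, "length": length,
            "checksum": checksum, "payload": payload}


datagram = build_udp_datagram(1234, 53, b"query")
fields = parse_udp_datagram(datagram)
```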
-
- Another alternative protocol is ICMP ("Internet control message protocol").
- ICMP is used for error messages, and other messages intended for the TCP/IP
- software itself, rather than any particular user program. For example, if you
- attempt to connect to a host, your system may get back an ICMP message saying
- "host unreachable." ICMP can also be used to find out some information about
- the network. See RFC 792 for details of ICMP.
-
- ICMP is similar to UDP, in that it handles messages that fit in one datagram.
- However it is even simpler than UDP. It does not even have port numbers in its
- header. Since all ICMP messages are interpreted by the network software
- itself, no port numbers are needed to say where an ICMP message is supposed to
- go.
-
-
- Keeping Track Of Names And Information: The Domain System
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- As we indicated earlier, the network software generally needs a 32-bit Internet
- address in order to open a connection or send a datagram. However users prefer
- to deal with computer names rather than numbers. Thus there is a database that
- allows the software to look up a name and find the corresponding number.
-
- When the Internet was small, this was easy. Each system would have a file that
- listed all of the other systems, giving both their name and number. There are
- now too many computers for this approach to be practical. Thus these files
- have been replaced by a set of name servers that keep track of host names and
- the corresponding Internet addresses. (In fact these servers are somewhat more
- general than that. This is just one kind of information stored in the domain
- system.) A set of interlocking servers is used rather than a single central
- one.
-
- There are now so many different institutions connected to the Internet that it
- would be impractical for them to notify a central authority whenever they
- installed or moved a computer. Thus naming authority is delegated to
- individual institutions. The name servers form a tree, corresponding to
- institutional structure. The names themselves follow a similar structure. A
- typical example is the name BORAX.LCS.MIT.EDU. This is a computer at the
- Laboratory for Computer Science (LCS) at MIT. In order to find its Internet
- address, you might potentially have to consult 4 different servers.
-
- First, you would ask a central server (called the root) where the EDU server
- is. EDU is a server that keeps track of educational institutions. The root
- server would give you the names and Internet addresses of several servers for
- EDU. You would then ask EDU where the server for MIT is. It would give you
- names and Internet addresses of several servers for MIT. Then you would ask
- MIT where the server for LCS is, and finally you would ask one of the LCS
- servers about BORAX. The final result would be the Internet address for
- BORAX.LCS.MIT.EDU. Each of these levels is referred to as a "domain." The
- entire name, BORAX.LCS.MIT.EDU, is called a "domain name." (So are the names
- of the higher-level domains, such as LCS.MIT.EDU, MIT.EDU, and EDU.)
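-
- The order in which the domains are consulted follows directly from the name.
- As a small illustration (the function name is made up), the list of domains
- for a name can be worked out like this:
-
```python
def domains_to_consult(name: str):
    """List the domains in a name, from the top level down to the full name."""
    labels = name.split(".")
    return [".".join(labels[i:]) for i in range(len(labels) - 1, -1, -1)]


path = domains_to_consult("BORAX.LCS.MIT.EDU")
```
-
- For BORAX.LCS.MIT.EDU this gives EDU, MIT.EDU, LCS.MIT.EDU, and finally the
- full name, matching the four steps described above.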
-
- Fortunately, you don't really have to go through all of this most of the time.
- First of all, the root name servers also happen to be the name servers for the
- top-level domains such as EDU. Thus a single query to a root server will get
- you to MIT. Second, software generally remembers answers that it got before.
- So once we look up a name at LCS.MIT.EDU, our software remembers where to find
- servers for LCS.MIT.EDU, MIT.EDU, and EDU. It also remembers the translation
- of BORAX.LCS.MIT.EDU. Each of these pieces of information has a "time to live"
- associated with it. Typically this is a few days. After that, the information
- expires and has to be looked up again. This allows institutions to change
- things.
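-
- The remembering of answers can be sketched as a small cache in which every
- entry carries its own expiry time. This is only an illustration of the idea:
- the class name, the two-day lifetime, and the address 192.0.2.1 are
- placeholders, not values taken from any real name server.
-
```python
import time


class NameCache:
    """Remember answers until their "time to live" runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable so the example can fake time
        self._entries = {}       # name -> (value, expiry time)

    def store(self, name, value, ttl_seconds):
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def lookup(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: must be looked up again
            return None
        return value


DAY = 24 * 3600
now = [0.0]                                    # fake clock for the example
cache = NameCache(clock=lambda: now[0])
cache.store("BORAX.LCS.MIT.EDU", "192.0.2.1", ttl_seconds=2 * DAY)
fresh = cache.lookup("BORAX.LCS.MIT.EDU")      # still remembered
now[0] += 3 * DAY                              # three days pass
stale = cache.lookup("BORAX.LCS.MIT.EDU")      # entry has expired
```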
-
- The domain system is not limited to finding out Internet addresses. Each
- domain name is a node in a database. The node can have records that define a
- number of different properties. Examples are Internet address, computer type,
- and a list of services provided by a computer. A program can ask for a
- specific piece of information, or all information about a given name. It is
- possible for a node in the database to be marked as an "alias" (or nickname)
- for another node. It is also possible to use the domain system to store
- information about users, mailing lists, or other objects.
-
- There is an Internet standard defining the operation of these databases as well
- as the protocols used to make queries of them. Every network utility has to be
- able to make such queries since this is now the official way to evaluate host
- names. Generally utilities will talk to a server on their own system. This
- server will take care of contacting the other servers for them. This keeps
- down the amount of code that has to be in each application program.
-
- The domain system is particularly important for handling computer mail. There
- are entry types to define what computer handles mail for a given name, to
- specify where an individual is to receive mail, and to define mailing lists.
-
- See RFCs 882, 883, and 973 for specifications of the domain system. RFC 974
- defines the use of the domain system in sending mail.
-
-
- Routing
- ~~~~~~~
- The task of finding how to get a datagram to its destination is referred to as
- "routing." Many of the details depend upon the particular implementation.
- However some general things can be said.
-
- It is necessary to understand the model on which IP is based. IP assumes that
- a system is attached to some local network. It is assumed that the system can
- send datagrams to any other system on its own network. (In the case of
- Ethernet, it simply finds the Ethernet address of the destination system, and
- puts the datagram out on the Ethernet.) The problem comes when a system is
- asked to send a datagram to a system on a different network. This problem is
- handled by gateways.
-
- A gateway is a system that connects a network with one or more other networks.
- Gateways are often normal computers that happen to have more than one network
- interface. The software on a machine must be set up so that it will forward
- datagrams from one network to the other. That is, if a machine on network
- 128.6.4 sends a datagram to the gateway, and the datagram is addressed to a
- machine on network 128.6.3, the gateway will forward the datagram to the
- destination. Major communications centers often have gateways that connect a
- number of different networks.
-
- Routing in IP is based entirely upon the network number of the destination
- address. Each computer has a table of network numbers. For each network
- number, a gateway is listed. This is the gateway to be used to get to that
- network. The gateway does not have to connect directly to the network; it just
- has to be the best place to go to get there.
-
- When a computer wants to send a datagram, it first checks to see if the
- destination address is on the system's own local network. If so, the datagram
- can be sent directly. Otherwise, the system expects to find an entry for the
- network that the destination address is on. The datagram is sent to the
- gateway listed in that entry. This table can get quite big. For example, the
- Internet now includes several hundred individual networks. Thus various
- strategies have been developed to reduce the size of the routing table. One
- strategy is to depend upon "default routes." There is often only one gateway
- out of a network.
-
- This gateway might connect a local Ethernet to a campus-wide backbone network.
- In that case, it is not necessary to have a separate entry for every network
- in the world. That gateway is simply defined as a "default." When no specific
- route is found for a datagram, the datagram is sent to the default gateway. A
- default gateway can even be used when there are several gateways on a network.
- There are provisions for gateways to send a message saying "I'm not the best
- gateway -- use this one instead." (The message is sent via ICMP. See RFC
- 792.) Most network software is designed to use these messages to add entries
- to their routing tables. Suppose network 128.6.4 has two gateways, 128.6.4.59
- and 128.6.4.1. 128.6.4.59 leads to several other internal Rutgers networks.
- 128.6.4.1 leads indirectly to the NSFnet. Suppose 128.6.4.59 is set as a
- default gateway, and there are no other routing table entries. Now what
- happens when you need to send a datagram to MIT? MIT is network 18. Since
- there is no entry for network 18, the datagram will be sent to the default,
- 128.6.4.59. This gateway is the wrong one. So it will forward the datagram to
- 128.6.4.1. It will also send back an error saying in effect: "to get to
- network 18, use 128.6.4.1." The software will then add an entry to the routing
- table. Any future datagrams to MIT will then go directly to 128.6.4.1. (The
- error message is sent using the ICMP protocol. The message type is called
- "ICMP redirect.")
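-
- The Rutgers example can be traced with a toy routing table. This is a sketch
- under simplifying assumptions: the class and method names are invented, the
- table is keyed by network number only, and the arrival of the ICMP redirect is
- modeled as a plain method call.
-
```python
class RoutingTable:
    """Per-network routes with a default gateway as the fallback."""

    def __init__(self, default_gateway: str):
        self.default_gateway = default_gateway
        self.routes = {}  # network number -> gateway address

    def next_hop(self, network: str) -> str:
        """Gateway to use for a network; the default if no entry exists."""
        return self.routes.get(network, self.default_gateway)

    def handle_redirect(self, network: str, better_gateway: str) -> None:
        """Record the better gateway named in a redirect message."""
        self.routes[network] = better_gateway


table = RoutingTable(default_gateway="128.6.4.59")
first_hop = table.next_hop("18")           # no entry for MIT yet: use the default
table.handle_redirect("18", "128.6.4.1")   # "to get to network 18, use 128.6.4.1"
later_hop = table.next_hop("18")           # future datagrams go direct
```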
-
- Most IP experts recommend that individual computers should not try to keep
- track of the entire network. Instead, they should start with default gateways
- and let the gateways tell them the routes as just described. However this
- doesn't say how the gateways should find out about the routes. The gateways
- can't depend upon this strategy. They have to have fairly complete routing
- tables. For this, some sort of routing protocol is needed. A routing protocol
- is simply a technique for the gateways to find each other and keep up to date
- about the best way to get to every network. RFC 1009 contains a review of
- gateway design and routing.
-
-
- Details About Internet Addresses: Subnets And Broadcasting
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- Internet addresses are 32-bit numbers, normally written as 4 octets (in
- decimal), e.g. 128.6.4.7. There are actually 3 different types of address.
- The problem is that the address has to indicate both the network and the host
- within the network. It was felt that eventually there would be lots of
- networks. Many of them would be small, but probably 24 bits would be needed to
- represent all the IP networks. It was also felt that some very big networks
- might need 24 bits to represent all of their hosts. This would seem to lead to
- 48 bit addresses. But the designers really wanted to use 32 bit addresses. So
- they adopted a kludge. The assumption is that most of the networks will be
- small. So they set up three different ranges of address.
-
- Addresses beginning with 1 to 126 use only the first octet for the network
- number. The other three octets are available for the host number. Thus 24
- bits are available for hosts. These numbers are used for large networks, but
- there can only be 126 of these. The ARPAnet is one and there are a few large
- commercial networks. But few normal organizations get one of these "class A"
- addresses.
-
- For normal large organizations, "class B" addresses are used. Class B
- addresses use the first two octets for the network number. Thus network
- numbers are 128.1 through 191.254. (0 and 255 are avoided for reasons to be
- explained below. Addresses beginning with 127 are also avoided because they
- are used by some systems for special purposes.) The last two octets are
- available for host addresses, giving 16 bits of host address. This allows for
- 64516 computers, which should be enough for most organizations. Finally, class
- C addresses use three octets in the range 192.1.1 to 223.254.254. These allow
- only 254 hosts on each network, but there can be lots of these networks.
- Addresses above 223 are reserved for future use as class D and E (which are
- currently not defined).
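-
- The three ranges can be summarized in a short function. This is an
- illustrative sketch using the ranges given above; the name classify is made
- up, and subnets are not taken into account.
-
```python
def classify(address: str):
    """Return (class, network number, host number) for a dotted address."""
    octets = address.split(".")
    first = int(octets[0])
    if 1 <= first <= 126:
        cls, network_octets = "A", 1
    elif 128 <= first <= 191:
        cls, network_octets = "B", 2
    elif 192 <= first <= 223:
        cls, network_octets = "C", 3
    else:
        raise ValueError("not a class A, B, or C address: " + address)
    network = ".".join(octets[:network_octets])
    host = ".".join(octets[network_octets:])
    return cls, network, host


# Host counts quoted in the text: 0 and 255 are avoided in each host octet.
HOSTS_PER_CLASS_B = 254 * 254
HOSTS_PER_CLASS_C = 254
```
-
- For example, classify("128.6.4.7") reports a class B address on network 128.6
- with host number 4.7.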
-
- 0 and 255 have special meanings. 0 is reserved for machines that do not know
- their address. In certain circumstances it is possible for a machine not to
- know the number of the network it is on, or even its own host address. For
- example, 0.0.0.23 would be a machine that knew it was host number 23, but
- didn't know on what network.
-
- 255 is used for "broadcast." A broadcast is a message that you want every
- system on the network to see. Broadcasts are used in some situations where you
- don't know who to talk to. For example, suppose you need to look up a host
- name and get its Internet address. Sometimes you don't know the address of the
- nearest name server. In that case, you might send the request as a broadcast.
- There are also cases where a number of systems are interested in information.
- It is then less expensive to send a single broadcast than to send datagrams
- individually to each host that is interested in the information. In order to
- send a broadcast, you use an address that is made by using your network
- address, with all ones in the part of the address where the host number goes.
- For example, if you are on network 128.6.4, you would use 128.6.4.255 for
- broadcasts. How this is actually implemented depends upon the medium. It is
- not possible to send broadcasts on the ARPAnet, or on point to point lines, but
- it is possible on an Ethernet. If you use an Ethernet address with all its
- bits on (all ones), every machine on the Ethernet is supposed to look at that
- datagram.
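-
- The rule for building a broadcast address (network number followed by all
- ones in the host part) can be sketched as below. The helper name
- broadcast_address is hypothetical, and the sketch assumes the network
- number is written as one to three whole octets, as in the 128.6.4 example.
-

```python
def broadcast_address(network: str) -> str:
    """Fill the host part of a partial dotted address with all ones."""
    octets = network.split(".")
    if not 1 <= len(octets) <= 3:
        raise ValueError("expected one to three network octets")
    # Every octet that is not part of the network number becomes 255.
    return ".".join(octets + ["255"] * (4 - len(octets)))
```

-
- With this sketch, broadcast_address("128.6.4") gives "128.6.4.255", which
- matches the example in the text.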
-
- Because 0 and 255 are used for unknown and broadcast addresses, normal hosts
- should never be given addresses containing 0 or 255. Addresses should never
- begin with 0, 127, or any number above 223.
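-
- As a rough check of the rules just stated, the following sketch rejects
- any address that a normal host should not be given. The function name
- is_assignable_host is made up for this example, and it encodes only the
- rules in this file, nothing more.
-

```python
def is_assignable_host(address: str) -> bool:
    """Apply the rules in the text: no 0 or 255 anywhere, no 127, no > 223."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        return False
    if octets[0] == 127 or octets[0] > 223:
        return False
    # An address "beginning with 0" is already caught by the 0/255 rule.
    return all(octet not in (0, 255) for octet in octets)
```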
-
-
- Datagram Fragmentation And Reassembly
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- TCP/IP is designed for use with many different kinds of networks.
- Unfortunately, network designers do not agree about how big packets can be.
- Ethernet packets can be 1500 octets long. ARPAnet packets have a maximum of
- around 1000 octets. Some very fast networks have much larger packet sizes.
- You might think that IP should simply settle on the smallest possible size, but
- this would cause serious performance problems. When transferring large files,
- big packets are far more efficient than small ones. So it is best to be able
- to use the largest packet size possible, but it is also necessary to be able to
- handle networks with small limits. There are two provisions for this.
-
- TCP has the ability to "negotiate" about datagram size. When a TCP connection
- first opens, both ends can send the maximum datagram size they can handle. The
- smaller of these numbers is used for the rest of the connection. This allows
- two implementations that can handle big datagrams to use them, but also lets
- them talk to implementations that cannot handle them. This does not completely
- solve the problem. The most serious problem is that the two ends do not
- necessarily know about all of the steps in between. For this reason, there are
- provisions to split datagrams up into pieces. This is referred to as
- "fragmentation."
-
- The IP header contains fields indicating that a datagram has been split and
- enough information to let the pieces be put back together. If a gateway
- connects an Ethernet to the ARPAnet, it must be prepared to take 1500-octet
- Ethernet packets and split them into pieces that will fit on the ARPAnet.
- Furthermore, every host implementation of TCP/IP must be prepared to accept
- pieces and put them back together. This is referred to as "reassembly."
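-
- A simplified sketch of fragmentation and reassembly follows. It models
- only the pieces of the IP header that matter here, as defined in RFC 791:
- an identification value shared by all pieces, an offset counted in units
- of 8 octets, and a "more fragments" flag. The names Fragment, fragment,
- reassemble, and max_data are made up for this example, and real
- implementations must also cope with duplicates, timeouts, and headers.
-

```python
from typing import List, NamedTuple


class Fragment(NamedTuple):
    ident: int            # same value in every piece of one datagram
    offset: int           # position of this piece, in 8-octet units
    more_fragments: bool  # False only on the last piece
    data: bytes


def fragment(ident: int, payload: bytes, max_data: int) -> List[Fragment]:
    """Split payload into pieces carrying at most max_data octets each."""
    step = (max_data // 8) * 8  # all but the last piece must be a multiple of 8
    if step == 0:
        raise ValueError("max_data must be at least 8 octets")
    if not payload:
        return [Fragment(ident, 0, False, b"")]
    pieces = []
    for start in range(0, len(payload), step):
        last = start + step >= len(payload)
        pieces.append(Fragment(ident, start // 8, not last,
                               payload[start:start + step]))
    return pieces


def reassemble(pieces: List[Fragment]) -> bytes:
    """Put the pieces back in offset order; they may arrive in any order."""
    if not pieces:
        raise ValueError("no fragments to reassemble")
    ordered = sorted(pieces, key=lambda piece: piece.offset)
    if ordered[-1].more_fragments:
        raise ValueError("last fragment has not arrived")
    buffer = bytearray()
    for piece in ordered:
        if piece.offset * 8 != len(buffer):
            raise ValueError("a fragment is missing or overlaps")
        buffer += piece.data
    return bytes(buffer)
```

-
- A gateway in the position described above would call something like
- fragment() on the way out, and the receiving host would call something
- like reassemble() once every piece has arrived.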
-
- TCP/IP implementations differ in the approach they take to deciding on datagram
- size. It is fairly common for implementations to use 576-byte datagrams
- whenever they can't verify that the entire path is able to handle larger
- packets. This rather conservative strategy is used because of the number of
- implementations with bugs in the code to reassemble fragments. Implementors
- often try to avoid ever having fragmentation occur. Different implementors
- take different approaches to deciding when it is safe to use large datagrams.
- Some use them only for the local network. Others will use them for any network
- on the same campus. 576 bytes is a "safe" size which every implementation must
- support.
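-
- The two provisions for choosing a size can be combined in a short sketch:
- take the smaller of the sizes the two ends announce, and fall back to the
- safe 576 when the path cannot be verified. The names used here
- (choose_datagram_size, path_verified) are hypothetical, and how an
- implementation decides that a path is verified is left out on purpose,
- since the text notes that implementors differ on that point.
-

```python
SAFE_DATAGRAM_SIZE = 576  # the size every implementation must support


def choose_datagram_size(local_max: int, remote_max: int,
                         path_verified: bool) -> int:
    """Pick a datagram size from both ends' limits and knowledge of the path."""
    negotiated = min(local_max, remote_max)  # the smaller number wins
    if path_verified:
        return negotiated
    # Unknown path: stay conservative to avoid fragmentation on the way.
    return min(negotiated, SAFE_DATAGRAM_SIZE)
```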
-
- Ethernet Encapsulation: ARP
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- In Part One of Introduction to the Internet Protocols (Phrack Inc., Volume
- Three, Issue 28, File #3 of 12) there was a brief description of what IP
- datagrams look like on an Ethernet. The description showed the Ethernet header
- and checksum, but it left one hole: it did not say how to figure out what
- Ethernet address to use when you want to talk to a given Internet address.
- There is a separate protocol for this called ARP ("Address Resolution
- Protocol"). It is not an IP protocol, since ARP datagrams do not have IP
- headers.
-
- Suppose you are on system 128.6.4.194 and you want to connect to system
- 128.6.4.7. Your system will first verify that 128.6.4.7 is on the same
- network, so it can talk directly via Ethernet. Then it will look up 128.6.4.7
- in its ARP table to see if it already knows the Ethernet address. If so, it
- will stick on an Ethernet header and send the packet. Now suppose this system
- is not in the ARP table. There is no way to send the packet because you need
- the Ethernet address. So it uses the ARP protocol to send an ARP request.
- Essentially an ARP request says "I need the Ethernet address for 128.6.4.7".
- Every system listens to ARP requests. When a system sees an ARP request for
- itself, it is required to respond. So 128.6.4.7 will see the request and will
- respond with an ARP reply saying in effect "128.6.4.7 is 8:0:20:1:56:34". Your
- system will save this information in its ARP table so future packets will go
- directly.
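-
- The lookup sequence above (check the network, check the table, otherwise
- ask and remember the answer) can be sketched as follows. Everything named
- here is made up for the example: ArpTable, on_same_network, and the
- send_request callback that stands in for the real broadcast. The sketch
- also assumes the network number is the first three octets, as in the
- 128.6.4 example.
-

```python
from typing import Callable, Dict, Optional


def on_same_network(a: str, b: str, network_octets: int = 3) -> bool:
    """True when both addresses share the same leading network octets."""
    return a.split(".")[:network_octets] == b.split(".")[:network_octets]


class ArpTable:
    """Remembers Internet-to-Ethernet address mappings learned from replies."""

    def __init__(self, send_request: Callable[[str], Optional[str]]):
        self._cache: Dict[str, str] = {}
        self._send_request = send_request  # stand-in for the ARP broadcast

    def resolve(self, ip: str) -> str:
        if ip in self._cache:              # already known: no request needed
            return self._cache[ip]
        ethernet = self._send_request(ip)  # "I need the address for <ip>"
        if ethernet is None:
            raise LookupError("no ARP reply for " + ip)
        self._cache[ip] = ethernet         # save it so future packets go directly
        return ethernet
```

-
- Only the first call to resolve() for a given address triggers a request;
- later calls are answered from the table.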
-
- ARP requests must be sent as "broadcasts." There is no way that an ARP request
- can be sent directly to the right system because the whole reason for sending
- an ARP request is that you do not know the Ethernet address. So an Ethernet
- address of all ones is used, i.e. ff:ff:ff:ff:ff:ff. By convention, every
- machine on the Ethernet is required to pay attention to packets with this as an
- address. So every system sees every ARP request. They all look to see
- whether the request is for their own address. If so, they respond. If not,
- they could just ignore it, although some hosts will use ARP requests to update
- their knowledge about other hosts on the network, even if the request is not
- for them. Packets whose IP address indicates broadcast (e.g. 255.255.255.255
- or 128.6.4.255) are also sent with an Ethernet address that is all ones.
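-
- To make the broadcast concrete, here is a sketch that builds the bytes of
- an ARP request inside an Ethernet header addressed to ff:ff:ff:ff:ff:ff.
- The field layout follows RFC 826 and the Ethernet type values 0x0806 (ARP)
- and 0x0800 (IP), none of which are spelled out in this file. The helper
- names are made up, and the sketch leaves out the padding and checksum that
- the Ethernet hardware adds.
-

```python
import struct

BROADCAST = b"\xff" * 6   # Ethernet address with all bits on
ETHERTYPE_ARP = 0x0806
ETHERTYPE_IP = 0x0800


def mac_bytes(text: str) -> bytes:
    """Convert an address such as 8:0:20:1:56:34 to six raw octets."""
    return bytes(int(part, 16) for part in text.split(":"))


def ip_bytes(text: str) -> bytes:
    """Convert a dotted-decimal Internet address to four raw octets."""
    return bytes(int(part) for part in text.split("."))


def arp_request_frame(sender_mac: str, sender_ip: str, target_ip: str) -> bytes:
    """Build an Ethernet header plus ARP request asking who has target_ip."""
    header = BROADCAST + mac_bytes(sender_mac) + struct.pack("!H", ETHERTYPE_ARP)
    # hardware type 1 (Ethernet), protocol type IP, lengths 6 and 4, opcode 1
    body = struct.pack("!HHBBH", 1, ETHERTYPE_IP, 6, 4, 1)
    body += mac_bytes(sender_mac) + ip_bytes(sender_ip)
    body += b"\x00" * 6 + ip_bytes(target_ip)  # target Ethernet address unknown
    return header + body
```

-
- The target Ethernet address field is left as zeros in the request, since
- that is exactly the piece of information being asked for.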
-
-
- Getting More Information
- ~~~~~~~~~~~~~~~~~~~~~~~~
- This directory contains documents describing the major protocols. There are
- hundreds of documents, so I have chosen the ones that seem most important.
- Internet standards are called RFCs (Request for Comments). A proposed standard
- is initially issued as a proposal, and given an RFC number. When it is finally
- accepted, it is added to Official Internet Protocols, but it is still referred
- to by the RFC number. I have also included two IENs (Internet Experiment
- Notes). IENs used to be a separate classification for more informal
- documents, but this classification no longer exists. RFCs are now used for
- all official Internet documents, and a mailing list is used for more
- informal reports.
-
- The convention is that whenever an RFC is revised, the revised version gets a
- new number. This is fine for most purposes, but it causes problems with two
- documents: Assigned Numbers and Official Internet Protocols. These documents
- are being revised all the time and the RFC number keeps changing. You will
- have to look in rfc-index.txt to find the number of the latest edition. Anyone
- who is seriously interested in TCP/IP should read the RFC describing IP (791).
- RFC 1009 is also useful as it is a specification for gateways to be used by
- NSFnet and it contains an overview of a lot of the TCP/IP technology.
-
- Here is a list of the documents you might want:
-
- rfc-index   List of all RFCs
- rfc1012     Somewhat fuller list of all RFCs
- rfc1011     Official Protocols. It's useful to scan this to see what tasks
-             protocols have been built for. This defines which RFCs are
-             actual standards, as opposed to requests for comments.
- rfc1010     Assigned Numbers. If you are working with TCP/IP, you will
-             probably want a hardcopy of this as a reference. It lists all
-             the officially defined well-known ports and lots of other
-             things.
- rfc1009     NSFnet gateway specifications. A good overview of IP routing
-             and gateway technology.
- rfc1001/2   NetBIOS: Networking for PCs
- rfc973      Update on domains
- rfc959      FTP (file transfer)
- rfc950      Subnets
- rfc937      POP2: Protocol for reading mail on PCs
- rfc894      How IP is to be put on Ethernet, see also rfc826
- rfc882/3    Domains (the database used to go from host names to Internet
-             address and back -- also used to handle UUCP these days). See
-             also rfc973
- rfc854/5    Telnet - Protocol for remote logins
- rfc826      ARP - Protocol for finding out Ethernet addresses
- rfc821/2    Mail
- rfc814      Names and ports - General concepts behind well-known ports
- rfc793      TCP
- rfc792      ICMP
- rfc791      IP
- rfc768      UDP
- rip.doc     Details of the most commonly-used routing protocol
- ien-116     Old name server (still needed by several kinds of systems)
- ien-48      The Catenet model, general description of the philosophy behind
-             TCP/IP
-
- The following documents are somewhat more specialized.
-
- rfc813      Window and acknowledgement strategies in TCP
- rfc815      Datagram reassembly techniques
- rfc816      Fault isolation and resolution techniques
- rfc817      Modularity and efficiency in implementation
- rfc879      The maximum segment size option in TCP
- rfc896      Congestion control
- rfc827,888,904,975,985   EGP and related issues
-
- The most important RFCs have been collected into a three-volume set, the DDN
- Protocol Handbook. It is available from the DDN Network Information Center at
- SRI International. You should be able to get the individual RFCs and IENs
- via anonymous FTP from SRI-NIC.ARPA. The file names are:
-
- RFCs:
-    rfc:rfc-index.txt
-    rfc:rfcxxx.txt
- IENs:
-    ien:ien-index.txt
-    ien:ien-xxx.txt
-
- Sites with access to UUCP but not FTP may be able to retrieve them via
- UUCP from UUCP host rutgers. The file names would be:
-
- RFCs:
-    /topaz/pub/pub/tcp-ip-docs/rfc-index.txt
-    /topaz/pub/pub/tcp-ip-docs/rfcxxx.txt
- IENs:
-    /topaz/pub/pub/tcp-ip-docs/ien-index.txt
-    /topaz/pub/pub/tcp-ip-docs/ien-xxx.txt
-
- >--------=====END=====--------<
-