-
- == Phrack Magazine ==
- Volume Seven, Issue Forty-Eight
- by daemon9 / route / infinity
- for Phrack Magazine
- June 1996 Guild Productions, kid
-
- comments to route@infonexus.com
-
- The purpose of this paper is to explain IP-spoofing to the masses. It
- assumes little more than a working knowledge of Unix and TCP/IP. Oh, and
- that you're not a moron...
-
- IP-spoofing is a complex technical attack that is made up of several
- components. (In actuality, IP-spoofing is not the attack, but a step in the
- attack. The attack is actually trust-relationship exploitation. However, in
- this paper, IP-spoofing will refer to the whole attack.) In this paper, I
- will explain the attack in detail, including the relevant operating system
- and networking information.
-
- SECTION I. BACKGROUND INFORMATION
-
- --[ The Players ]--
-
- A: Target host
-
- B: Trusted host
-
- X: Unreachable host
-
- Z: Attacking host
-
- 1(2): Host 1 masquerading as host 2
-
- --[ The Figures ]--
-
- There are several figures in the paper and they are to be interpreted as
- per the following example:
-
- tick host a control host b
-
- 1 A ---SYN---> B
-
- tick: A tick of time. There is no distinction made as to how much time
- passes between ticks, just that time passes. It's generally not a great
- deal.
- host a: A machine participating in a TCP-based conversation.
- control: This field shows any relevant control bits set in the TCP header
- and the direction the data is flowing.
- host b: A machine participating in a TCP-based conversation.
-
- In this case, at the first referenced point in time host a is sending a TCP
- segment to host b with the SYN bit on. Unless stated, we are generally not
- concerned with the data portion of the TCP segment.
-
- --[ Trust Relationships ]--
-
- In the Unix world, trust can be given all too easily. Say you have an
- account on machine A, and on machine B. To facilitate going betwixt the two
- with a minimum amount of hassle, you want to set up a full-duplex trust
- relationship between them. In your home directory at A you create a .rhosts
- file: `echo "B username" > ~/.rhosts` In your home directory at B you
- create a .rhosts file: `echo "A username" > ~/.rhosts` (Alternately, root
- can set up similar rules in /etc/hosts.equiv, the difference being that the
- rules are hostwide, rather than just on an individual basis.) Now, you can
- use any of the r* commands without that annoying hassle of password
- authentication. These commands will allow address-based authentication,
- which will grant or deny access based on the IP address of the service
- requestor.
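-
- The decision being made here can be sketched in a few lines of Python.
- This is a toy model, not the real rlogind logic: parse_rhosts() and
- is_trusted() are hypothetical helpers, and the only input consulted is the
- host name the connection claims to come from, plus the user name.
-
```python
def parse_rhosts(text):
    """Return a set of (host, user) pairs from .rhosts-style lines."""
    entries = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            entries.add((fields[0], fields[1]))
    return entries


def is_trusted(entries, claimed_host, user):
    """Address-based check: no password or other secret is involved."""
    return (claimed_host, user) in entries


# The file created on machine A in the example above.
rhosts_on_a = parse_rhosts("B username\n")
print(is_trusted(rhosts_on_a, "B", "username"))
print(is_trusted(rhosts_on_a, "Z", "username"))
```
-
- Whoever can convincingly claim to be B passes the check, which is the
- weakness the rest of the paper is about.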
-
- --[ Rlogin ]--
-
- Rlogin is a simple client-server based protocol that uses TCP as its
- transport. Rlogin allows a user to login remotely from one host to another,
- and, if the target machine trusts the other, rlogin will allow the
- convenience of not prompting for a password. It will instead have
- authenticated the client via the source IP address. So, from our example
- above, we can use rlogin to remotely login to A from B (or vice-versa) and
- not be prompted for a password.
-
- --[ Internet Protocol ]--
-
- IP is the connectionless, unreliable network protocol in the TCP/IP suite.
- It has two 32-bit header fields to hold address information. IP is also the
- busiest of all the TCP/IP protocols as almost all TCP/IP traffic is
- encapsulated in IP datagrams. IP's job is to route packets around the
- network. It provides no mechanism for reliability or accountability; for
- that, it relies on the upper layers. IP simply sends out datagrams and
- hopes they make it intact. If they don't, IP can try to send an ICMP error
- message back to the source; however, this packet can get lost as well. (ICMP
- is the Internet Control Message Protocol, and it is used to relay network
- conditions and different errors to IP and the other layers.) IP has no
- means to guarantee delivery. Since IP is connectionless, it does not
- maintain any connection state information. Each IP datagram is sent out
- without regard to the last one or the next one. This, along with the fact
- that it is trivial to modify the IP stack to allow an arbitrarily chosen
- IP address in the source (and destination) fields, makes IP easy to
- subvert.
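-
- To make the "two 32-bit header fields" concrete, here is a small Python
- sketch that lays out a fixed 20-byte IPv4 header in memory and reads the
- addresses back. The addresses are documentation-range placeholders, the
- checksum is left at zero, and nothing is sent on the wire. The point is
- only that the source address is an ordinary field with nothing attached to
- prove where the datagram came from.
-
```python
import socket
import struct

# Fixed 20-byte IPv4 header (no options), network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def parse_addresses(header):
    """Return (source, destination) as dotted-quad strings."""
    fields = IPV4_HEADER.unpack(header[:20])
    return socket.inet_ntoa(fields[8]), socket.inet_ntoa(fields[9])


# Build a sample header in memory only.
sample = IPV4_HEADER.pack(
    (4 << 4) | 5,  # version 4, header length of 5 32-bit words
    0,             # type of service
    20,            # total length
    0,             # identification
    0,             # flags and fragment offset
    64,            # time to live
    6,             # protocol number 6 = TCP
    0,             # header checksum (left zero in this sketch)
    socket.inet_aton("192.0.2.10"),    # 32-bit source address
    socket.inet_aton("198.51.100.7"),  # 32-bit destination address
)
print(IPV4_HEADER.size, parse_addresses(sample))
```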
-
- --[ Transmission Control Protocol ]--
-
- TCP is the connection-oriented, reliable transport protocol in the TCP/IP
- suite. Connection-oriented simply means that the two hosts participating in
- a discussion must first establish a connection before data may change
- hands. Reliability is provided in a number of ways, but the only two we are
- concerned with are data sequencing and acknowledgement. TCP assigns
- sequence numbers to every segment and acknowledges any and all data
- segments received from the other end. (A bare ACK does not consume a
- sequence number and is not itself ACK'd; the SYN and FIN flags each
- consume one.) This reliability makes TCP harder to fool than IP.
-
- --[ Sequence Numbers, Acknowledgements and other flags ]--
-
- Since TCP is reliable, it must be able to recover from lost, duplicated, or
- out-of-order data. By assigning a sequence number to every byte transferred,
- and requiring an acknowledgement from the other end upon receipt, TCP can
- guarantee reliable delivery. The receiving end uses the sequence numbers to
- ensure proper ordering of the data and to eliminate duplicate data bytes.
- TCP sequence numbers can simply be thought of as 32-bit counters. They
- range from 0 to 4,294,967,295. Every byte of data exchanged across a TCP
- connection (along with certain flags) is sequenced. The sequence number
- field in the TCP header will contain the sequence number of the first byte
- of data in the TCP segment. The acknowledgement number field in the TCP
- header holds the value of next expected sequence number, and also
- acknowledges all data up through this ACK number minus one.
-
- TCP uses the concept of window advertisement for flow control. It uses a
- sliding window to tell the other end how much data it can buffer. Since the
- window size field is 16 bits, a receiving TCP can advertise up to a maximum
- of 65535 bytes. Window advertisement can be thought of as an advertisement
- from one TCP to the other of how high acceptable sequence numbers can be.
-
- Other TCP header flags of note are RST (reset), PSH (push) and FIN
- (finish). If a RST is received, the connection is immediately torn down.
- RSTs are normally sent when one end receives a segment that just doesn't
- jibe with the current connection (we will encounter an example below). The
- PSH flag tells the receiver to pass all the data it has queued to the
- application as soon as possible. The FIN flag is the way an application
- begins a graceful close of a connection (connection termination is a 4-way
- process). When one end receives a FIN, it ACKs it, and does not expect to
- receive any more data (sending is still possible, however).
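-
- The sequence and window arithmetic above is all done modulo 2^32. A
- minimal Python sketch follows, assuming SYN and FIN are the "certain
- flags" that take up one sequence number each. next_expected() and
- highest_acceptable() are illustrative names, not part of any TCP
- implementation.
-
```python
SEQ_MODULUS = 2 ** 32  # sequence numbers are 32-bit counters


def next_expected(seq, data_len, syn=False, fin=False):
    """Sequence number the receiver expects after this segment."""
    consumed = data_len + (1 if syn else 0) + (1 if fin else 0)
    return (seq + consumed) % SEQ_MODULUS


def highest_acceptable(rcv_nxt, window):
    """Last sequence number that still fits in the advertised window."""
    return (rcv_nxt + window - 1) % SEQ_MODULUS


print(next_expected(1000, 100))        # 100 data bytes starting at 1000
print(next_expected(4294967290, 10))   # counter wraps past 4,294,967,295
print(highest_acceptable(1100, 65535))
```
-
- The value returned by next_expected() is what goes in the acknowledgement
- number field: it acknowledges everything up through that number minus one.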
-
- --[ TCP Connection Establishment ]--
-
- In order to exchange data using TCP, hosts must establish a connection.
- TCP establishes a connection in a 3-step process called the 3-way
- handshake. If machine A is running an rlogin client and wishes to connect to
- an rlogin daemon on machine B, the process is as follows:
-
- fig(1)
-
- 1 A ---SYN---> B
-
- 2 A <---SYN/ACK--- B
-
- 3 A ---ACK---> B
-
- At (1) the client is telling the server that it wants a connection. This is
- the SYN flag's only purpose. The client is telling the server that the
- sequence number field is valid, and should be checked. The client will set
- the sequence number field in the TCP header to its ISN (initial sequence
- number). The server, upon receiving this segment (2), will respond with its
- own ISN (therefore the SYN flag is on) and an ACKnowledgement of the
- client's first segment (which is the client's ISN+1). The client then ACKs
- the server's ISN (3). Now, data transfer may take place.
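-
- The numbers carried by the three segments of fig(1) can be worked through
- with a short Python sketch. The ISN values 1000 and 5000 are arbitrary
- examples, and three_way_handshake() is a made-up helper that only does the
- bookkeeping; it opens no sockets.
-
```python
SEQ_MODULUS = 2 ** 32


def three_way_handshake(client_isn, server_isn):
    """Return (sender, flags, seq, ack) for each of the three segments."""
    syn = ("A", "SYN", client_isn, None)
    syn_ack = ("B", "SYN/ACK", server_isn, (client_isn + 1) % SEQ_MODULUS)
    ack = ("A", "ACK", (client_isn + 1) % SEQ_MODULUS,
           (server_isn + 1) % SEQ_MODULUS)
    return [syn, syn_ack, ack]


for sender, flags, seq, ack in three_way_handshake(1000, 5000):
    print(sender, flags, "seq =", seq, "ack =", ack)
```
-
- Note that the final ACK must carry the server's ISN plus one. A client
- that never saw the SYN/ACK has to know that number some other way.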
-
- --[ The ISN and Sequence Number Incrementation ]--
-
- It is important to understand how sequence numbers are initially chosen,
- and how they change with respect to time. The initial sequence number when
- a host is bootstrapped is initialized to 1. (TCP actually calls this
- variable 'tcp_iss' as it is the initial send sequence number. The other
- sequence number variable, 'tcp_irs', is the initial receive sequence number
- and is learned during the 3-way connection establishment. We are not going
- to worry about the distinction.) This practice is wrong, and is
- acknowledged as such in a comment in the tcp_init() function where it
- appears. The ISN is incremented by 128,000 every second, which causes the
- 32-bit ISN counter to wrap every 9.32 hours if no connections occur.
- However, each time a connect() is issued, the counter is incremented by
- 64,000.
-
- One important reason behind this predictability is to minimize the chance
- that data from an older stale incarnation (that is, from the same 4-tuple
- of the local and remote IP addresses and TCP ports) of the current
- connection could arrive and foul things up. The concept of the 2MSL wait
- time applies here, but is beyond the scope of this paper. If sequence
- numbers were chosen at random when a connection arrived, no guarantees
- could be made that the sequence numbers would be different from a previous
- incarnation. If some data that was stuck in a routing loop somewhere
- finally freed itself and wandered into the new incarnation of its old
- connection, it could really foul things up.
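-
- The figures quoted above are easy to check. This Python sketch models the
- counter with the two increments given in the text. isn_after() is an
- illustrative helper, and a real kernel's timer granularity is glossed
- over.
-
```python
SEQ_MODULUS = 2 ** 32
PER_SECOND = 128_000   # increment per second, as stated above
PER_CONNECT = 64_000   # increment per connect(), as stated above


def isn_after(start, seconds, connects):
    """Counter value after some elapsed seconds and a number of connects."""
    return (start + PER_SECOND * seconds + PER_CONNECT * connects) % SEQ_MODULUS


# Time for the counter to wrap if no connections occur.
wrap_hours = SEQ_MODULUS / PER_SECOND / 3600
print(round(wrap_hours, 2))

# Counter started at 1, ten seconds and two connections later.
print(isn_after(1, 10, 2))
```
-
- The wrap time comes out to 2^32 / 128,000 seconds, or about 9.32 hours,
- which agrees with the figure in the text.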
-
- --[ Ports ]--
-
- To grant simultaneous access to the TCP module, TCP provides a user
- interface called a port. Ports are used by the kernel to identify network
- processes. These are strictly transport layer entities (that is to say that
- IP could not care less about them). Together with an IP address, a TCP port
- provides an endpoint for network communications. In fact, at any
- given moment all Internet connections can be described by 4 numbers: the
- source IP address and source port and the destination IP address and
- destination port. Servers are bound to 'well-known' ports so that they may
- be located on a standard port on different systems. For example, the rlogin
- daemon sits on TCP port 513.
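-
- As a small illustration of the 4-number description, a connection table
- can be keyed on exactly that tuple. This is a Python sketch with
- placeholder addresses and made-up state names, not how any particular
- kernel stores its connections.
-
```python
from collections import namedtuple

Connection = namedtuple(
    "Connection", "src_addr src_port dst_addr dst_port"
)

RLOGIN_PORT = 513  # the rlogin daemon's port, as stated above

# Two connections from the same client to the same server port stay
# distinct because their source ports differ.
first = Connection("192.0.2.10", 1023, "198.51.100.7", RLOGIN_PORT)
second = Connection("192.0.2.10", 1022, "198.51.100.7", RLOGIN_PORT)

table = {first: "ESTABLISHED", second: "SYN_RCVD"}
print(len(table), table[first])
```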
-
- SECTION II. THE ATTACK
-
- ...The devil finds work for idle hands....
-
- --[ Briefly... ]--
-
- IP-spoofing consists of several steps, which I will briefly outline here,
- then explain in detail. First, the target host is chosen. Next, a pattern
- of trust is discovered, along with a trusted host. The trusted host is then
- disabled, and the target's TCP sequence numbers are sampled. The trusted
- host is impersonated, the sequence numbers guessed, and a connection
- attempt is made to a service that only requires address-based
- authentication. If successful, the attacker executes a simple command to
- leave a backdoor.
-
- --[ Needful Things ]--
-
- There are a couple of things one needs to wage this attack:
-
- * brain, mind, or other thinking device
- * target host
- * trusted host
- * attacking host (with root access)
- * IP-spoofing software
-
- Generally the attack is made from the root account on the attacking host
- against the root account on the target. If the attacker is going to all
- this trouble, it would be stupid not to go for root. (Since root access is
- needed to wage the attack, this should not be an issue.)
-
- --[ IP-Spoofing is a 'Blind Attack' ]--
-
- One often overlooked, but critical factor in IP-spoofing is the fact that
- the attack is blind. The attacker is going to be taking over the identity
- of a trusted host in order to subvert the security of the target host. The
- trusted host is disabled using the method described below. As far as the
- target knows, it is carrying on a conversation with a trusted pal. In
- reality, the attacker is sitting off in some dark corner of the Internet,
- forging packets purportedly from this trusted host while it is locked up in
- a denial of service battle. The IP datagrams sent with the forged
- IP-address reach the target fine (recall that IP is a connectionless
- protocol-- each datagram is sent without regard for the other end) but the
- datagrams the target sends back (destined for the trusted host) end up in
- the bit-bucket. The attacker never sees them. The intervening routers know
- where the datagrams are supposed to go. They are supposed to go to the
- trusted host. As far as the network layer is concerned, this is where they
- originally came from, and this is where responses should go. Of course once
- the datagrams are routed there, and the information is demultiplexed up the
- protocol stack, and reaches TCP, it is discarded (the trusted host's TCP
- cannot respond-- see below). So the attacker has to be smart and know what
- was sent, and know what response the server is looking for. The attacker
- cannot see what the target host sends, but she can predict what it will
- send; that prediction, coupled with the knowledge of what she herself has
- sent, allows the attacker to work around this blindness.
-
- --[ Patterns of Trust ]--
-
- After a target is chosen, the attacker must determine the patterns of trust
- (for the sake of argument, we are going to assume the target host does in
- fact trust somebody. If it didn't, the attack would end here). Figuring out
- who a host trusts may or may not be easy. A 'showmount -e' may show where
- filesystems are exported, and rpcinfo can give out valuable information as
- well. If enough background information is known about the host, it should
- not be too difficult. If all else fails, trying neighboring IP addresses in
- a brute force effort may be a viable option.
-
- --[ Trusted Host Disabling Using the Flood of Sins ]--
-
- Once the trusted host is found, it must be disabled. Since the attacker is
- going to impersonate it, she must make sure this host cannot receive any
- network traffic and foul things up. There are many ways of doing this; the
- one I am going to discuss is TCP SYN flooding.
-
- A TCP connection is initiated with a client issuing a request to a server
- with the SYN flag on in the TCP header. Normally the server will issue a
- SYN/ACK back to the client identified by the 32-bit source address in the
- IP header. The client will then send an ACK to the server (as we saw in
- figure 1 above) and data transfer can commence. There is an upper limit of
- how many concurrent SYN requests TCP can process for a given socket,
- however. This limit is called the backlog, and it is the length of the
- queue where incoming (as yet incomplete) connections are kept. This queue
- limit applies to both the number of incomplete connections (the 3-way
- handshake is not complete) and the number of completed connections that
- have not been pulled from the queue by the application by way of the
- accept() system call. If this backlog limit is reached, TCP will silently
- discard all incoming SYN requests until the pending connections can be
- dealt with. Therein lies the attack.
-
- The attacking host sends several SYN requests to the TCP port she desires
- disabled. The attacking host also must make sure that the source IP-address
- is spoofed to be that of another, currently unreachable host (the target
- TCP will be sending its response to this address). (IP may inform TCP that
- the host is unreachable, but TCP considers these errors to be transient and
- leaves the resolution of them up to IP (reroute the packets, etc.),
- effectively ignoring them.) The IP-address must be unreachable because the
- attacker does not want any host to receive the SYN/ACKs that will be coming
- from the target TCP (this would result in a RST being sent to the target
- TCP, which would foil our attack). The process is as follows:
-
- fig(2)
-
- 1 Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- ...
-
- 2 X <---SYN/ACK--- B
-
- X <---SYN/ACK--- B
-
- ...
-
- 3 X <---RST--- B
-
- At (1) the attacking host sends a multitude of SYN requests to the target
- (remember the target in this phase of the attack is the trusted host) to
- fill its backlog queue with pending connections. (2) The target responds
- with SYN/ACKs to what it believes is the source of the incoming SYNs.
- During this time all further requests to this TCP port will be ignored.
- Different TCP implementations have different backlog sizes. BSD generally
- has a backlog of 5 (Linux has a backlog of 6). There is also a 'grace'
- margin of 3/2. That is, TCP will allow up to backlog*3/2+1 connections.
- This will allow a socket one connection even if it calls listen with a
- backlog of 0.
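-
- The grace margin is quick to work out. The Python sketch below assumes the
- expression is evaluated with integer (truncating) division, as it would be
- in C; max_queued() is an illustrative name.
-
```python
def max_queued(backlog):
    """Connections allowed for a listen() backlog, with the 3/2 margin."""
    return backlog * 3 // 2 + 1


for name, backlog in (("BSD", 5), ("Linux", 6), ("zero backlog", 0)):
    print(name, max_queued(backlog))
```
-
- Under that assumption a backlog of 5 allows 8 queued connections, a
- backlog of 6 allows 10, and a backlog of 0 still allows 1.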
-
- AuthNote: [For a much more in-depth treatment of TCP SYN flooding, see my
- definitive paper on the subject. It covers the whole process in detail, in
- both theory, and practice. There is robust working code, a statistical
- analysis, and a lengthy paper. Look for it in issue 49 of Phrack. -daemon9
- 6/96]
-
- --[ Sequence Number Sampling and Prediction ]--
-
- Now the attacker needs to get an idea of where in the 32-bit sequence
- number space the target's TCP is. The attacker connects to a TCP port on
- the target (SMTP is a good choice) just prior to launching the attack and
- completes the three-way handshake. The process is exactly the same as
- fig(1), except that the attacker will save the value of the ISN sent by the
- target host. Oftentimes, this process is repeated several times and the
- final ISN sent is stored. The attacker needs to get an idea of what the RTT
- (round-trip time) from the target to her host is like. (The process can be
- repeated several times, and an average of the RTTs is calculated.) The RTT
- is necessary in being able to accurately predict the next ISN. The attacker
- has the baseline (the last ISN sent) and knows how the sequence numbers are
- incremented (128,000/second and 64,000 per connect) and now has a good idea
- of how long it will take an IP datagram to travel across the Internet to
- reach the target (approximately half the RTT, as most times the routes are
- symmetrical). After the attacker has this information, she immediately
- proceeds to the next phase of the attack (if another TCP connection were to
- arrive on any port of the target before the attacker was able to continue
- the attack, the actual ISN would be 64,000 higher than the attacker's
- prediction).
-
- When the spoofed segment makes its way to the target, several different
- things may happen depending on the accuracy of the attacker's prediction:
-
- * If the sequence number is EXACTly where the receiving TCP expects it
- to be, the incoming data will be placed on the next available position
- in the receive buffer.
- * If the sequence number is LESS than the expected value the data byte
- is considered a retransmission, and is discarded.
- * If the sequence number is GREATER than the expected value but still
- within the bounds of the receive window, the data byte is considered
- to be a future byte, and is held by TCP, pending the arrival of the
- other missing bytes.
- * If a segment arrives with a sequence number GREATER than the expected
- value and NOT within the bounds of the receive window the segment is
- dropped, and TCP will send a segment back with the expected sequence
- number.
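-
- The cases above amount to a comparison in 32-bit modular arithmetic. Here
- is a simplified Python sketch that looks only at a segment's starting
- sequence number. classify() is a hypothetical helper, and treating
- anything half the number space or more "ahead" as old data is this
- sketch's own simplification of how "less than" works once the counter can
- wrap.
-
```python
SEQ_MODULUS = 2 ** 32


def classify(seq, rcv_nxt, window):
    """Place an incoming sequence number relative to the expected one."""
    offset = (seq - rcv_nxt) % SEQ_MODULUS
    if offset == 0:
        return "exact: placed next in the receive buffer"
    if offset < window:
        return "future: held until the missing bytes arrive"
    if offset >= SEQ_MODULUS // 2:
        return "old: treated as a retransmission and discarded"
    return "beyond window: dropped, expected number sent back"


for seq in (5000, 4999, 6000, 9096):
    print(seq, classify(seq, rcv_nxt=5000, window=4096))
```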
-
- --[ Subversion... ]--
-
- Here is where the main thrust of the attack begins:
-
- fig(3)
-
- 1 Z(b) ---SYN---> A
-
- 2 B <---SYN/ACK--- A
-
- 3 Z(b) ---ACK---> A
-
- 4 Z(b) ---PSH---> A
-
- [...]
-
- The attacking host spoofs her IP address to be that of the trusted host
- (which should still be in the death-throes of the D.O.S. attack) and sends
- her connection request to port 513 on the target (1). At (2), the target
- responds to the spoofed connection request with a SYN/ACK, which will make
- its way to the trusted host (which, if it could process the incoming TCP
- segment, would consider it an error and immediately send a RST to the
- target). If everything goes according to plan, the SYN/ACK will be dropped
- by the gagged trusted host. After (1), the attacker must back off for a bit
- to give the target ample time to send the SYN/ACK (the attacker cannot see
- this segment). Then, at (3) the attacker sends an ACK to the target with
- the predicted sequence number (plus one, because we're ACKing it). If the
- attacker is correct in her prediction, the target will accept the ACK. The
- target is compromised and data transfer can commence (4).
-
- Generally, after compromise, the attacker will insert a backdoor into the
- system that will allow a simpler way of intrusion. (Often an `echo "+ +" >>
- ~/.rhosts` is done. This is a good idea for several reasons: it is quick,
- allows for simple re-entry, and is not interactive. Remember the attacker
- cannot see any traffic coming from the target, so any responses are sent off
- into oblivion.)
-
- --[ Why it Works ]--
-
- IP-Spoofing works because trusted services only rely on network address
- based authentication. Since IP is easily duped, address forgery is not
- difficult. The hardest part of the attack is in the sequence number
- prediction, because that is where the guesswork comes into play. Reduce
- unknowns and guesswork to a minimum, and the attack has a better chance of
- succeeding. Even a machine that wraps all its incoming TCP bound
- connections with Wietse Venema's TCP wrappers is still vulnerable to the
- attack. TCP wrappers rely on a hostname or an IP address for
- authentication...
-
- SECTION III. PREVENTIVE MEASURES
-
- ...A stitch in time saves nine...
-
- --[ Be Un-trusting and Un-trustworthy ]--
-
- One easy solution to prevent this attack is not to rely on address-based
- authentication. Disable all the r* commands, remove all .rhosts files and
- empty out the /etc/hosts.equiv file. This will force all users to use other
- means of remote access (telnet, ssh, skey, etc.).
-
- --[ Packet Filtering ]--
-
- If your site has a direct connection to the Internet, you can use your
- router to help you out. First make sure only hosts on your internal LAN can
- participate in trust-relationships (no internal host should trust a host
- outside the LAN). Then simply filter out all traffic from the outside (the
- Internet) that purports to come from the inside (the LAN).
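-
- The filtering rule itself is a one-line test. A minimal Python sketch
- follows, assuming a hypothetical internal prefix of 192.0.2.0/24 and a
- router that knows which interface a packet arrived on. should_drop() is an
- illustrative name, not a real router API.
-
```python
import ipaddress

# Hypothetical internal LAN prefix, chosen for illustration only.
INTERNAL_LAN = ipaddress.ip_network("192.0.2.0/24")


def should_drop(source, arrived_on_external):
    """Drop packets from outside that claim an inside source address."""
    inside = ipaddress.ip_address(source) in INTERNAL_LAN
    return arrived_on_external and inside


print(should_drop("192.0.2.10", arrived_on_external=True))    # forged
print(should_drop("203.0.113.5", arrived_on_external=True))   # ordinary
print(should_drop("192.0.2.10", arrived_on_external=False))   # real LAN host
```
-
- This only helps when the trusted host is inside the LAN, which is why the
- first step (no trust extended to outside hosts) matters.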
-
- --[ Cryptographic Methods ]--
-
- An obvious method to deter IP-spoofing is to require all network traffic to
- be encrypted and/or authenticated. While several solutions exist, it will
- be a while before such measures are deployed as de facto standards.
-
- --[ Initial Sequence Number Randomizing ]--
-
- Since the sequence numbers are not chosen randomly (or incremented
- randomly) this attack works. Bellovin describes a fix for TCP that involves
- partitioning the sequence number space. Each connection would have its own
- separate sequence number space. The sequence numbers would still be
- incremented as before, however, there would be no obvious or implied
- relationship between the numbering in these spaces. Suggested is the
- following formula:
-
- ISN=M+F(localhost,localport,remotehost,remoteport)
-
- Where M is the 4 microsecond timer and F is a cryptographic hash. F must
- not be computable from the outside or the attacker could still guess
- sequence numbers. Bellovin suggests F be a hash of the connection-id and a
- secret vector (a random number, or a host related secret combined with the
- machine's boot time).
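-
- The formula can be sketched in Python as follows. The choice of
- HMAC-SHA256 for F, the string encoding of the connection-id, and the
- 16-byte random secret are all assumptions made for this example rather
- than anything the formula prescribes. timer_4us(), conn_hash() and isn()
- are illustrative names.
-
```python
import hashlib
import hmac
import os
import time

SEQ_MODULUS = 2 ** 32
SECRET = os.urandom(16)  # per-host secret, regenerated on each run here


def timer_4us(now=None):
    """M: a counter that ticks once every 4 microseconds."""
    now = time.time() if now is None else now
    return int(now * 250_000) % SEQ_MODULUS


def conn_hash(localhost, localport, remotehost, remoteport, secret=SECRET):
    """F: keyed hash of the connection-id, truncated to 32 bits."""
    conn_id = f"{localhost}:{localport}:{remotehost}:{remoteport}".encode()
    digest = hmac.new(secret, conn_id, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def isn(localhost, localport, remotehost, remoteport, now=None):
    """ISN = M + F(localhost, localport, remotehost, remoteport)."""
    offset = conn_hash(localhost, localport, remotehost, remoteport)
    return (timer_4us(now) + offset) % SEQ_MODULUS


print(isn("198.51.100.7", 513, "192.0.2.10", 1023))
```
-
- Within one connection-id the ISN still advances with the timer, so the
- stale-incarnation protection is kept, but an ISN sampled on one
- connection says nothing useful about the offset used for another.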
-
- SECTION IV. SOURCES
-
- -Books: TCP/IP Illustrated vols. I, II & III
-
- -RFCs: 793, 1825, 1948
-
- -People: W. Richard Stevens, and the users of the Information Nexus for
- proofreading
-
- -Sourcecode: rbone, mendax, SYNflood
-
- This paper made possible by a grant from the Guild Corporation.
-