-
- [ IP-spoofing Demystified ]
- (Trust-Relationship Exploitation)
-
- by daemon9 / route / infinity
- for Phrack Magazine
- June 1996 Guild Productions, kid
-
- comments to route@infonexus.com
-
- The purpose of this paper is to explain IP-spoofing to the
- masses. It assumes little more than a working knowledge of Unix and
- TCP/IP. Oh, and that you're not a moron...
- IP-spoofing is a complex technical attack that is made up of
- several components. (In actuality, IP-spoofing is not the attack, but
- a step in the attack. The attack is actually trust-relationship
- exploitation. However, in this paper, IP-spoofing will refer to the
- whole attack.) In this paper, I will explain the attack in detail,
- including the relevant operating system and networking information.
-
- [SECTION I. BACKGROUND INFORMATION]
-
- --[ The Players ]--
-
- A: Target host
- B: Trusted host
- X: Unreachable host
- Z: Attacking host
- 1(2): Host 1 masquerading as host 2
-
- --[ The Figures ]--
-
- There are several figures in the paper and they are to be
- interpreted as per the following example:
-
- tick host a control host b
- 1 A ---SYN---> B
-
- tick: A tick of time. There is no distinction made as to *how*
- much time passes between ticks, just that time passes. It's generally
- not a great deal.
- host a: A machine participating in a TCP-based conversation.
- control: This field shows any relevant control bits set in the TCP
- header and the direction the data is flowing.
- host b: A machine participating in a TCP-based conversation.
-
- In this case, at the first referenced point in time host a is sending
- a TCP segment to host b with the SYN bit on. Unless stated, we are
- generally not concerned with the data portion of the TCP segment.
-
- --[ Trust Relationships ]--
-
- In the Unix world, trust can be given all too easily. Say you
- have an account on machine A, and on machine B. To facilitate going
- betwixt the two with a minimum amount of hassle, you want to set up a
- full-duplex trust relationship between them. In your home directory
- at A you create a .rhosts file: `echo "B username" > ~/.rhosts` In
- your home directory at B you create a .rhosts file: `echo "A username"
- > ~/.rhosts` (Alternately, root can set up similar rules in
- /etc/hosts.equiv, the difference being that the rules are hostwide,
- rather than just on an individual basis.) Now, you can use any of the
- r* commands without that annoying hassle of password authentication.
- These commands use address-based authentication, which grants or
- denies access based on the IP address of the service requestor.
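-
- As a rough illustration of what address-based authentication boils
- down to, here is a toy model in Python. The helpers parse_rhosts and
- is_trusted are hypothetical names made up for this sketch; real r*
- daemons also resolve the source IP address to a hostname, which is
- skipped here by treating the host as a plain string.
-
```python
# Toy model of .rhosts-style address-based authentication.
# parse_rhosts and is_trusted are hypothetical helpers, not rlogind code.

def parse_rhosts(text):
    """Return a set of (host, user) pairs from .rhosts-style lines."""
    rules = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            rules.add((fields[0], fields[1]))
    return rules

def is_trusted(rules, claimed_host, user):
    """Grant access purely on the claimed source host and user name."""
    return (claimed_host, user) in rules

rules = parse_rhosts("B username\n")
print(is_trusted(rules, "B", "username"))  # the trusted peer
print(is_trusted(rules, "Z", "username"))  # any other host
```
-
- Note that nothing in the decision depends on a secret the client
- holds; the only input is who the client claims to be.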
-
- --[ Rlogin ]--
-
- Rlogin is a simple client-server based protocol that uses TCP
- as its transport. Rlogin allows a user to log in remotely from one
- host to another, and, if the target machine trusts the other, rlogin
- will allow the convenience of not prompting for a password. It will
- instead authenticate the client via the source IP address. So,
- from our example above, we can use rlogin to remotely login to A from
- B (or vice-versa) and not be prompted for a password.
-
- --[ Internet Protocol ]--
-
- IP is the connectionless, unreliable network protocol in the
- TCP/IP suite. It has two 32-bit header fields to hold address
- information. IP is also the busiest of all the TCP/IP protocols as
- almost all TCP/IP traffic is encapsulated in IP datagrams. IP's job
- is to route packets around the network. It provides no mechanism for
- reliability or accountability; for that, it relies on the upper
- layers. IP simply sends out datagrams and hopes they make it intact.
- If they don't, IP can try to send an ICMP error message back to the
- source; however, this packet can get lost as well. (ICMP is Internet
- Control Message Protocol and it is used to relay network conditions
- and different errors to IP and the other layers.) IP has no means to
- guarantee delivery. Since IP is connectionless, it does not maintain
- any connection state information. Each IP datagram is sent out without
- regard to the last one or the next one. This, along with the fact that
- it is trivial to modify the IP stack to allow an arbitrarily chosen IP
- address in the source (and destination) fields, makes IP easily
- subverted.
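-
- To make the "two 32-bit header fields" concrete, the following
- Python sketch unpacks the fixed 20-byte IPv4 header of a hand-built
- sample. The function name parse_ipv4_header and the sample addresses
- (taken from the documentation ranges) are assumptions of this
- example, not something from the paper.
-
```python
import socket
import struct

def parse_ipv4_header(header):
    """Unpack the fixed 20-byte IPv4 header and return selected fields."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", header[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # IHL counts 32-bit words
        "protocol": proto,                    # 6 means TCP
        "src": socket.inet_ntoa(src),         # 32-bit source address
        "dst": socket.inet_ntoa(dst),         # 32-bit destination address
    }

# Hand-built sample header; the checksum is left at zero for brevity.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.7"))
print(parse_ipv4_header(sample))
```
-
- The source address is simply whatever value the sender wrote into
- the header; the receiver has nothing at the IP layer to check it
- against.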
-
- --[ Transmission Control Protocol ]--
-
- TCP is the connection-oriented, reliable transport protocol
- in the TCP/IP suite. Connection-oriented simply means that the two
- hosts participating in a discussion must first establish a connection
- before data may change hands. Reliability is provided in a number of
- ways but the only two we are concerned with are data sequencing and
- acknowledgement. TCP assigns sequence numbers to every segment and
- acknowledges any and all data segments received from the other end.
- (Pure ACKs do not consume a sequence number, and are not themselves
- ACK'd; only data bytes and the SYN and FIN flags take up sequence
- space.) This reliability makes TCP harder to fool than IP.
-
- --[ Sequence Numbers, Acknowledgements and other flags ]--
-
- Since TCP is reliable, it must be able to recover from
- lost, duplicated, or out-of-order data. By assigning a sequence
- number to every byte transferred, and requiring an acknowledgement from
- the other end upon receipt, TCP can guarantee reliable delivery. The
- receiving end uses the sequence numbers to ensure proper ordering of
- the data and to eliminate duplicate data bytes.
- TCP sequence numbers can simply be thought of as 32-bit
- counters. They range from 0 to 4,294,967,295. Every byte of
- data exchanged across a TCP connection (along with certain flags)
- is sequenced. The sequence number field in the TCP header will
- contain the sequence number of the *first* byte of data in the
- TCP segment. The acknowledgement number field in the TCP header
- holds the value of next *expected* sequence number, and also
- acknowledges *all* data up through this ACK number minus one.
- TCP uses the concept of window advertisement for flow
- control. It uses a sliding window to tell the other end how much
- data it can buffer. Since the window size field is 16 bits, a receiving
- TCP can advertise up to a maximum of 65535 bytes. Window advertisement
- can be thought of as an advertisement from one TCP to the other of how
- high acceptable sequence numbers can be.
- Other TCP header flags of note are RST (reset), PSH (push)
- and FIN (finish). If a RST is received, the connection is
- immediately torn down. RSTs are normally sent when one end
- receives a segment that just doesn't jibe with the current connection
- (we will encounter an example below). The PSH flag tells the
- receiver to pass all the data it has queued to the application, as
- soon as possible. The FIN flag is the way an application begins a
- graceful close of a connection (connection termination is a 4-way
- process). When one end receives a FIN, it ACKs it, and does not
- expect to receive any more data (sending is still possible, however).
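-
- The bookkeeping described in this section can be sketched in a few
- lines of Python. The function name next_expected is an assumption
- of this example; the "certain flags" that occupy sequence space are
- taken to be SYN and FIN, one sequence number each.
-
```python
MOD = 2 ** 32  # sequence numbers are 32-bit counters

def next_expected(seq, data_len, syn=False, fin=False):
    """Sequence number the receiver expects next, i.e. the ACK it returns."""
    return (seq + data_len + int(syn) + int(fin)) % MOD

# 512 data bytes whose first byte is numbered 1000
print(next_expected(1000, 512))
# a SYN sent at the very top of the sequence space wraps around
print(next_expected(4294967295, 0, syn=True))
```
-
- The modulo keeps the arithmetic correct when the counter wraps past
- 4,294,967,295.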
-
- --[ TCP Connection Establishment ]--
-
- In order to exchange data using TCP, hosts must establish a
- connection. TCP establishes a connection in a 3 step process called
- the 3-way handshake. If machine A is running an rlogin client and
- wishes to connect to an rlogin daemon on machine B, the process is as
- follows:
-
- fig(1)
-
- 1 A ---SYN---> B
-
- 2 A <---SYN/ACK--- B
-
- 3 A ---ACK---> B
-
- At (1) the client is telling the server that it wants a connection.
- This is the SYN flag's only purpose. The client is telling the
- server that the sequence number field is valid, and should be checked.
- The client will set the sequence number field in the TCP header to
- its ISN (initial sequence number). The server, upon receiving this
- segment (2) will respond with its own ISN (therefore the SYN flag is
- on) and an ACKnowledgement of the client's first segment (which is the
- client's ISN+1). The client then ACKs the server's ISN (3). Now,
- data transfer may take place.
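-
- The numbers exchanged in fig(1) can be traced with a small Python
- sketch. The function three_way_handshake and the ISN values 1000
- and 5000 are made up for illustration.
-
```python
MOD = 2 ** 32

def three_way_handshake(client_isn, server_isn):
    """Return the (flags, seq, ack) triples of the three handshake segments."""
    syn = ("SYN", client_isn, None)
    syn_ack = ("SYN/ACK", server_isn, (client_isn + 1) % MOD)
    ack = ("ACK", (client_isn + 1) % MOD, (server_isn + 1) % MOD)
    return [syn, syn_ack, ack]

for tick, segment in enumerate(three_way_handshake(1000, 5000), start=1):
    print(tick, segment)
```
-
- The third segment is only valid if its acknowledgement number is
- the server's ISN plus one, which is why the server's ISN matters so
- much later on.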
-
- --[ The ISN and Sequence Number Incrementation ]--
-
- It is important to understand how sequence numbers are
- initially chosen, and how they change with respect to time. The
- initial sequence number when a host is bootstrapped is initialized
- to 1. (TCP actually calls this variable 'tcp_iss' as it is the initial
- *send* sequence number. The other sequence number variable,
- 'tcp_irs' is the initial *receive* sequence number and is learned
- during the 3-way connection establishment. We are not going to worry
- about the distinction.) This practice is wrong, and is acknowledged
- as such in a comment in the tcp_init() function where it appears. The
- ISN is incremented by 128,000 every second, which causes the 32-bit
- ISN counter to wrap every 9.32 hours if no connections occur. However,
- each time a connect() is issued, the counter is incremented by
- 64,000.
- One important reason behind this predictability is to
- minimize the chance that data from an older stale incarnation
- (that is, from the same 4-tuple of the local and remote
- IP addresses and TCP ports) of the current connection could arrive
- and foul things up. The concept of the 2MSL wait time applies
- here, but is beyond the scope of this paper. If sequence
- numbers were chosen at random when a connection arrived, no
- guarantees could be made that the sequence numbers would be different
- from a previous incarnation. If some data that was stuck in a
- routing loop somewhere finally freed itself and wandered into the new
- incarnation of its old connection, it could really foul things up.
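-
- The arithmetic behind the 9.32 hour figure, and the counter scheme
- as described above, can be checked with a short Python sketch. The
- function names are assumptions of this example; the constants are
- the ones quoted in the text.
-
```python
MOD = 2 ** 32
PER_SECOND = 128_000   # increment per second, as quoted in the text
PER_CONNECT = 64_000   # increment per connect(), as quoted in the text

def wrap_time_hours():
    """Hours for the counter to wrap if no connections occur."""
    return MOD / PER_SECOND / 3600

def isn_counter(seconds_since_boot, connects, start=1):
    """Counter value after the given elapsed seconds and connect() calls."""
    return (start + PER_SECOND * seconds_since_boot
            + PER_CONNECT * connects) % MOD

print(round(wrap_time_hours(), 2))
print(isn_counter(10, 2))
```
-
- A counter that is a simple function of elapsed time and connection
- count is exactly what makes the value guessable from the outside.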
-
- --[ Ports ]--
-
- To grant simultaneous access to the TCP module, TCP provides
- a user interface called a port. Ports are used by the kernel to
- identify network processes. These are strictly transport layer
- entities (that is to say that IP couldn't care less about them).
- Together with an IP address, a TCP port provides an endpoint
- for network communications. In fact, at any given moment *all*
- Internet connections can be described by 4 numbers: the source IP
- address and source port and the destination IP address and destination
- port. Servers are bound to 'well-known' ports so that they may be
- located on a standard port on different systems. For example, the
- rlogin daemon sits on TCP port 513.
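-
- The "4 numbers" idea maps naturally onto a lookup key. Below is a
- minimal Python sketch; the Connection tuple, the demultiplex helper,
- and the addresses and source ports are all invented for the example.
-
```python
from collections import namedtuple

Connection = namedtuple("Connection", "src_ip src_port dst_ip dst_port")

# Two sessions from the same host to the same rlogin port are still
# distinct connections because their source ports differ.
table = {
    Connection("192.0.2.10", 1022, "192.0.2.20", 513): "session 1",
    Connection("192.0.2.10", 1023, "192.0.2.20", 513): "session 2",
}

def demultiplex(src_ip, src_port, dst_ip, dst_port):
    """Look up which connection an incoming segment belongs to."""
    return table.get(Connection(src_ip, src_port, dst_ip, dst_port))

print(demultiplex("192.0.2.10", 1023, "192.0.2.20", 513))
```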
-
- [SECTION II. THE ATTACK]
-
- ...The devil finds work for idle hands....
-
- --[ Briefly... ]--
-
- IP-spoofing consists of several steps, which I will
- briefly outline here, then explain in detail. First, the target host
- is chosen. Next, a pattern of trust is discovered, along with a
- trusted host. The trusted host is then disabled, and the target's TCP
- sequence numbers are sampled. The trusted host is impersonated, the
- sequence numbers guessed, and a connection attempt is made to a
- service that only requires address-based authentication. If
- successful, the attacker executes a simple command to leave a
- backdoor.
-
- --[ Needful Things ]--
-
- There are a couple of things one needs to wage this attack:
-
- (1) brain, mind, or other thinking device
- (1) target host
- (1) trusted host
- (1) attacking host (with root access)
- (1) IP-spoofing software
-
- Generally the attack is made from the root account on the attacking
- host against the root account on the target. If the attacker is
- going to all this trouble, it would be stupid not to go for root.
- (Since root access is needed to wage the attack, this should not
- be an issue.)
-
- --[ IP-Spoofing is a 'Blind Attack' ]--
-
- One often overlooked, but critical factor in IP-spoofing
- is the fact that the attack is blind. The attacker is going to be
- taking over the identity of a trusted host in order to subvert the
- security of the target host. The trusted host is disabled using the
- method described below. As far as the target knows, it is carrying on
- a conversation with a trusted pal. In reality, the attacker is
- sitting off in some dark corner of the Internet, forging packets
- purportedly from this trusted host while it is locked up in a denial
- of service battle. The IP datagrams sent with the forged IP-address
- reach the target fine (recall that IP is a connectionless
- protocol-- each datagram is sent without regard for the other end)
- but the datagrams the target sends back (destined for the trusted
- host) end up in the bit-bucket. The attacker never sees them. The
- intervening routers know where the datagrams are supposed to go. They
- are supposed to go to the trusted host. As far as the network layer is
- concerned, this is where they originally came from, and this is where
- responses should go. Of course once the datagrams are routed there,
- and the information is demultiplexed up the protocol stack, and
- reaches TCP, it is discarded (the trusted host's TCP cannot respond--
- see below). So the attacker has to be smart and *know* what was sent,
- and *know* what response the server is looking for. The attacker
- cannot see what the target host sends, but she can *predict* what it
- will send; that prediction, coupled with the knowledge of what the
- server expects in reply, allows the attacker to work around this
- blindness.
-
- --[ Patterns of Trust ]--
-
- After a target is chosen the attacker must determine the
- patterns of trust (for the sake of argument, we are going to assume
- the target host *does* in fact trust somebody. If it didn't, the
- attack would end here). Figuring out who a host trusts may or may
- not be easy. A 'showmount -e' may show where filesystems are
- exported, and rpcinfo can give out valuable information as well.
- If enough background information is known about the host, it should
- not be too difficult. If all else fails, trying neighboring IP
- addresses in a brute force effort may be a viable option.
-
- --[ Trusted Host Disabling Using the Flood of Sins ]--
-
- Once the trusted host is found, it must be disabled. Since
- the attacker is going to impersonate it, she must make sure this host
- cannot receive any network traffic and foul things up. There are
- many ways of doing this; the one I am going to discuss is TCP SYN
- flooding.
- A TCP connection is initiated with a client issuing a
- request to a server with the SYN flag on in the TCP header. Normally
- the server will issue a SYN/ACK back to the client identified by the
- 32-bit source address in the IP header. The client will then send an
- ACK to the server (as we saw in figure 1 above) and data transfer
- can commence. There is an upper limit of how many concurrent SYN
- requests TCP can process for a given socket, however. This limit
- is called the backlog, and it is the length of the queue where
- incoming (as yet incomplete) connections are kept. This queue limit
- applies to both the number of incomplete connections (the 3-way
- handshake is not complete) and the number of completed connections
- that have not been pulled from the queue by the application by way of
- the accept() system call. If this backlog limit is reached, TCP will
- silently discard all incoming SYN requests until the pending
- connections can be dealt with. Therein lies the attack.
- The attacking host sends several SYN requests to the TCP port
- she desires disabled. The attacking host also must make sure that
- the source IP-address is spoofed to be that of another, currently
- unreachable host (the target TCP will be sending its response to
- this address). (IP may inform TCP that the host is unreachable,
- but TCP considers these errors to be transient and leaves the
- resolution of them up to IP (reroute the packets, etc), effectively
- ignoring them.) The IP-address must be unreachable because the
- attacker does not want any host to receive the SYN/ACKs that will be
- coming from the target TCP (this would result in a RST being sent to
- the target TCP, which would foil our attack). The process is as
- follows:
-
- fig(2)
-
- 1 Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- Z(x) ---SYN---> B
-
- ...
-
- 2 X <---SYN/ACK--- B
-
- X <---SYN/ACK--- B
-
- ...
-
- 3 X <---RST--- B
-
- At (1) the attacking host sends a multitude of SYN requests to the
- target (remember the target in this phase of the attack is the
- trusted host) to fill its backlog queue with pending connections.
- (2) The target responds with SYN/ACKs to what it believes is the
- source of the incoming SYNs. During this time all further requests
- to this TCP port will be ignored.
- Different TCP implementations have different backlog sizes.
- BSD generally has a backlog of 5 (Linux has a backlog of 6). There
- is also a 'grace' margin of 3/2. That is, TCP will allow up to
- backlog*3/2+1 connections. This will allow a socket one connection
- even if it calls listen with a backlog of 0.
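-
- The grace margin is easy to tabulate. This Python sketch assumes
- the expression is evaluated with integer division, as it would be
- in C; the function name max_queued is made up for the example.
-
```python
def max_queued(backlog):
    """Connections allowed under the 3/2 grace margin (integer division)."""
    return backlog * 3 // 2 + 1

for backlog in (0, 5, 6):
    print(backlog, max_queued(backlog))
```
-
- Under that assumption a backlog of 5 admits 8 connections, and a
- backlog of 0 still admits 1.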
-
- AuthNote: [For a much more in-depth treatment of TCP SYN
- flooding, see my definitive paper on the subject. It covers the
- whole process in detail, in both theory and practice. There is
- robust working code, a statistical analysis, and a lengthy paper.
- Look for it in issue 49 of Phrack. -daemon9 6/96]
-
- --[ Sequence Number Sampling and Prediction ]--
-
- Now the attacker needs to get an idea of where in the 32-bit
- sequence number space the target's TCP is. The attacker connects to
- a TCP port on the target (SMTP is a good choice) just prior to launching
- the attack and completes the three-way handshake. The process is
- exactly the same as fig(1), except that the attacker will save the
- value of the ISN sent by the target host. Often times, this process is
- repeated several times and the final ISN sent is stored. The attacker
- needs to get an idea of what the RTT (round-trip time) from the target
- to her host is like. (The process can be repeated several times, and an
- average of the RTT's is calculated.) The RTT is necessary in being
- able to accurately predict the next ISN. The attacker has the baseline
- (the last ISN sent) and knows how the sequence numbers are incremented
- (128,000/second and 64,000 per connect) and now has a good idea of
- how long it will take an IP datagram to travel across the Internet to
- reach the target (approximately half the RTT, as most times the
- routes are symmetrical). After the attacker has this information, she
- immediately proceeds to the next phase of the attack (if another TCP
- connection were to arrive on any port of the target before the
- attacker was able to continue the attack, the actual ISN would be
- off by 64,000 from what was predicted).
- When the spoofed segment makes its way to the target,
- several different things may happen depending on the accuracy of
- the attacker's prediction:
- - If the sequence number is EXACTly where the receiving TCP expects
- it to be, the incoming data will be placed on the next available
- position in the receive buffer.
- - If the sequence number is LESS than the expected value the data
- byte is considered a retransmission, and is discarded.
- - If the sequence number is GREATER than the expected value but
- still within the bounds of the receive window, the data byte is
- considered to be a future byte, and is held by TCP, pending the
- arrival of the other missing bytes.
- - If the sequence number is GREATER than the expected value and NOT
- within the bounds of the receive window, the segment is dropped,
- and TCP will send a segment back with the *expected* sequence
- number.
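-
- The receiver-side cases listed above can be written as one small
- decision function. This Python sketch is an illustration only: the
- name classify and its labels are invented, and "less than" is
- interpreted with the usual half-space rule for wrapped 32-bit
- counters, which is an assumption of the sketch.
-
```python
MOD = 2 ** 32

def classify(seq, rcv_nxt, window):
    """Classify an arriving sequence number against the expected one."""
    offset = (seq - rcv_nxt) % MOD
    if offset == 0:
        return "exact"    # placed next in the receive buffer
    if offset < window:
        return "future"   # held until the missing bytes arrive
    if offset >= MOD // 2:
        return "old"      # treated as a retransmission, discarded
    return "beyond"       # dropped; reply carries the expected number

for seq in (1000, 900, 2000, 1000 + 4096):
    print(seq, classify(seq, rcv_nxt=1000, window=4096))
```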
-
- --[ Subversion... ]--
-
- Here is where the main thrust of the attack begins:
-
- fig(3)
-
- 1 Z(b) ---SYN---> A
-
- 2 B <---SYN/ACK--- A
-
- 3 Z(b) ---ACK---> A
-
- 4 Z(b) ---PSH---> A
-
- [...]
-
- The attacking host spoofs her IP address to be that of the trusted
- host (which should still be in the death-throes of the D.O.S. attack)
- and sends its connection request to port 513 on the target (1). At
- (2), the target responds to the spoofed connection request with a
- SYN/ACK, which will make its way to the trusted host (which, if it
- *could* process the incoming TCP segment, would consider it an
- error, and immediately send a RST to the target). If everything goes
- according to plan, the SYN/ACK will be dropped by the gagged trusted
- host. After (1), the attacker must back off for a bit to give the
- target ample time to send the SYN/ACK (the attacker cannot see this
- segment). Then, at (3) the attacker sends an ACK to the target with
- the predicted sequence number (plus one, because we're ACKing it).
- If the attacker is correct in her prediction, the target will accept
- the ACK. The target is compromised and data transfer can
- commence (4).
- Generally, after compromise, the attacker will insert a
- backdoor into the system that will allow a simpler way of intrusion.
- (Often an `echo "+ +" >> ~/.rhosts` is done. This is a good idea for
- several reasons: it is quick, allows for simple re-entry, and is not
- interactive. Remember the attacker cannot see any traffic coming from
- the target, so any responses are sent off into oblivion.)
-
- --[ Why it Works ]--
-
- IP-Spoofing works because trusted services only rely on
- network address based authentication. Since IP is easily duped,
- address forgery is not difficult. The hardest part of the attack is
- in the sequence number prediction, because that is where the guesswork
- comes into play. Reduce unknowns and guesswork to a minimum, and
- the attack has a better chance of succeeding. Even a machine that
- wraps all its incoming TCP bound connections with Wietse Venema's TCP
- wrappers, is still vulnerable to the attack. TCP wrappers rely on a
- hostname or an IP address for authentication...
-
- [SECTION III. PREVENTIVE MEASURES]
-
- ...A stitch in time, saves nine...
-
- --[ Be Un-trusting and Un-trustworthy ]--
-
- One easy solution to prevent this attack is not to rely
- on address-based authentication. Disable all the r* commands,
- remove all .rhosts files and empty out the /etc/hosts.equiv file.
- This will force all users to use other means of remote access
- (telnet, ssh, skey, etc).
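-
- Hunting down stray .rhosts files can be scripted. The following
- Python sketch walks a directory tree and reports each .rhosts file,
- flagging those with a "+" wildcard host entry. The function name
- find_trust_files is hypothetical, and the choice of root directory
- is left to the caller.
-
```python
import os

def find_trust_files(root):
    """Yield (path, has_wildcard) for every .rhosts file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        if ".rhosts" not in filenames:
            continue
        path = os.path.join(dirpath, ".rhosts")
        try:
            with open(path, errors="replace") as handle:
                # A line whose first field is "+" trusts every host.
                wildcard = any(line.split()[:1] == ["+"] for line in handle)
        except OSError:
            continue  # unreadable file; skip it in this sketch
        yield path, wildcard
```
-
- Calling list(find_trust_files("/home")) as root would give a
- starting list of files to review; /etc/hosts.equiv still has to be
- checked by hand.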
-
- --[ Packet Filtering ]--
-
- If your site has a direct connection to the Internet, you
- can use your router to help you out. First make sure only hosts
- on your internal LAN can participate in trust-relationships (no
- internal host should trust a host outside the LAN). Then simply
- filter out *all* traffic from the outside (the Internet) that
- purports to come from the inside (the LAN).
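-
- The filtering rule itself is a one-line test. This Python sketch
- models it outside of any real router; the LAN prefix 192.0.2.0/24
- and the function allow_inbound are assumptions chosen for the
- example.
-
```python
import ipaddress

INTERNAL_NET = ipaddress.ip_network("192.0.2.0/24")  # hypothetical LAN prefix

def allow_inbound(src_ip, arrived_on_external):
    """Drop packets from outside that claim an inside source address."""
    inside_source = ipaddress.ip_address(src_ip) in INTERNAL_NET
    return not (arrived_on_external and inside_source)

print(allow_inbound("192.0.2.20", arrived_on_external=True))    # claims to be inside
print(allow_inbound("198.51.100.7", arrived_on_external=True))  # ordinary outside host
```
-
- This only protects trust relationships that stay entirely inside
- the LAN, which is why the first step above matters.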
-
- --[ Cryptographic Methods ]--
-
- An obvious method to deter IP-spoofing is to require
- all network traffic to be encrypted and/or authenticated. While
- several solutions exist, it will be a while before such measures are
- deployed as de facto standards.
-
- --[ Initial Sequence Number Randomizing ]--
-
- Since the sequence numbers are not chosen randomly (or
- incremented randomly) this attack works. Bellovin describes a
- fix for TCP that involves partitioning the sequence number space.
- Each connection would have its own separate sequence number space.
- The sequence numbers would still be incremented as before, however,
- there would be no obvious or implied relationship between the
- numbering in these spaces. Suggested is the following formula:
-
- ISN=M+F(localhost,localport,remotehost,remoteport)
-
- Where M is the 4 microsecond timer and F is a cryptographic hash.
- F must not be computable from the outside or the attacker could
- still guess sequence numbers. Bellovin suggests F be a hash of
- the connection-id and a secret vector (a random number, or a host
- related secret combined with the machine's boot time).
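-
- A minimal Python sketch of the formula follows. It is not the
- reference construction: HMAC-SHA256 stands in for F purely as an
- illustration, the secret is a fresh random value, and the function
- names and the "now" parameter are assumptions of this example.
-
```python
import hashlib
import hmac
import os
import time

SECRET = os.urandom(16)  # per-boot secret; must never leave the host

def f_hash(secret, local_ip, local_port, remote_ip, remote_port):
    """Keyed hash of the connection-id, truncated to 32 bits."""
    conn_id = f"{local_ip}:{local_port}:{remote_ip}:{remote_port}".encode()
    digest = hmac.new(secret, conn_id, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")

def initial_sequence_number(local_ip, local_port, remote_ip, remote_port,
                            now=None):
    """ISN = M + F(...), with M ticking once every 4 microseconds."""
    now = time.time() if now is None else now
    m = int(now * 1_000_000) // 4
    offset = f_hash(SECRET, local_ip, local_port, remote_ip, remote_port)
    return (m + offset) % 2 ** 32

print(initial_sequence_number("192.0.2.20", 513, "192.0.2.10", 1023))
```
-
- Within one connection-id the ISN still advances with the clock, but
- an outsider sampling a different connection-id learns nothing
- useful without the secret.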
-
- [SECTION IV. SOURCES]
-
- -Books: TCP/IP Illustrated vols. I, II & III
- -RFCs: 793, 1825, 1948
- -People: W. Richard Stevens, and the users of the
- Information Nexus for proofreading
- -Sourcecode: rbone, mendax, SYNflood
-
- This paper made possible by a grant from the Guild Corporation.
-