home *** CD-ROM | disk | FTP | other *** search
-
-
-
-
-
-
- Network Working Group M. Mathis
- Request for Comments: 2018 J. Mahdavi
- Category: Standards Track PSC
- S. Floyd
- LBNL
- A. Romanow
- Sun Microsystems
- October 1996
-
-
- TCP Selective Acknowledgment Options
-
- Status of this Memo
-
- This document specifies an Internet standards track protocol for the
- Internet community, and requests discussion and suggestions for
- improvements. Please refer to the current edition of the "Internet
- Official Protocol Standards" (STD 1) for the standardization state
- and status of this protocol. Distribution of this memo is unlimited.
-
- Abstract
-
- TCP may experience poor performance when multiple packets are lost
- from one window of data. With the limited information available
- from cumulative acknowledgments, a TCP sender can only learn about a
- single lost packet per round trip time. An aggressive sender could
- choose to retransmit packets early, but such retransmitted segments
- may have already been successfully received.
-
- A Selective Acknowledgment (SACK) mechanism, combined with a
- selective repeat retransmission policy, can help to overcome these
- limitations. The receiving TCP sends back SACK packets to the sender
- informing the sender of data that has been received. The sender can
- then retransmit only the missing data segments.
-
- This memo proposes an implementation of SACK and discusses its
- performance and related issues.
-
- Acknowledgements
-
- Much of the text in this document is taken directly from RFC1072 "TCP
- Extensions for Long-Delay Paths" by Bob Braden and Van Jacobson. The
- authors would like to thank Kevin Fall (LBNL), Christian Huitema
- (INRIA), Van Jacobson (LBNL), Greg Miller (MITRE), Greg Minshall
- (Ipsilon), Lixia Zhang (XEROX PARC and UCLA), Dave Borman (BSDI),
- Allison Mankin (ISI) and others for their review and constructive
- comments.
-
-
-
-
- Mathis, et. al. Standards Track [Page 1]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- 1. Introduction
-
- Multiple packet losses from a window of data can have a catastrophic
- effect on TCP throughput. TCP [Postel81] uses a cumulative
- acknowledgment scheme in which received segments that are not at the
- left edge of the receive window are not acknowledged. This forces
- the sender to either wait a roundtrip time to find out about each
- lost packet, or to unnecessarily retransmit segments which have been
- correctly received [Fall95]. With the cumulative acknowledgment
- scheme, multiple dropped segments generally cause TCP to lose its
- ACK-based clock, reducing overall throughput.
-
- Selective Acknowledgment (SACK) is a strategy which corrects this
- behavior in the face of multiple dropped segments. With selective
- acknowledgments, the data receiver can inform the sender about all
- segments that have arrived successfully, so the sender need
- retransmit only the segments that have actually been lost.
-
- Several transport protocols, including NETBLT [Clark87], XTP
- [Strayer92], RDP [Velten84], NADIR [Huitema81], and VMTP [Cheriton88]
- have used selective acknowledgment. There is some empirical evidence
- in favor of selective acknowledgments -- simple experiments with RDP
- have shown that disabling the selective acknowledgment facility
- greatly increases the number of retransmitted segments over a lossy,
- high-delay Internet path [Partridge87]. A recent simulation study by
- Kevin Fall and Sally Floyd [Fall95], demonstrates the strength of TCP
- with SACK over the non-SACK Tahoe and Reno TCP implementations.
-
- RFC1072 [VJ88] describes one possible implementation of SACK options
- for TCP. Unfortunately, it has never been deployed in the Internet,
- as there was disagreement about how SACK options should be used in
- conjunction with the TCP window shift option (initially described
- RFC1072 and revised in [Jacobson92]).
-
- We propose slight modifications to the SACK options as proposed in
- RFC1072. Specifically, sending a selective acknowledgment for the
- most recently received data reduces the need for long SACK options
- [Keshav94, Mathis95]. In addition, the SACK option now carries full
- 32 bit sequence numbers. These two modifications represent the only
- changes to the proposal in RFC1072. They make SACK easier to
- implement and address concerns about robustness.
-
- The selective acknowledgment extension uses two TCP options. The
- first is an enabling option, "SACK-permitted", which may be sent in a
- SYN segment to indicate that the SACK option can be used once the
- connection is established. The other is the SACK option itself,
- which may be sent over an established connection once permission has
- been given by SACK-permitted.
-
-
-
- Mathis, et. al. Standards Track [Page 2]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- The SACK option is to be included in a segment sent from a TCP that
- is receiving data to the TCP that is sending that data; we will refer
- to these TCP's as the data receiver and the data sender,
- respectively. We will consider a particular simplex data flow; any
- data flowing in the reverse direction over the same connection can be
- treated independently.
-
- 2. Sack-Permitted Option
-
- This two-byte option may be sent in a SYN by a TCP that has been
- extended to receive (and presumably process) the SACK option once the
- connection has opened. It MUST NOT be sent on non-SYN segments.
-
- TCP Sack-Permitted Option:
-
- Kind: 4
-
- +---------+---------+
- | Kind=4 | Length=2|
- +---------+---------+
-
- 3. Sack Option Format
-
- The SACK option is to be used to convey extended acknowledgment
- information from the receiver to the sender over an established TCP
- connection.
-
- TCP SACK Option:
-
- Kind: 5
-
- Length: Variable
-
- +--------+--------+
- | Kind=5 | Length |
- +--------+--------+--------+--------+
- | Left Edge of 1st Block |
- +--------+--------+--------+--------+
- | Right Edge of 1st Block |
- +--------+--------+--------+--------+
- | |
- / . . . /
- | |
- +--------+--------+--------+--------+
- | Left Edge of nth Block |
- +--------+--------+--------+--------+
- | Right Edge of nth Block |
- +--------+--------+--------+--------+
-
-
-
- Mathis, et. al. Standards Track [Page 3]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- The SACK option is to be sent by a data receiver to inform the data
- sender of non-contiguous blocks of data that have been received and
- queued. The data receiver awaits the receipt of data (perhaps by
- means of retransmissions) to fill the gaps in sequence space between
- received blocks. When missing segments are received, the data
- receiver acknowledges the data normally by advancing the left window
- edge in the Acknowledgement Number Field of the TCP header. The SACK
- option does not change the meaning of the Acknowledgement Number
- field.
-
- This option contains a list of some of the blocks of contiguous
- sequence space occupied by data that has been received and queued
- within the window.
-
- Each contiguous block of data queued at the data receiver is defined
- in the SACK option by two 32-bit unsigned integers in network byte
- order:
-
- * Left Edge of Block
-
- This is the first sequence number of this block.
-
- * Right Edge of Block
-
- This is the sequence number immediately following the last
- sequence number of this block.
-
- Each block represents received bytes of data that are contiguous and
- isolated; that is, the bytes just below the block, (Left Edge of
- Block - 1), and just above the block, (Right Edge of Block), have not
- been received.
-
- A SACK option that specifies n blocks will have a length of 8*n+2
- bytes, so the 40 bytes available for TCP options can specify a
- maximum of 4 blocks. It is expected that SACK will often be used in
- conjunction with the Timestamp option used for RTTM [Jacobson92],
- which takes an additional 10 bytes (plus two bytes of padding); thus
- a maximum of 3 SACK blocks will be allowed in this case.
-
- The SACK option is advisory, in that, while it notifies the data
- sender that the data receiver has received the indicated segments,
- the data receiver is permitted to later discard data which have been
- reported in a SACK option. A discussion appears below in Section 8
- of the consequences of advisory SACK, in particular that the data
- receiver may renege, or drop already SACKed data.
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 4]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- 4. Generating Sack Options: Data Receiver Behavior
-
- If the data receiver has received a SACK-Permitted option on the SYN
- for this connection, the data receiver MAY elect to generate SACK
- options as described below. If the data receiver generates SACK
- options under any circumstance, it SHOULD generate them under all
- permitted circumstances. If the data receiver has not received a
- SACK-Permitted option for a given connection, it MUST NOT send SACK
- options on that connection.
-
- If sent at all, SACK options SHOULD be included in all ACKs which do
- not ACK the highest sequence number in the data receiver's queue. In
- this situation the network has lost or mis-ordered data, such that
- the receiver holds non-contiguous data in its queue. RFC 1122,
- Section 4.2.2.21, discusses the reasons for the receiver to send ACKs
- in response to additional segments received in this state. The
- receiver SHOULD send an ACK for every valid segment that arrives
- containing new data, and each of these "duplicate" ACKs SHOULD bear a
- SACK option.
-
- If the data receiver chooses to send a SACK option, the following
- rules apply:
-
- * The first SACK block (i.e., the one immediately following the
- kind and length fields in the option) MUST specify the contiguous
- block of data containing the segment which triggered this ACK,
- unless that segment advanced the Acknowledgment Number field in
- the header. This assures that the ACK with the SACK option
- reflects the most recent change in the data receiver's buffer
- queue.
-
- * The data receiver SHOULD include as many distinct SACK blocks as
- possible in the SACK option. Note that the maximum available
- option space may not be sufficient to report all blocks present in
- the receiver's queue.
-
- * The SACK option SHOULD be filled out by repeating the most
- recently reported SACK blocks (based on first SACK blocks in
- previous SACK options) that are not subsets of a SACK block
- already included in the SACK option being constructed. This
- assures that in normal operation, any segment remaining part of a
- non-contiguous block of data held by the data receiver is reported
- in at least three successive SACK options, even for large-window
- TCP implementations [RFC1323]). After the first SACK block, the
- following SACK blocks in the SACK option may be listed in
- arbitrary order.
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 5]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- It is very important that the SACK option always reports the block
- containing the most recently received segment, because this provides
- the sender with the most up-to-date information about the state of
- the network and the data receiver's queue.
-
- 5. Interpreting the Sack Option and Retransmission Strategy: Data
- Sender Behavior
-
- When receiving an ACK containing a SACK option, the data sender
- SHOULD record the selective acknowledgment for future reference. The
- data sender is assumed to have a retransmission queue that contains
- the segments that have been transmitted but not yet acknowledged, in
- sequence-number order. If the data sender performs re-packetization
- before retransmission, the block boundaries in a SACK option that it
- receives may not fall on boundaries of segments in the retransmission
- queue; however, this does not pose a serious difficulty for the
- sender.
-
- One possible implementation of the sender's behavior is as follows.
- Let us suppose that for each segment in the retransmission queue
- there is a (new) flag bit "SACKed", to be used to indicate that this
- particular segment has been reported in a SACK option.
-
- When an acknowledgment segment arrives containing a SACK option, the
- data sender will turn on the SACKed bits for segments that have been
- selectively acknowledged. More specifically, for each block in the
- SACK option, the data sender will turn on the SACKed flags for all
- segments in the retransmission queue that are wholly contained within
- that block. This requires straightforward sequence number
- comparisons.
-
- After the SACKed bit is turned on (as the result of processing a
- received SACK option), the data sender will skip that segment during
- any later retransmission. Any segment that has the SACKed bit turned
- off and is less than the highest SACKed segment is available for
- retransmission.
-
- After a retransmit timeout the data sender SHOULD turn off all of the
- SACKed bits, since the timeout might indicate that the data receiver
- has reneged. The data sender MUST retransmit the segment at the left
- edge of the window after a retransmit timeout, whether or not the
- SACKed bit is on for that segment. A segment will not be dequeued
- and its buffer freed until the left window edge is advanced over it.
-
-
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 6]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- 5.1 Congestion Control Issues
-
- This document does not attempt to specify in detail the congestion
- control algorithms for implementations of TCP with SACK. However,
- the congestion control algorithms present in the de facto standard
- TCP implementations MUST be preserved [Stevens94]. In particular, to
- preserve robustness in the presence of packets reordered by the
- network, recovery is not triggered by a single ACK reporting out-of-
- order packets at the receiver. Further, during recovery the data
- sender limits the number of segments sent in response to each ACK.
- Existing implementations limit the data sender to sending one segment
- during Reno-style fast recovery, or to two segments during slow-start
- [Jacobson88]. Other aspects of congestion control, such as reducing
- the congestion window in response to congestion, must similarly be
- preserved.
-
- The use of time-outs as a fall-back mechanism for detecting dropped
- packets is unchanged by the SACK option. Because the data receiver
- is allowed to discard SACKed data, when a retransmit timeout occurs
- the data sender MUST ignore prior SACK information in determining
- which data to retransmit.
-
- Future research into congestion control algorithms may take advantage
- of the additional information provided by SACK. One such area for
- future research concerns modifications to TCP for a wireless or
- satellite environment where packet loss is not necessarily an
- indication of congestion.
-
- 6. Efficiency and Worst Case Behavior
-
- If the return path carrying ACKs and SACK options were lossless, one
- block per SACK option packet would always be sufficient. Every
- segment arriving while the data receiver holds discontinuous data
- would cause the data receiver to send an ACK with a SACK option
- containing the one altered block in the receiver's queue. The data
- sender is thus able to construct a precise replica of the receiver's
- queue by taking the union of all the first SACK blocks.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 7]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- Since the return path is not lossless, the SACK option is defined to
- include more than one SACK block in a single packet. The redundant
- blocks in the SACK option packet increase the robustness of SACK
- delivery in the presence of lost ACKs. For a receiver that is also
- using the time stamp option [Jacobson92], the SACK option has room to
- include three SACK blocks. Thus each SACK block will generally be
- repeated at least three times, if necessary, once in each of three
- successive ACK packets. However, if all of the ACK packets reporting
- a particular SACK block are dropped, then the sender might assume
- that the data in that SACK block has not been received, and
- unnecessarily retransmit those segments.
-
- The deployment of other TCP options may reduce the number of
- available SACK blocks to 2 or even to 1. This will reduce the
- redundancy of SACK delivery in the presence of lost ACKs. Even so,
- the exposure of TCP SACK in regard to the unnecessary retransmission
- of packets is strictly less than the exposure of current
- implementations of TCP. The worst-case conditions necessary for the
- sender to needlessly retransmit data is discussed in more detail in a
- separate document [Floyd96].
-
- Older TCP implementations which do not have the SACK option will not
- be unfairly disadvantaged when competing against SACK-capable TCPs.
- This issue is discussed in more detail in [Floyd96].
-
- 7. Sack Option Examples
-
- The following examples attempt to demonstrate the proper behavior of
- SACK generation by the data receiver.
-
- Assume the left window edge is 5000 and that the data transmitter
- sends a burst of 8 segments, each containing 500 data bytes.
-
- Case 1: The first 4 segments are received but the last 4 are
- dropped.
-
- The data receiver will return a normal TCP ACK segment
- acknowledging sequence number 7000, with no SACK option.
-
-
-
-
-
-
-
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 8]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- Case 2: The first segment is dropped but the remaining 7 are
- received.
-
- Upon receiving each of the last seven packets, the data
- receiver will return a TCP ACK segment that acknowledges
- sequence number 5000 and contains a SACK option specifying
- one block of queued data:
-
- Triggering ACK Left Edge Right Edge
- Segment
-
- 5000 (lost)
- 5500 5000 5500 6000
- 6000 5000 5500 6500
- 6500 5000 5500 7000
- 7000 5000 5500 7500
- 7500 5000 5500 8000
- 8000 5000 5500 8500
- 8500 5000 5500 9000
-
-
- Case 3: The 2nd, 4th, 6th, and 8th (last) segments are
- dropped.
-
- The data receiver ACKs the first packet normally. The
- third, fifth, and seventh packets trigger SACK options as
- follows:
-
- Triggering ACK First Block 2nd Block 3rd Block
- Segment Left Right Left Right Left Right
- Edge Edge Edge Edge Edge Edge
-
- 5000 5500
- 5500 (lost)
- 6000 5500 6000 6500
- 6500 (lost)
- 7000 5500 7000 7500 6000 6500
- 7500 (lost)
- 8000 5500 8000 8500 7000 7500 6000 6500
- 8500 (lost)
-
-
-
-
-
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 9]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- Suppose at this point, the 4th packet is received out of order.
- (This could either be because the data was badly misordered in the
- network, or because the 2nd packet was retransmitted and lost, and
- then the 4th packet was retransmitted). At this point the data
- receiver has only two SACK blocks to report. The data receiver
- replies with the following Selective Acknowledgment:
-
- Triggering ACK First Block 2nd Block 3rd Block
- Segment Left Right Left Right Left Right
- Edge Edge Edge Edge Edge Edge
-
- 6500 5500 6000 7500 8000 8500
-
- Suppose at this point, the 2nd segment is received. The data
- receiver then replies with the following Selective Acknowledgment:
-
- Triggering ACK First Block 2nd Block 3rd Block
- Segment Left Right Left Right Left Right
- Edge Edge Edge Edge Edge Edge
-
- 5500 7500 8000 8500
-
- 8. Data Receiver Reneging
-
- Note that the data receiver is permitted to discard data in its queue
- that has not been acknowledged to the data sender, even if the data
- has already been reported in a SACK option. Such discarding of
- SACKed packets is discouraged, but may be used if the receiver runs
- out of buffer space.
-
- The data receiver MAY elect not to keep data which it has reported in
- a SACK option. In this case, the receiver SACK generation is
- additionally qualified:
-
- * The first SACK block MUST reflect the newest segment. Even if
- the newest segment is going to be discarded and the receiver has
- already discarded adjacent segments, the first SACK block MUST
- report, at a minimum, the left and right edges of the newest
- segment.
-
- * Except for the newest segment, all SACK blocks MUST NOT report
- any old data which is no longer actually held by the receiver.
-
- Since the data receiver may later discard data reported in a SACK
- option, the sender MUST NOT discard data before it is acknowledged by
- the Acknowledgment Number field in the TCP header.
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 10]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- 9. Security Considerations
-
- This document neither strengthens nor weakens TCP's current security
- properties.
-
- 10. References
-
- [Cheriton88] Cheriton, D., "VMTP: Versatile Message Transaction
- Protocol", RFC 1045, Stanford University, February 1988.
-
- [Clark87] Clark, D., Lambert, M., and L. Zhang, "NETBLT: A Bulk Data
- Transfer Protocol", RFC 998, MIT, March 1987.
-
- [Fall95] Fall, K. and Floyd, S., "Comparisons of Tahoe, Reno, and
- Sack TCP", ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z, December 1995.
-
- [Floyd96] Floyd, S., "Issues of TCP with SACK",
- ftp://ftp.ee.lbl.gov/papers/issues_sa.ps.Z, January 1996.
-
- [Huitema81] Huitema, C., and Valet, I., An Experiment on High Speed
- File Transfer using Satellite Links, 7th Data Communication
- Symposium, Mexico, October 1981.
-
- [Jacobson88] Jacobson, V., "Congestion Avoidance and Control",
- Proceedings of SIGCOMM '88, Stanford, CA., August 1988.
-
- [Jacobson88}, Jacobson, V. and R. Braden, "TCP Extensions for Long-
- Delay Paths", RFC 1072, October 1988.
-
- [Jacobson92] Jacobson, V., Braden, R., and D. Borman, "TCP Extensions
- for High Performance", RFC 1323, May 1992.
-
- [Keshav94] Keshav, presentation to the Internet End-to-End Research
- Group, November 1994.
-
- [Mathis95] Mathis, M., and Mahdavi, J., TCP Forward Acknowledgment
- Option, presentation to the Internet End-to-End Research Group, June
- 1995.
-
- [Partridge87] Partridge, C., "Private Communication", February 1987.
-
- [Postel81] Postel, J., "Transmission Control Protocol - DARPA
- Internet Program Protocol Specification", RFC 793, DARPA, September
- 1981.
-
- [Stevens94] Stevens, W., TCP/IP Illustrated, Volume 1: The Protocols,
- Addison-Wesley, 1994.
-
-
-
-
- Mathis, et. al. Standards Track [Page 11]
-
- RFC 2018 TCP Selective Acknowledgement Options October 1996
-
-
- [Strayer92] Strayer, T., Dempsey, B., and Weaver, A., XTP -- the
- xpress transfer protocol. Addison-Wesley Publishing Company, 1992.
-
- [Velten84] Velten, D., Hinden, R., and J. Sax, "Reliable Data
- Protocol", RFC 908, BBN, July 1984.
-
- 11. Authors' Addresses
-
- Matt Mathis and Jamshid Mahdavi
- Pittsburgh Supercomputing Center
- 4400 Fifth Ave
- Pittsburgh, PA 15213
- mathis@psc.edu
- mahdavi@psc.edu
-
- Sally Floyd
- Lawrence Berkeley National Laboratory
- One Cyclotron Road
- Berkeley, CA 94720
- floyd@ee.lbl.gov
-
- Allyn Romanow
- Sun Microsystems, Inc.
- 2550 Garcia Ave., MPK17-202
- Mountain View, CA 94043
- allyn@eng.sun.com
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Mathis, et. al. Standards Track [Page 12]
-
-