Internet Engineering Task Force K. K. Ramakrishnan
INTERNET DRAFT AT&T Labs Research
draft-kksjf-ecn-00.txt Sally Floyd
LBNL
November 1997
Expires: May 1998
A Proposal to add Explicit Congestion Notification (ECN) to IPv6 and to TCP
Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
To view the entire list of current Internet-Drafts, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).
Abstract
This note describes a proposed addition of ECN (Explicit Congestion
Notification) to IPv6 and to TCP. First we describe TCP's use of
packet drops as an indication of congestion. Next we argue that with
the addition of active queue management (e.g., RED) to the Internet
infrastructure, where routers detect congestion before the queue
overflows, routers are no longer limited to packet drops as an
indication of congestion, but could instead set an ECN bit in the
packet header, for ECN-capable transport protocols. We describe when
the ECN bit would be set in the routers, and describe what
modifications would be needed to TCP to make it ECN-capable.
Modifications to other transport protocols (e.g., unreliable unicast
or multicast, reliable multicast, other reliable unicast transport
protocols) could be considered as those protocols advance through the
standards process.
Ramakrishnan and Floyd Informational [Page 1]
draft-kksjf-ecn Addition of ECN to IPv6 and TCP November 1997
1. Introduction
TCP's congestion control and avoidance algorithms are based on the
notion that the network is a black-box [Jacobson88, Jacobson90]. The
network's state of congestion or otherwise is determined by end-
systems probing for the network state, by gradually increasing the
load on the network (by increasing the window of packets that are
outstanding in the network) until the network becomes congested and a
packet is lost. Treating the network as a "black-box" and treating
loss as an indication of congestion in the network is appropriate for
pure best-effort data carried by TCP that has little or no
sensitivity to delay or loss of individual packets. In addition,
TCP's congestion management algorithms have techniques built-in (such
as fast retransmit and fast recovery) to minimize the impact of
losses from a throughput perspective.
However, these mechanisms are not intended to help applications that
are in fact sensitive to the delay or loss of one or more individual
packets. Interactive traffic such as telnet, web-browsing, and
transfer of audio and video data ("real-audio" and "real-video") can
be sensitive to packet losses (for unreliable data delivery such as
UDP) or to the increased latency of the packet caused by the need to
retransmit the packet after a loss (for reliable data delivery such
as TCP).
Since TCP determines the appropriate congestion window to use by
gradually increasing the window size until it experiences a dropped
packet, this causes the queues at the bottleneck router to build up.
With most packet drop policies at the router that are not sensitive
to the load placed by each individual flow, this means that some of
the packets of latency-sensitive flows are going to be dropped.
Active queue management mechanisms that detect congestion before the
queue overflows, and provide an indication of this congestion to TCP,
are desirable because they avoid some bad properties of dropping on
queue overflow, especially with drop-tail schemes. Drop tail
introduces synchronization of loss across multiple flows, which is
undesirable. Indicating incipient congestion means that TCP does not
have to increase its window size up to the point where a router's
buffer is filled up. This can reduce queuing delays and avoid
synchronization, which are desirable characteristics.
2. Random Early Detection (RED)
Random Early Detection (RED) is a mechanism for active queue
management that has been proposed to detect incipient congestion
[FJ93], and is currently being deployed in the Internet backbone
[RED-ietf-draft]. Although RED is meant to be a general mechanism
using one of several alternatives for congestion indication, in the
current environment of the Internet RED is restricted to using packet
drops as a mechanism for congestion indication. By dropping packets
based on the average queue length exceeding a threshold, rather than
only when the queue overflows, RED maintains the average queue at a
smaller level, and improves the delay experienced by the flows.
However, when RED drops packets before the queue actually overflows,
RED is not forced by memory limitations to discard the packet. RED
could set an Explicit Congestion Notification bit in the packet
header instead of dropping the packet, if such a bit were provided in
the IP header and understood by the transport protocol. The use of
the Explicit Congestion Notification bit would allow the receiver(s)
to receive the packet, avoiding the potential for excessive delays
due to retransmissions after packet losses.
3. Explicit Congestion Notification
We propose that the Internet provide a congestion indication for
incipient congestion (as in RED and earlier work [RJ90]) where the
notification can sometimes be through marking packets rather than
dropping them. This would require an ECN field in the IP header with
two bits. The ECN-Capable bit would be set by the data sender to
indicate an ECN-capable transport protocol. The ECN bit would be set
by the router to indicate congestion to the end nodes. ([Floyd94]
outlines a scheme where a single bit could be overloaded to serve the
function of both the ECN-Capable bit and the ECN bit, but the two-bit
scheme is more straightforward to explain). We expect that routers
would provide the congestion indication on incipient congestion as
indicated by the average queue size, using the RED algorithms
suggested in [FJ93, RED-ietf-draft]. Routers that have a packet
arriving at a full queue would drop the packet, just as they do now.
The congestion control algorithms followed at the end-systems would
be essentially the same as the congestion control response to a
*single* dropped packet, for a transport protocol where a dropped
packet is used as an indication of congestion. For TCP in
particular, the source TCP would halve its congestion window "cwnd"
in response to an ECN indication received by the data receiver.
However, this action is done only once per window of data (i.e., at
most once per roundtrip time), to avoid reacting multiple times to
multiple indications of congestion within a roundtrip time.
4. Proposed Algorithm at the Router
We describe the proposed algorithm at the router in the context of
current router implementations. We assume that the router is capable
of implementing the probability computation for RED and uses a pure
packet drop mechanism (e.g., drop from front, drop from tail, or
random drop) whenever a packet arrives at a full queue.
When the router's buffer is not yet full and the router is prepared
to drop a packet to inform end nodes of incipient congestion, the
router should first check to see if the ECN-Capable bit is set in
that packet's IP header. If so, then instead of dropping the packet,
the router could instead set the ECN bit in the IP header. When more
severe congestion has occurred and the router's queue is full, then
the router has no choice but to drop some packet when a new packet
arrives.
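As an illustration only, the decision described above can be sketched
in Python. The names Packet, ecn_capable, ecn and handle_arrival are
hypothetical and not part of this proposal. The sketch also assumes
that a router with a full queue drops the arriving packet, although
other drop policies (e.g., drop from front) are equally possible.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    ecn_capable: bool = False  # set by the data sender
    ecn: bool = False          # set by a congested router


def handle_arrival(pkt, queue_full, wants_to_signal):
    """Return 'drop', 'mark', or 'forward' for an arriving packet.

    queue_full:      the buffer has no room left (severe congestion).
    wants_to_signal: the router has decided to notify the end nodes of
                     incipient congestion using this packet.
    """
    if queue_full:
        return "drop"        # assumption: the arriving packet is the one dropped
    if wants_to_signal:
        if pkt.ecn_capable:
            pkt.ecn = True   # mark instead of dropping
            return "mark"
        return "drop"        # transport is not ECN-capable: fall back to a drop
    return "forward"
```

A packet whose transport is not ECN-capable is never marked, so such
flows see exactly the behavior they see today.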
The router determines it is congested if the AVERAGE length of any of
its queues where packets are waiting to be processed or transmitted
exceeds a threshold. We believe that the router should use the ECN
bit to notify that it is congested only when the *average* queue
length, rather than the instantaneous queue length, exceeds a
threshold.
There are potentially several alternatives for estimating the average
queue length and marking the ECN bit. Since there is considerable
effort involved already in implementing RED, we believe it is best to
leverage these efforts for ECN as well. One potential mechanism for
the averaging and marking is to perform functions similar to RED
queue management: RED uses an exponential moving average of the queue
size. When the average queue size goes above a lower threshold,
packets are marked with a probability of marking that increases with
the average queue size. (Packets that are not ECN-capable are
dropped instead of marked.) When the average queue size gets up to or
above a high threshold, all incoming packets should be dropped
(assuming that the router intends to control the average queue size
even in the presence of unresponsive traffic).
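The averaging and marking just described might look roughly like the
following Python sketch. It is a simplification under stated
assumptions: the marking probability is assumed to rise linearly
between the two thresholds, the parameter values (weight, min_th,
max_th, max_p) are placeholders rather than recommendations, and the
function names are hypothetical. The RED algorithm in [FJ93] contains
further refinements that are omitted here.

```python
import random


def update_average(avg, queue_len, weight=0.002):
    """Exponential moving average of the queue size."""
    return (1.0 - weight) * avg + weight * queue_len


def signal_probability(avg, min_th, max_th, max_p):
    """Probability of signalling congestion for one arriving packet.

    Assumption: a linear ramp from 0 at min_th up to max_p just below
    max_th, and 1 at or above max_th.
    """
    if avg < min_th:
        return 0.0
    if avg >= max_th:
        return 1.0
    return max_p * (avg - min_th) / (max_th - min_th)


def red_action(avg, ecn_capable, min_th=5, max_th=15, max_p=0.1,
               rng=random.random):
    """Return 'forward', 'mark', or 'drop' for one arriving packet."""
    if avg >= max_th:
        return "drop"    # at or above the high threshold: drop everything
    if rng() < signal_probability(avg, min_th, max_th, max_p):
        # Packets that are not ECN-capable are dropped instead of marked.
        return "mark" if ecn_capable else "drop"
    return "forward"
```

The rng argument exists only so that the random choice can be replaced
by a fixed value when experimenting with the sketch.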
It is anticipated that when all of the source end-systems participate
in TCP's congestion management mechanisms or other compatible
congestion control, and respond to ECN by reducing their offered
load, packet losses would be relatively infrequent. Packet losses in
this case would occur primarily during transients and in the presence
of non-cooperating entities.
When a packet is received by a router with the ECN bit set indicating
that congestion was encountered upstream, then the bit is left
unchanged, and the packet transmitted as usual.
5. Support from the Transport Protocol
ECN requires support from the transport protocol, in addition to the
ECN field in the IPv6 packet header. For TCP, ECN requires two new
mechanisms: negotiation between the endpoints during setup to
determine if they are both ECN-capable, and an ECN-Notify bit in the
TCP header so that the data receiver can inform the data sender when
a packet has been received with the ECN bit set. The support
required from other transport protocols is likely to be different,
particularly for unreliable or reliable multicast transport protocols,
and will have to be determined as other transport protocols are
brought to the IETF for standardization. The following sections
describe in detail the proposed TCP use of ECN. This is also
described in [Floyd94]. We assume that the source TCP uses the
current set of congestion control algorithms of Slow-start, Fast
Retransmit and Fast Recovery [RFC2001].
5.1. TCP Initialization
Initially, the source and destination TCPs exchange the desire and/or
capability to use ECN in the TCP connection setup phase. As a result
of the negotiation, the TCP sender indicates using the ECN-Capable
bit in the IPv6 header that the transport is capable and willing to
participate in ECN. This will indicate to the routers that they may
mark packets with the ECN bit, if they would like to use that as a
method of congestion notification. If the TCP connection does not
wish to use ECN notification, the sending TCP sets the ECN-Capable
bit equal to 0 (i.e., not set), and the TCP receiver ignores the ECN
bit in received packets.
5.2. The TCP Sender
For a connection that expects to use ECN, packets are transmitted
with the ECN-Capable bit set in the IP header (set to a "1"). If the
sender receives a TCP acknowledgement with the ECN-Notify bit set in
the TCP header, then the sender knows that congestion was encountered
in the network on the path from the sender to the receiver. The
indication of congestion should be treated just as a congestion loss
in non-ECN-Capable TCP. That is, the TCP source halves the congestion
window "cwnd" and reduces the slow start threshold "ssthresh". The
sending TCP does NOT increase the congestion window in response to
the receipt of an ACK packet with the ECN-Notify bit set. However, a
very important difference is that TCP does not react to ECN
congestion indications more than once every window of data (or more
loosely, more than once every round-trip time). If a response to the
ECN-Notify bit was made over the last round-trip time, based on the
window of packets, then the sending TCP doesn't respond to any
further ECN messages. If at time "t", the source TCP reacted to an
ECN, then it notes the packets that are outstanding at that time and
have not yet been acknowledged. Until all these packets are
acknowledged, say at time "u", the source TCP does not react to
another ECN indication of congestion.
In addition, when a TCP sender receives duplicate acks during the
time interval between "t" and "u", it does not reduce the congestion
window. The result is that decreases in the congestion window occur
at most once per roundtrip time.
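The once-per-window rule can be illustrated with a small Python
sketch. Everything here is a hypothetical model rather than a TCP
implementation: windows and sequence numbers are counted in packets,
the class and field names are invented, and the exact value chosen
for ssthresh is an assumption. An ACK number is taken to be
cumulative (the next packet expected by the receiver), so the packets
outstanding at time "t" are all acknowledged once the ACK number
reaches the value of snd_nxt recorded at "t".

```python
class EcnSender:
    def __init__(self, cwnd=10, ssthresh=64):
        self.cwnd = cwnd          # congestion window, in packets
        self.ssthresh = ssthresh  # slow start threshold, in packets
        self.snd_nxt = 0          # next sequence number to be sent
        self.recover = -1         # snd_nxt at the time of the last reduction

    def on_ack(self, ack_no, ecn_notify):
        """Process a cumulative ACK; return True if cwnd was reduced."""
        if not ecn_notify:
            return False  # ordinary window growth is omitted from this sketch
        if ack_no <= self.recover:
            # The ACK still covers only packets that were outstanding at
            # the last reduction, so the indication is not acted on again.
            return False
        half = max(1, self.cwnd // 2)
        self.cwnd = half                # halve the congestion window
        self.ssthresh = max(2, half)    # assumption: floor of two packets
        self.recover = self.snd_nxt     # remember what is outstanding now
        return True
```

For example, with cwnd of 10 and snd_nxt of 20, a first ECN-Notify
ACK reduces cwnd to 5, and further ECN-Notify ACKs numbered 20 or
lower are ignored. The window is not increased on any ACK that
carries ECN-Notify.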
When the TCP sender receives a packet with the ECN-Notify bit set,
and therefore reduces its congestion window, the sender does not need
to slow-start (as is done in Tahoe TCP in response to a packet drop)
or to stop sending packets for a period of time to allow the queue to
dissipate (as is done by Reno TCP for roughly half a round-trip time
in response to a packet drop). The ECN-Notify bit being set does not
indicate the urgent transient congestion state of a buffer overflow.
Incoming acknowledgements will still arrive to "clock out" outgoing
packets when allowed by the congestion window.
TCP follows existing algorithms for sending data packets in response
to incoming ACKs, multiple duplicate acknowledgements, or retransmit
timeouts [RFC2001].
5.3. The TCP Receiver
At the destination end-system, when TCP receives a packet with the
ECN bit set in the IP header, TCP sets the ECN-Notify bit in the TCP
header in the returning ACK packet. We do not provide here any
notion of destination congestion, because this is already being
indicated in the receiver's advertised window.
The destination TCP continues to perform the duplicate ACK procedure
already specified - to generate a duplicate ACK when an out-of-
sequence packet is received.
If there is any ACK withholding implemented, as in current TCP
implementations where the TCP receiver often sends an ACK for two
arriving data packets, then the TCP destination will send the OR of
all the ECN bits of packets that the ACK is acknowledging. That is,
if any packet is received with the ECN bit set, then the ACK carries
the ECN-Notify bit set.
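In code, the rule for an ACK that covers several data packets reduces
to a logical OR. The following Python sketch uses a hypothetical
function name and represents each received ECN bit as a boolean.

```python
def ecn_notify_for_ack(ecn_bits):
    """Value of ECN-Notify for one ACK.

    ecn_bits holds the ECN bit (True or False) of every data packet
    that this ACK acknowledges.
    """
    return any(ecn_bits)
```

With an ACK sent for every two data packets, a single marked packet
in the pair is enough for the ACK to carry ECN-Notify.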
5.4. Congestion on the ACK-path
For the current generation of TCP congestion control algorithms, pure
acknowledgement packets (i.e., packets that do not contain any
accompanying data) should be sent with the ECN-Capable bit off.
Current TCP receivers have no mechanisms for reducing traffic on the
ACK-path in response to congestion notification. Mechanisms for
responding to congestion on the ACK-path are left as an area
for future research. (One simple possibility would be for the sender
to reduce its congestion window when it receives a pure ACK packet
with the ECN bit set). For current TCP implementations, a single
dropped ACK generally has only a very small effect on the TCP's
sending rate.
6. Summary of changes required in IPv6 and TCP
Two bits need to be specified in the IPv6 header: the ECN-Capable bit
and the ECN bit. The ECN-Capable bit set to "0" indicates that the
transport protocol will ignore the ECN bit. This is the default
value. The ECN-Capable bit set to "1" indicates that the transport
protocol is willing and able to participate in ECN.
The default value for the ECN bit is "0". The router sets the ECN
bit to "1" to indicate congestion to the end nodes. The ECN bit in a
packet header should never be reset by a router from "1" to "0".
TCP requires two changes: a negotiation phase during setup to
determine if both end nodes are ECN-capable, and a bit in the TCP
header (possibly one of the "reserved" bits in the TCP flags field)
as an ECN-Notify bit so that the receiver can inform the sender of a
packet received with the ECN bit set.
7. Non-relationship to ATM's EFCI indicator or Frame Relay's FECN
Since these ATM and Frame Relay mechanisms typically have been
defined without any notion of average queue size as the basis for
concluding that there is congestion, we believe that they provide a
very noisy signal. The interpretation we have here for ECN is NOT the
appropriate reaction for such a noisy signal of congestion
notification. It is our belief that such mechanisms would be phased
out over time within the ATM network. However, if the routers that
interface to the ATM network have a way of maintaining the average
queue at the interface, and use it to come to a conclusion that the
ATM subnet is congested or otherwise, they may use the ECN
notification that is defined here.
8. Non-compliance by the End Nodes
We believe that, for the most part, the fairness properties of TCP
will not be changed with the introduction of ECN.
A key issue concerns the vulnerability of ECN to non-compliant
end-nodes (i.e., end nodes that set the ECN-Capable bit in packets, but
do not respond to the ECN bit itself). These concerns exist even in
non-ECN environments. An end-node could "turn off congestion
control" by not reducing its congestion window in response to packet
drops. We recognize that this is a concern for the current Internet.
It has been argued that routers will have to deploy mechanisms to
detect and differentially treat packets from non-compliant flows. It
is likely that techniques such as end-to-end per-flow scheduling and
isolation of one flow from another, potentially accompanied by end-
to-end reservations, could mitigate such effects. Such isolation
mechanisms could remove some of the more egregious effects of non-
compliance.
However, even in networks just restricted to packet losses as an
indication of congestion, several methods have been proposed to
identify and treat non-compliant or unresponsive flows. These
mechanisms would be equally applicable for identifying flows that do
not respond to ECN. If anything, routers would have a slightly
easier time identifying flows that do not respond to ECN. For
example, routers can observe packets arriving at the router with the
ECN bit set, as well as keeping note of packets that have the ECN bit
set at that router itself.
It has been argued that dropping packets in itself may be considered
a deterrent for non-compliance. However, we believe that the packet
drop rates are likely to be reasonably low in environments where ECN
is deployed. The reduction in load due to packet drops to deal with
non-compliant nodes is likely to be small. The control of congestion
is more likely to come from end-nodes reacting to congestion - either
from responding to dropped packets or ECN-Notify indications and
halving the window. ECN should be used at a router when the average
queue size is below some high threshold; when the average queue size
exceeds the high threshold, and therefore packet drop/marking rates
are higher, our recommendation is that routers drop packets, rather
than setting the ECN bit in packet headers. Thus, in scenarios with
low packet drop rates, the fact that the congestion control
indications are in the form of packet drops rather than ECN bits does
not significantly change the negative consequences on the compliant
flows because of some flow "turning off" congestion control.
We also do not believe that packet dropping itself is an effective
deterrent for non-compliance. Many flows that retransmit dropped
packets could have an incentive to maintain or even increase their
sending rate in response to packet drops, rather than decreasing
their sending rate, in the absence of mechanisms at the router to
provide a deterrent for such behavior. For example, flows
that use unreliable transport protocols could simply increase their
use of FEC in response to an increased packet drop rate, and might
choose increased FEC and no congestion control. We believe that the
effect of packet dropping as a deterrent for non-compliance with
congestion control mechanisms is quite small. The possibility of
non-compliant flows does not offer a compelling reason not to deploy
ECN.
9. Additional Considerations
Some care is required to handle the ECN and ECN-Capable bits
appropriately when packets are encapsulated and un-encapsulated for
tunnels. When the router at the end of the tunnel decapsulates the
packet, then the ECN bit in the encapsulating ('outside') header
should be ORed with the ECN bit in the encapsulated ('inside') header
that remains. Basically, a 1 in the encapsulating header should be
copied into the encapsulated header.
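The decapsulation rule can be written as a one-line Python sketch.
The function name is hypothetical, and the ECN bits of the two
headers are modelled as booleans.

```python
def decapsulated_ecn(outer_ecn, inner_ecn):
    """ECN bit of the inner header after the tunnel header is removed."""
    # A mark set inside the tunnel (outer header) must survive, and a
    # mark set before the tunnel (inner header) must not be cleared.
    return outer_ecn or inner_ecn
```

The result is 0 only when neither header was marked, which is
consistent with the rule that the ECN bit is never reset from "1" to
"0" along the path.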
An additional issue concerns packets that have the ECN bit set at one
router, and are later dropped at another router. For the proposed
use for ECN in this paper (that is, for data packets for TCP), this
is not a concern, because end nodes detect dropped data packets, and
the congestion response of the end nodes to a dropped data packet is
at least as strong as the congestion response to a packet received
with the ECN bit set. This issue will have to be addressed if ECN
and ECN-Capable bits are used on pure ACK packets, because in current
implementations of TCP the drop of an ACK packet is not explicitly
detected by the end nodes.
If a packet with the ECN bit set is later dropped due to corruption
(bit errors), the end node should still invoke congestion control,
just as TCP would today in response to a dropped data packet. This
issue would also
have to be addressed in future proposals for distinguishing between
packets dropped due to corruption and packets dropped due to
congestion.
10. Conclusions
Given the current effort to implement RED, we believe this is the
right time for router vendors to examine how to also implement
congestion avoidance mechanisms that do not depend on packet drops
alone. With the growth of applications and transports that are
sensitive to delay and loss of a single packet, depending on packet
loss as a normal congestion notification mechanism appears to be
insufficient (or at the very least, non-optimal).
REFERENCES
[FJ93] Floyd, S., and Jacobson, V., "Random Early Detection gateways
for Congestion Avoidance", IEEE/ACM Transactions on Networking, V.1
N.4, August 1993, p. 397-413. URL
"ftp://ftp.ee.lbl.gov/papers/early.pdf".
[Floyd94] Floyd, S., "TCP and Explicit Congestion Notification", ACM
Computer Communication Review, V. 24 N. 5, October 1994, p. 10-23.
URL "ftp://ftp.ee.lbl.gov/papers/tcp_ecn.4.ps.Z".
[Floyd97] Floyd, S., and Fall, K., "Router Mechanisms to Support
End-to-End Congestion Control", Technical report, February 1997. URL
"ftp://ftp.ee.lbl.gov/papers/collapse.ps".
[FRED] Lin, D., and Morris, R., "Dynamics of Random Early Detection",
SIGCOMM '97, September 1997. URL
"http://www.inria.fr/rodeo/sigcomm97/program.html#ab078".
[Jacobson88] V. Jacobson, "Congestion Avoidance and Control", Proc.
ACM SIGCOMM '88, pp. 314-329. URL
"ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z".
[Jacobson90] V. Jacobson, "Modified TCP Congestion Avoidance
Algorithm", Message to end2end-interest mailing list, April 1990.
URL "ftp://ftp.ee.lbl.gov/email/vanj.90apr30.txt".
[RED-ietf-draft] B. Braden, D. Clark, J. Crowcroft, B. Davie, S.
Deering, D. Estrin, S. Floyd, V. Jacobson, G. Minshall, C. Partridge,
L. Peterson, K. Ramakrishnan, S. Shenker, J. Wroclawski, L. Zhang,
"Recommendations on Queue Management and Congestion Avoidance in the
Internet", Internet draft draft-irtf-e2e-queue-mgt-00.txt, March 25,
1997.
[RFC2001] W. Stevens, "TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms", RFC 2001, January 1997.
[RJ90] K. K. Ramakrishnan and Raj Jain, "A Binary Feedback Scheme for
Congestion Avoidance in Computer Networks", ACM Transactions on
Computer Systems, Vol.8, No.2, pp. 158-181, May 1990.
SECURITY CONSIDERATIONS
Security issues are not discussed in this document.
AUTHORS' ADDRESSES
K. K. Ramakrishnan
AT&T Labs. Research
Phone: +1 (973) 360-8766
Email: kkrama@research.att.com
URL: http://www.research.att.com/info/kkrama
Sally Floyd
Lawrence Berkeley National Laboratory
Phone: +1 (510) 486-7518
Email: floyd@ee.lbl.gov
URL: http://www-nrg.ee.lbl.gov/floyd/
This draft was created in November 1997.
It expires May 1998.