home *** CD-ROM | disk | FTP | other *** search
-
-
-
-
-
-
- Network Working Group C. Perkins
- Request for Comments: 2198 I. Kouvelas
- Category: Standards Track O. Hodson
- V. Hardman
- University College London
- M. Handley
- ISI
- J.C. Bolot
- A. Vega-Garcia
- S. Fosse-Parisis
- INRIA Sophia Antipolis
- September 1997
-
-
- RTP Payload for Redundant Audio Data
-
- Status of this Memo
-
- This document specifies an Internet standards track protocol for the
- Internet community, and requests discussion and suggestions for
- improvements. Please refer to the current edition of the "Internet
- Official Protocol Standards" (STD 1) for the standardization state
- and status of this protocol. Distribution of this memo is unlimited.
-
- Abstract
-
- This document describes a payload format for use with the real-time
- transport protocol (RTP), version 2, for encoding redundant audio
- data. The primary motivation for the scheme described herein is the
- development of audio conferencing tools for use with lossy packet
- networks such as the Internet Mbone, although this scheme is not
- limited to such applications.
-
- 1 Introduction
-
- If multimedia conferencing is to become widely used by the Internet
- Mbone community, users must perceive the quality to be sufficiently
- good for most applications. We have identified a number of problems
- which impair the quality of conferences, the most significant of
- which is packet loss. This is a persistent problem, particularly
- given the increasing popularity, and therefore increasing load, of
- the Internet. The disruption of speech intelligibility even at low
- loss rates which is currently experienced may convince a whole
- generation of users that multimedia conferencing over the Internet is
- not viable. The addition of redundancy to the data stream is offered
- as a solution [1]. If a packet is lost then the missing information
- may be reconstructed at the receiver from the redundant data that
- arrives in the following packet(s), provided that the average number
-
-
-
- Perkins, et. al. Standards Track [Page 1]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- of consecutively lost packets is small. Recent work [4,5] shows that
- packet loss patterns in the Internet are such that this scheme
- typically functions well.
-
- This document describes an RTP payload format for the transmission of
- audio data encoded in such a redundant fashion. Section 2 presents
- the requirements and motivation leading to the definition of this
- payload format, and does not form part of the payload format
- definition. Sections 3 onwards define the RTP payload format for
- redundant audio data.
-
- 2 Requirements/Motivation
-
- The requirements for a redundant encoding scheme under RTP are as
- follows:
-
- o Packets have to carry a primary encoding and one or more
- redundant encodings.
-
- o As a multitude of encodings may be used for redundant
- information, each block of redundant encoding has to have an
- encoding type identifier.
-
- o As the use of variable size encodings is desirable, each encoded
- block in the packet has to have a length indicator.
-
- o The RTP header provides a timestamp field that corresponds to
- the time of creation of the encoded data. When redundant
- encodings are used this timestamp field can refer to the time of
- creation of the primary encoding data. Redundant blocks of data
- will correspond to different time intervals than the primary
- data, and hence each block of redundant encoding will require its
- own timestamp. To reduce the number of bytes needed to carry the
- timestamp, it can be encoded as the difference of the timestamp
- for the redundant encoding and the timestamp of the primary.
-
- There are two essential means by which redundant audio may be added
- to the standard RTP specification: a header extension may hold the
- redundancy, or one, or more, additional payload types may be defined.
-
- Including all the redundancy information for a packet in a header
- extension would make it easy for applications that do not implement
- redundancy to discard it and just process the primary encoding data.
- There are, however, a number of disadvantages with this scheme:
-
-
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 2]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- o There is a large overhead from the number of bytes needed for
- the extension header (4) and the possible padding that is needed
- at the end of the extension to round up to a four byte boundary
- (up to 3 bytes). For many applications this overhead is
- unacceptable.
-
- o Use of the header extension limits applications to a single
- redundant encoding, unless further structure is introduced into
- the extension. This would result in further overhead.
-
- For these reasons, the use of RTP header extension to hold redundant
- audio encodings is disregarded.
-
- The RTP profile for audio and video conferences [3] lists a set of
- payload types and provides for a dynamic range of 32 encodings that
- may be defined through a conference control protocol. This leads to
- two possible schemes for assigning additional RTP payload types for
- redundant audio applications:
-
- 1.A dynamic encoding scheme may be defined, for each combination
- of primary/redundant payload types, using the RTP dynamic payload
- type range.
-
- 2.A single fixed payload type may be defined to represent a packet
- with redundancy. This may then be assigned to either a static
- RTP payload type, or the payload type for this may be assigned
- dynamically.
-
- It is possible to define a set of payload types that signify a
- particular combination of primary and secondary encodings for each of
- the 32 dynamic payload types provided. This would be a slightly
- restrictive yet feasible solution for packets with a single block of
- redundancy as the number of possible combinations is not too large.
- However the need for multiple blocks of redundancy greatly increases
- the number of encoding combinations and makes this solution not
- viable.
-
- A modified version of the above solution could be to decide prior to
- the beginning of a conference on a set a 32 encoding combinations
- that will be used for the duration of the conference. All tools in
- the conference can be initialized with this working set of encoding
- combinations. Communication of the working set could be made through
- the use of an external, out of band, mechanism. Setup is complicated
- as great care needs to be taken in starting tools with identical
- parameters. This scheme is more efficient as only one byte is used
- to identify combinations of encodings.
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 3]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- It is felt that the complication inherent in distributing the mapping
- of payload types onto combinations of redundant data preclude the use
- of this mechanism.
-
- A more flexible solution is to have a single payload type which
- signifies a packet with redundancy. That packet then becomes a
- container, encapsulating multiple payloads into a single RTP packet.
- Such a scheme is flexible, since any amount of redundancy may be
- encapsulated within a single packet. There is, however, a small
- overhead since each encapsulated payload must be preceded by a header
- indicating the type of data enclosed. This is the preferred
- solution, since it is both flexible, extensible, and has a relatively
- low overhead. The remainder of this document describes this
- solution.
-
- 3 Payload Format Specification
-
- The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
- "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
- document are to be interpreted as described in RFC2119 [7].
-
- The assignment of an RTP payload type for this new packet format is
- outside the scope of this document, and will not be specified here.
- It is expected that the RTP profile for a particular class of
- applications will assign a payload type for this encoding, or if that
- is not done then a payload type in the dynamic range shall be chosen.
-
- An RTP packet containing redundant data shall have a standard RTP
- header, with payload type indicating redundancy. The other fields of
- the RTP header relate to the primary data block of the redundant
- data.
-
- Following the RTP header are a number of additional headers, defined
- in the figure below, which specify the contents of each of the
- encodings carried by the packet. Following these additional headers
- are a number of data blocks, which contain the standard RTP payload
- data for these encodings. It is noted that all the headers are
- aligned to a 32 bit boundary, but that the payload data will
- typically not be aligned. If multiple redundant encodings are
- carried in a packet, they should correspond to different time
- intervals: there is no reason to include multiple copies of data for
- a single time interval within a packet.
-
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |F| block PT | timestamp offset | block length |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
-
-
- Perkins, et. al. Standards Track [Page 4]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- The bits in the header are specified as follows:
-
-
- F: 1 bit First bit in header indicates whether another header block
- follows. If 1 further header blocks follow, if 0 this is the
- last header block.
-
- block PT: 7 bits RTP payload type for this block.
-
- timestamp offset: 14 bits Unsigned offset of timestamp of this block
- relative to timestamp given in RTP header. The use of an unsigned
- offset implies that redundant data must be sent after the primary
- data, and is hence a time to be subtracted from the current
- timestamp to determine the timestamp of the data for which this
- block is the redundancy.
-
- block length: 10 bits Length in bytes of the corresponding data
- block excluding header.
-
- It is noted that the use of an unsigned timestamp offset limits the
- use of redundant data slightly: it is not possible to send
- redundancy before the primary encoding. This may affect schemes
- where a low bandwidth coding suitable for redundancy is produced
- early in the encoding process, and hence could feasibly be
- transmitted early. However, the addition of a sign bit would
- unacceptably reduce the range of the timestamp offset, and increasing
- the size of the field above 14 bits limits the block length field.
- It seems that limiting redundancy to be transmitted after the primary
- will cause fewer problems than limiting the size of the other fields.
-
- The timestamp offset for a redundant block is measured in the same
- units as the timestamp of the primary encoding (ie: audio samples,
- with the same clock rate as the primary). The implication of this is
- that the redundant encoding MUST be sampled at the same rate as the
- primary.
-
- It is further noted that the block length and timestamp offset are 10
- bits, and 14 bits respectively; rather than the more obvious 8 and 16
- bits. Whilst such an encoding complicates parsing the header
- information slightly, and adds some additional processing overhead,
- there are a number of problems involved with the more obvious choice:
- An 8 bit block length field is sufficient for most, but not all,
- possible encodings: for example 80ms PCM and DVI audio packets
- comprise more than 256 bytes, and cannot be encoded with a single
- byte length field. It is possible to impose additional structure on
- the block length field (for example the high bit set could imply the
- lower 7 bits code a length in words, rather than bytes), however such
- schemes are complex. The use of a 10 bit block length field retains
-
-
-
- Perkins, et. al. Standards Track [Page 5]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- simplicity and provides an enlarged range, at the expense of a
- reduced range of timestamp values.
-
- The primary encoding block header is placed last in the packet. It
- is therefore possible to omit the timestamp and block-length fields
- from the header of this block, since they may be determined from the
- RTP header and overall packet length. The header for the primary
- (final) block comprises only a zero F bit, and the block payload type
- information, a total of 8 bits. This is illustrated in the figure
- below:
-
- 0 1 2 3 4 5 6 7
- +-+-+-+-+-+-+-+-+
- |0| Block PT |
- +-+-+-+-+-+-+-+-+
-
- The final header is followed, immediately, by the data blocks, stored
- in the same order as the headers. There is no padding or other
- delimiter between the data blocks, and they are typically not 32 bit
- aligned. Again, this choice was made to reduce bandwidth overheads,
- at the expense of additional decoding time.
-
- The choice of encodings used should reflect the bandwidth
- requirements of those encodings. It is expected that the redundant
- encoding shall use significantly less bandwidth that the primary
- encoding: the exception being the case where the primary is very
- low-bandwidth and has high processing requirement, in which case a
- copy of the primary MAY be used as the redundancy. The redundant
- encoding MUST NOT be higher bandwidth than the primary.
-
- The use of multiple levels of redundancy is rarely necessary.
- However, in those cases which require it, the bandwidth required by
- each level of redundancy is expected to be significantly less than
- that of the previous level.
-
- 4 Limitations
-
- The RTP marker bit is not preserved for redundant data blocks. Hence
- if the primary (containing this marker) is lost, the marker is lost.
- It is believed that this will not cause undue problems: even if the
- marker bit was transmitted with the redundant information, there
- would still be the possibility of its loss, so applications would
- still have to be written with this in mind.
-
- In addition, CSRC information is not preserved for redundant data.
- The CSRC data in the RTP header of a redundant audio packet relates
- to the primary only. Since CSRC data in an audio stream is expected
- to change relatively infrequently, it is recommended that
-
-
-
- Perkins, et. al. Standards Track [Page 6]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- applications which require this information assume that the CSRC data
- in the RTP header may be applied to the reconstructed redundant data.
-
- 5 Relation to SDP
-
- When a redundant payload is used, it may need to be bound to an RTP
- dynamic payload type. This may be achieved through any out-of-band
- mechanism, but one common way is to communicate this binding using
- the Session Description Protocol (SDP) [6]. SDP has a mechanism for
- binding a dynamic payload types to particular codec, sample rate, and
- number of channels using the "rtpmap" attribute. An example of its
- use (using the RTP audio/video profile [3]) is:
-
- m=audio 12345 RTP/AVP 121 0 5
- a=rtpmap:121 red/8000/1
-
- This specifies that an audio stream using RTP is using payload types
- 121 (a dynamic payload type), 0 (PCM u-law) and 5 (DVI). The "rtpmap"
- attribute is used to bind payload type 121 to codec "red" indicating
- this codec is actually a redundancy frame, 8KHz, and monaural. When
- used with SDP, the term "red" is used to indicate the redundancy
- format discussed in this document.
-
- In this case the additional formats of PCM and DVI are specified.
- The receiver must therefore be prepared to use these formats. Such a
- specification means the sender will send redundancy by default, but
- also may send PCM or DVI. However, with a redundant payload we
- additionally take this to mean that no codec other than PCM or DVI
- will be used in the redundant encodings. Note that the additional
- payload formats defined in the "m=" field may themselves be dynamic
- payload types, and if so a number of additional "a=" attributes may
- be required to describe these dynamic payload types.
-
- To receive a redundant stream, this is all that is required. However
- to send a redundant stream, the sender needs to know which codecs are
- recommended for the primary and secondary (and tertiary, etc)
- encodings. This information is specific to the redundancy format,
- and is specified using an additional attribute "fmtp" which conveys
- format-specific information. A session directory does not parse the
- values specified in an fmtp attribute but merely hands it to the
- media tool unchanged. For redundancy, we define the format
- parameters to be a slash "/" separated list of RTP payload types.
-
- Thus a complete example is:
-
- m=audio 12345 RTP/AVP 121 0 5
- a=rtpmap:121 red/8000/1
- a=fmtp:121 0/5
-
-
-
- Perkins, et. al. Standards Track [Page 7]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- This specifies that the default format for senders is redundancy with
- PCM as the primary encoding and DVI as the secondary encoding.
- Encodings cannot be specified in the fmtp attribute unless they are
- also specified as valid encodings on the media ("m=") line.
-
- 6 Security Considerations
-
- RTP packets containing redundant information are subject to the
- security considerations discussed in the RTP specification [2], and
- any appropriate RTP profile (for example [3]). This implies that
- confidentiality of the media streams is achieved by encryption.
- Encryption of a redundant data stream may occur in two ways:
-
- 1.The entire stream is to be secured, and all participants are
- expected to have keys to decode the entire stream. In this case,
- nothing special need be done, and encryption is performed in the
- usual manner.
-
- 2.A portion of the stream is to be encrypted with a different
- key to the remainder. In this case a redundant copy of the last
- packet of that portion cannot be sent, since there is no
- following packet which is encrypted with the correct key in which
- to send it. Similar limitations may occur when
- enabling/disabling encryption.
-
- The choice between these two is a matter for the encoder only.
- Decoders can decrypt either form without modification.
-
- Whilst the addition of low-bandwidth redundancy to an audio stream is
- an effective means by which that stream may be protected against
- packet loss, application designers should be aware that the addition
- of large amounts of redundancy will increase network congestion, and
- hence packet loss, leading to a worsening of the problem which the
- use of redundancy was intended to solve. At its worst, this can lead
- to excessive network congestion and may constitute a denial of
- service attack.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 8]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- 7 Example Packet
-
- An RTP audio data packet containing a DVI4 (8KHz) primary, and a
- single block of redundancy encoded using 8KHz LPC (both 20ms
- packets), as defined in the RTP audio/video profile [3] is
- illustrated:
-
- 0 1 2 3
- 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |V=2|P|X| CC=0 |M| PT | sequence number of primary |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | timestamp of primary encoding |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- | synchronization source (SSRC) identifier |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |1| block PT=7 | timestamp offset | block length |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- |0| block PT=5 | |
- +-+-+-+-+-+-+-+-+ +
- | |
- + LPC encoded redundant data (PT=7) +
- | (14 bytes) |
- + +---------------+
- | | |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ +
- | |
- + +
- | |
- + +
- | |
- + +
- | DVI4 encoded primary data (PT=5) |
- + (84 bytes, not to scale) +
- / /
- + +
- | |
- + +
- | |
- + +---------------+
- | |
- +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
-
-
-
-
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 9]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- 8 Authors' Addresses
-
- Colin Perkins/Isidor Kouvelas/Orion Hodson/Vicky Hardman
- Department of Computer Science
- University College London
- London WC1E 6BT
- United Kingdom
-
- EMail: {c.perkins|i.kouvelas|o.hodson|v.hardman}@cs.ucl.ac.uk
-
-
- Mark Handley
- USC Information Sciences Institute
- c/o MIT Laboratory for Computer Science
- 545 Technology Square
- Cambridge, MA 02139, USA
-
- EMail: mjh@isi.edu
-
-
- Jean-Chrysostome Bolot/Andres Vega-Garcia/Sacha Fosse-Parisis
- INRIA Sophia Antipolis
- 2004 Route des Lucioles, BP 93
- 06902 Sophia Antipolis
- France
-
- EMail: {bolot|avega|sfosse}@sophia.inria.fr
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 10]
-
- RFC 2198 RTP Payload for Redundant Audio Data September 1997
-
-
- 9 References
-
- [1] V.J. Hardman, M.A. Sasse, M. Handley and A. Watson; Reliable
- Audio for Use over the Internet; Proceedings INET'95, Honalulu, Oahu,
- Hawaii, September 1995. http://www.isoc.org/in95prc/
-
- [2] Schulzrinne, H., Casner, S., Frederick R., and V. Jacobson, "RTP:
- A Transport Protocol for Real-Time Applications", RFC 1889, January
- 1996.
-
- [3] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
- with Minimal Control", RFC 1890, January 1996.
-
- [4] M. Yajnik, J. Kurose and D. Towsley; Packet loss correlation in
- the MBone multicast network; IEEE Globecom Internet workshop, London,
- November 1996
-
- [5] J.-C. Bolot and A. Vega-Garcia; The case for FEC-based error
- control for packet audio in the Internet; ACM Multimedia Systems,
- 1997
-
- [6] Handley, M., and V. Jacobson, "SDP: Session Description Protocol
- (draft 03.2)", Work in Progress.
-
- [7] Bradner, S., "Key words for use in RFCs to indicate requirement
- levels", RFC 2119, March 1997.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Perkins, et. al. Standards Track [Page 11]
-
-