WebRTC Forward Error Correction Requirements
Google
747 6th St S
Kirkland, WA 98033
United States of America
justin@uberti.name
RAI
RTP, FEC
This document provides information and requirements for the use of Forward
Error Correction (FEC) by WebRTC implementations.
Status of This Memo
This is an Internet Standards Track document.
This document is a product of the Internet Engineering Task Force
(IETF). It represents the consensus of the IETF community. It has
received public review and has been approved for publication by
the Internet Engineering Steering Group (IESG). Further
information on Internet Standards is available in Section 2 of
RFC 7841.
Information about the current status of this document, any
errata, and how to provide feedback on it may be obtained at
.
Copyright Notice
Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
() in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided without
warranty as described in the Simplified BSD License.
Table of Contents
Introduction
Terminology
Types of FEC
  Separate FEC Stream
  Redundant Encoding
  Codec-Specific In-Band FEC
FEC for Audio Content
  Recommended Mechanism
  Negotiating Support
FEC for Video Content
  Recommended Mechanism
  Negotiating Support
FEC for Application Content
Implementation Requirements
Adaptive Use of FEC
Security Considerations
IANA Considerations
References
  Normative References
  Informative References
Acknowledgements
Author's Address
Introduction
In situations where packet loss is high, or perfect media quality is
essential, Forward Error Correction (FEC) can be used to proactively
recover from packet losses. This specification provides guidance on which
FEC mechanisms to use, and how to use them, for WebRTC
implementations.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED",
"MAY", and "OPTIONAL" in this document are to be interpreted as
described in BCP 14
when, and only when, they appear in all capitals, as shown here.
Types of FEC
FEC describes the sending of redundant information in an outgoing
packet stream so that information can still be recovered even in the event
of packet loss. There are multiple ways this can be accomplished
for RTP media streams ; this section enumerates
the various mechanisms available and describes their trade-offs.
Separate FEC Stream
This approach, as described in ,
sends FEC packets as an independent RTP stream with its own
synchronization source (SSRC) and payload
type, multiplexed with the primary encoding. While this approach can
protect multiple packets of the
primary encoding with a single FEC packet, each FEC packet will have its
own IP/UDP/RTP/FEC header, and this overhead can be excessive in some
cases, e.g., when protecting each primary packet with a FEC packet.
This approach allows for recovery of entire RTP packets, including
the full RTP header.
Redundant Encoding
This approach, as described in
, allows for redundant data to be piggybacked
on an existing primary encoding, all in a single packet. This redundant
data may be an exact copy of a previous payload or, for codecs that
support variable-bitrate encodings, a smaller, lower-quality
representation. In certain cases, the redundant data could include
encodings of multiple prior audio frames.
Since there is only a single set of packet headers, this approach
allows for a very efficient representation of primary and redundant data.
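To make the header-overhead argument concrete, the following Python sketch compares the extra bytes sent per primary packet by a separate FEC stream with those sent by redundant encoding. The IPv4, UDP, and RTP values are the minimum fixed header sizes, and the 4-byte and 1-byte block headers are those of the redundant audio payload format. The FEC header size, the 40-byte redundant payload, and the rate of 50 packets per second are assumptions chosen only for illustration, and the function names are hypothetical.

```python
# Header sizes in bytes. FEC_HEADER is an assumed, illustrative value;
# the real size depends on the FEC payload format in use.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
FEC_HEADER = 12          # assumption for illustration only
RED_BLOCK_HEADER = 4     # header preceding each redundant block
RED_PRIMARY_HEADER = 1   # header preceding the final (primary) block


def separate_stream_overhead(redundant_payload: int) -> int:
    """Extra bytes when one FEC packet protects one primary packet."""
    return (IPV4_HEADER + UDP_HEADER + RTP_HEADER + FEC_HEADER
            + redundant_payload)


def redundant_encoding_overhead(redundant_payload: int, blocks: int = 1) -> int:
    """Extra bytes when redundant blocks ride inside the primary packet."""
    return RED_PRIMARY_HEADER + blocks * (RED_BLOCK_HEADER + redundant_payload)


def overhead_kbps(extra_bytes: int, packets_per_second: int) -> float:
    """Convert per-packet extra bytes into kilobits per second."""
    return extra_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    payload = 40   # assumed size of the redundant audio payload, in bytes
    rate = 50      # assumed packet rate (one 20 ms frame per packet)
    print(overhead_kbps(separate_stream_overhead(payload), rate))
    print(overhead_kbps(redundant_encoding_overhead(payload), rate))
```

Under these assumed numbers, the separate stream adds 92 bytes per primary packet and redundant encoding adds 45, so the gap is almost entirely the duplicated packet headers.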
However, these savings are only realized when the data all fits into a
single packet (i.e., the size is less than an MTU). As a result, this
approach is generally not useful for video content.
As described in
, this approach cannot recover
certain parts of the RTP header, including the marker bit, contributing source (CSRC)
information, and header extensions.
Codec-Specific In-Band FEC
Some audio codecs, notably Opus
and Adaptive Multi-Rate (AMR)
, support their own in-band FEC mechanism,
where redundant data is included in the codec payload. This is similar
to the redundant encoding mechanism described above, but as it adds no
additional framing, it can be slightly more efficient.
For Opus, audio frames deemed important are re-encoded at a lower
bitrate and appended to the next payload, allowing partial recovery
of a lost packet. This scheme is fairly efficient; experiments
performed indicate that when Opus FEC is used, the overhead imposed is
only about 20-30%, depending on the amount of protection needed. Note
that this mechanism can only carry redundancy information for the
immediately preceding audio frame; thus the decoder cannot fully recover
multiple consecutive lost packets, which can be a problem on wireless
networks. See
,
and this Opus mailing list post
for more details.
For AMR and AMR-Wideband (AMR-WB), packets can contain copies or lower-quality
encodings of multiple prior audio frames. See
,
for details on this mechanism.
In-band FEC mechanisms cannot recover any of the RTP header.
FEC for Audio Content
The following section provides guidance on how to best use FEC for
transmitting audio data. As indicated in "Adaptive Use of FEC"
below, FEC should only be activated if
network conditions warrant it, or upon explicit application request.
Recommended Mechanism
When using variable-bitrate codecs without an internal FEC,
redundant encoding (as described in "Redundant Encoding" above) with
lower-fidelity version(s) of the previous packet(s) is RECOMMENDED. This provides
reasonable protection of the payload with only moderate bitrate
increase, as the redundant encodings can be significantly smaller than
the primary encoding.
When using the Opus codec, use of the built-in Opus FEC mechanism is
RECOMMENDED. This provides reasonable protection of the audio stream
against individual losses, with minimal overhead. Note that, as
indicated above, the built-in Opus FEC only provides single-frame
redundancy; if multi-packet protection is needed, the aforementioned
redundant encoding with reduced-bitrate Opus encodings
SHOULD be used instead.
When using the AMR/AMR-WB codecs, use of their built-in FEC
mechanism is RECOMMENDED. This provides slightly more efficient
protection of the audio stream than redundant encoding does.
When using constant-bitrate codecs, e.g.,
PCMU , redundant encoding MAY be used, but
this will result in a potentially significant bitrate increase, and
suddenly increasing bitrate to deal with losses from congestion
may actually make things worse.
Because of the lower packet rate of audio encodings, usually a
single packet per frame, use of a separate FEC stream comes with a
higher overhead than other mechanisms, and therefore is NOT RECOMMENDED.
As mentioned above, the recommended mechanisms do not allow recovery
of parts of the RTP header that may be important in certain audio
applications, e.g., CSRCs and RTP header extensions like those
specified in
and
. Implementations SHOULD account for this and
attempt to approximate this information, using an approach similar to
those described in
, and
.
Negotiating Support
Support for redundant encoding of a given RTP stream SHOULD be
indicated by including audio/red
as an additional supported media type for the
associated "m=" section in the SDP offer
. Answerers can reject the use of redundant
encoding by not including the audio/red media type in the corresponding
"m=" section in the SDP answer.
Support for codec-specific FEC mechanisms is typically indicated
via "a=fmtp" parameters.
For Opus, a receiver MUST indicate that it is prepared to use
incoming FEC data with the "useinbandfec=1" parameter, as specified in
. This parameter is declarative and can be
negotiated separately for either media direction.
For AMR/AMR-WB, support for redundant encoding, and the maximum
supported depth, are controlled by the "max-red" parameter, as
specified in
.
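As a rough illustration of the signaling described in this section, the following Python sketch scans one SDP media section for the three indicators mentioned above: a "red" payload type, the Opus "useinbandfec=1" parameter, and the AMR/AMR-WB "max-red" parameter. The function name fec_support, the result keys, and the payload type numbers in the sample are hypothetical, and the parsing is deliberately simplified; it is not a complete SDP parser.

```python
import re

# Hypothetical sample media section; payload type numbers are illustrative.
SAMPLE = """\
m=audio 9 UDP/TLS/RTP/SAVPF 111 63 97
a=rtpmap:111 opus/48000/2
a=fmtp:111 minptime=10;useinbandfec=1
a=rtpmap:63 red/48000/2
a=fmtp:63 111/111
a=rtpmap:97 AMR-WB/16000
a=fmtp:97 octet-align=1; max-red=220
"""


def fec_support(media_section: str) -> dict:
    """Report which audio FEC indicators one SDP "m=" section carries."""
    names = {}  # payload type -> lowercase encoding name
    fmtp = {}   # payload type -> raw "a=fmtp" parameter string
    for line in media_section.splitlines():
        line = line.strip()
        m = re.match(r"a=rtpmap:(\d+) ([^/]+)/", line)
        if m:
            names[m.group(1)] = m.group(2).lower()
            continue
        m = re.match(r"a=fmtp:(\d+) (.*)", line)
        if m:
            fmtp[m.group(1)] = m.group(2)

    def params(pt: str) -> dict:
        # Split "key=value;key=value"; entries without "=" are ignored.
        pairs = (p.strip().split("=", 1) for p in fmtp.get(pt, "").split(";"))
        return {p[0].strip(): p[1].strip() for p in pairs if len(p) == 2}

    result = {"red": False, "opus_inband_fec": False, "amr_max_red": None}
    for pt, name in names.items():
        if name == "red":
            result["red"] = True
        elif name == "opus":
            result["opus_inband_fec"] = params(pt).get("useinbandfec") == "1"
        elif name in ("amr", "amr-wb") and "max-red" in params(pt):
            result["amr_max_red"] = int(params(pt)["max-red"])
    return result


if __name__ == "__main__":
    print(fec_support(SAMPLE))
```

An answerer built along these lines would decline redundant encoding simply by leaving the "red" payload type out of its own media section.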
Receivers MUST include this
parameter, and set it to an appropriate value, as specified in
, Table 6.3.
FEC for Video Content
The following section provides guidance on how to best use FEC for
transmitting video data. As indicated in "Adaptive Use of FEC"
below, FEC should only be activated if
network conditions warrant it, or upon explicit application request.
Recommended Mechanism
Video frames, due to their size, often require multiple RTP packets.
As discussed above, a separate FEC stream can protect multiple packets
with a single FEC packet. In addition, the Flexible FEC mechanism
described in
is also capable
of protecting multiple RTP streams via a single FEC stream, including
all the streams that are part of a BUNDLE group
. As a
result, for video content, use of a separate FEC stream with the
Flexible FEC RTP payload format is RECOMMENDED.
To process the incoming FEC stream, the receiver can demultiplex it
by SSRC, and then correlate it with the appropriate primary stream(s)
via the CSRC(s) present in the RTP header of Flexible FEC repair packets, or
the SSRC field present in the FEC header of Flexible FEC retransmission
packets.
Negotiating Support
Support for an SSRC-multiplexed Flexible FEC stream to protect a given RTP
stream SHOULD be indicated by including video/flexfec (described in
) as
an additional supported media type for the associated "m=" section in the
SDP offer
. As mentioned above, when BUNDLE is used,
only a single Flexible FEC repair stream will be created for each BUNDLE
group, even if Flexible FEC is negotiated for each primary stream.
Answerers can reject the use of SSRC-multiplexed FEC by not
including the video/flexfec media type in the corresponding "m=" section in
the SDP answer.
Use of FEC-only "m=" lines, and grouping using the SDP group mechanism
as described in
, is not currently defined for
WebRTC, and SHOULD NOT be offered.
Answerers SHOULD reject any FEC-only "m=" lines, unless they
specifically know how to handle such a thing in a WebRTC context
(perhaps defined by a future version of the WebRTC specifications).
FEC for Application Content
WebRTC also supports the ability to send generic application data, and
provides transport-level retransmission mechanisms to support full and
partial (e.g., timed) reliability. See
for details.
Because the application can control exactly what data to send, it has
the ability to monitor packet statistics and perform its own
application-level FEC if necessary.
As a result, this document makes no recommendations regarding FEC for
the underlying data transport.
Implementation Requirements
To support the functionality recommended above, implementations MUST
be able to receive and make use of the relevant FEC formats for their
supported audio codecs, and MUST indicate this support, as described in
"FEC for Audio Content". Use of these formats when sending, as
mentioned above, is RECOMMENDED.
The general FEC mechanism described in
SHOULD also be
supported, as mentioned in
"FEC for Video Content".
Implementations MAY support additional FEC mechanisms if desired, e.g.,
.
Adaptive Use of FEC
Because use of FEC always causes redundant data to be transmitted, and
the total amount of data must remain within any bandwidth limits indicated
by congestion control and the receiver, this will lead to less bandwidth
available for the primary encoding, even when the redundant data is not
being used. This is in contrast to methods like RTX
or Flexible FEC's retransmission mode (),
which only transmit redundant data when necessary, at the cost of an
extra round trip and thereby increased media latency.
Given this, WebRTC implementations SHOULD prefer using RTX or
Flexible FEC retransmissions instead of FEC when the connection RTT is within
the application's latency budget, and otherwise SHOULD only
transmit the amount of FEC needed to protect against the observed packet
loss (which can be determined, e.g., by monitoring transmit packet loss
data from RTP Control Protocol (RTCP) receiver reports
), unless the application indicates it is
willing to pay a quality penalty to proactively avoid losses.
Note that when probing bandwidth, i.e., speculatively sending extra
data to determine if additional link capacity exists, FEC data SHOULD be
used as the additional data. Given that extra data is going to be sent
regardless, it makes sense to have that data protect the primary payload;
in addition, FEC can typically be applied in a way that increases
bandwidth only modestly, which is necessary when probing.
When using FEC with layered codecs, e.g.,
, where only base layer frames are critical to
the decoding of future frames, implementations SHOULD only apply FEC to
these base layer frames.
Finally, it should be noted that, although applying redundancy is often
useful in protecting a stream against packet loss, if the loss is caused
by network congestion, the additional bandwidth used by the redundant
data may actually make the situation worse and can lead to significant
degradation of the network.
Security Considerations
In the WebRTC context, FEC is specifically concerned with recovering
data from lost packets; any corrupted packets will be discarded by the
Secure Real-Time Transport Protocol (SRTP)
decryption process. Therefore, as described
in , the default processing when
using FEC with SRTP is to perform FEC followed by SRTP at the sender, and
SRTP followed by FEC at the receiver. This ordering is used for all the
SRTP protection profiles used in DTLS-SRTP
, which are enumerated in
.
Additional security considerations for each individual FEC mechanism
are enumerated in their respective documents.
IANA Considerations
This document requires no actions from IANA.
References
Normative References
"Key words for use in RFCs to Indicate Requirement Levels"
"RTP Payload for Redundant Audio Data"
"An Offer/Answer Model with Session Description Protocol (SDP)"
"RTP Payload Format and File Storage Format for the Adaptive Multi-Rate (AMR) and Adaptive Multi-Rate Wideband (AMR-WB) Audio Codecs"
"Forward Error Correction Grouping Semantics in the Session Description Protocol"
"RTP Payload Format for the Opus Speech and Audio Codec"
"Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words"
"RTP Payload Format for Flexible Forward Error Correction (FEC)"
3GPP, "IP Multimedia Subsystem (IMS); Multimedia telephony; Media handling and interaction"
Informative References
Xiph, "Subject: Opus FEC", message to the opus mailing list
"RTP: A Transport Protocol for Real-Time Applications"
"The Secure Real-time Transport Protocol (SRTP)"
"RTP Retransmission Payload Format"
"RTP Payload Format for Generic Forward Error Correction"
"RTP Payload Format for ITU-T Recommendation G.711.1"
"Framework for Establishing a Secure Real-time Transport Protocol (SRTP) Security Context Using Datagram Transport Layer Security (DTLS)"
"Datagram Transport Layer Security (DTLS) Extension to Establish Keys for the Secure Real-time Transport Protocol (SRTP)"
"VP8 Data Format and Decoding Guide"
"A Real-time Transport Protocol (RTP) Header Extension for Client-to-Mixer Audio Level Indication"
"A Real-time Transport Protocol (RTP) Header Extension for Mixer-to-Client Audio Level Indication"
"Definition of the Opus Audio Codec"
"WebRTC Data Channels"
"Negotiating Media Multiplexing Using the Session Description Protocol (SDP)"
Acknowledgements
Several people provided significant input into this document,
including , , , ,
, , and .
Author's Address
Google
747 6th St S
Kirkland, WA 98033
United States of America
justin@uberti.name