RFC Errata
RFC 6716, "Definition of the Opus Audio Codec", September 2012
Note: This RFC has been updated by RFC 8251
Source of RFC: codec (rai)
Errata ID: 4392
Status: Held for Document Update
Type: Technical
Publication Format(s): TEXT
Reported By: Peter Budny
Date Reported: 2015-06-15
Held for Document Update by: Ben Campbell
Date Held: 2015-06-15
Section 2 says:
The Opus codec scales from 6 kbit/s narrowband mono speech to 510 kbit/s fullband stereo music, with algorithmic delays ranging from 5 ms to 65.2 ms.
(further in the same section)
To compensate for the different look-ahead required by each layer, the CELT encoder input is delayed by an additional 2.7 ms.
It should say:
The Opus codec scales from 6 kbit/s narrowband mono speech to 510 kbit/s fullband stereo music, with algorithmic delays ranging from 5 ms to 66.5 ms.
(further in the same section)
To compensate for the different look-ahead required by each layer, the CELT encoder input is delayed by an additional 4 ms.
Notes:
For the latter text, the delays for the CELT and SILK layers must match. SILK has a "5 ms look-ahead for noise shaping estimation" and a "1.5 ms delay for sampling rate conversion", totaling 6.5 ms. CELT has a "2.5 ms look-ahead due to the overlapping MDCT windows". Thus, the delay needed to align CELT with SILK is 6.5 ms - 2.5 ms = 4 ms.
The text at the beginning of the section must reflect this as well. The "algorithmic delays" reported there apparently include framing: the minimum is the minimum frame size (2.5 ms) plus the look-ahead delay in CELT-only mode (2.5 ms), which gives 5 ms. On the same basis, the maximum delay is the maximum frame size (60 ms) plus the maximum delay for look-ahead and resampling (6.5 ms), which gives 66.5 ms.
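The arithmetic in these notes can be checked with a short sketch. The constant and function names below are illustrative only and do not come from RFC 6716 or the Opus reference implementation; the millisecond values are the ones quoted in the notes, and the assumption that the reported algorithmic delay is frame size plus look-ahead follows the reporter's reading.

```python
# All values are in milliseconds and are taken from the notes above.
# Names are hypothetical; they are not identifiers from the Opus sources.
SILK_LOOKAHEAD_MS = 5.0   # look-ahead for noise shaping estimation
SILK_RESAMPLER_MS = 1.5   # delay for sampling rate conversion
CELT_LOOKAHEAD_MS = 2.5   # look-ahead from overlapping MDCT windows
MIN_FRAME_MS = 2.5        # smallest frame size (CELT-only mode)
MAX_FRAME_MS = 60.0       # largest frame size


def silk_total_delay() -> float:
    """Total SILK-side delay: look-ahead plus resampling."""
    return SILK_LOOKAHEAD_MS + SILK_RESAMPLER_MS


def celt_alignment_delay() -> float:
    """Extra delay applied to the CELT input so both layers line up."""
    return silk_total_delay() - CELT_LOOKAHEAD_MS


def min_algorithmic_delay() -> float:
    """Assumes delay = frame size + look-ahead, in CELT-only mode."""
    return MIN_FRAME_MS + CELT_LOOKAHEAD_MS


def max_algorithmic_delay() -> float:
    """Assumes delay = largest frame + SILK look-ahead and resampling."""
    return MAX_FRAME_MS + silk_total_delay()


if __name__ == "__main__":
    print("CELT alignment delay:", celt_alignment_delay(), "ms")
    print("Minimum algorithmic delay:", min_algorithmic_delay(), "ms")
    print("Maximum algorithmic delay:", max_algorithmic_delay(), "ms")
```

Under those assumptions the functions reproduce the corrected figures of 4 ms, 5 ms, and 66.5 ms rather than the 2.7 ms and 65.2 ms in the published text.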
Confirmed by author ("Jmvalin") at https://en.wikipedia.org/wiki/Talk:Opus_(audio_format)/Archive_1#Codec_delay_values