Monitoring of Audio for HDTV Broadcast

As broadcast technology continues to evolve at a seemingly ever-increasing rate, the video element of the experience often takes precedence. The transition from SD to HD is, if not fully complete, then certainly well on the way. Further enhancements in the form of Ultra High Definition, be it 4K or even 8K, have been successfully demonstrated, and usable hardware is now beginning to appear. In the midst of all this, we should be careful not to forget the part that audio plays in broadcast. After all, television without pictures is still radio, but television without sound is just a silent movie. In this article, we will look at techniques and recommendations for ensuring that the audible experience complements the visual one.

So, if you are creating or processing audio for HDTV broadcast, what criteria should you keep in mind to ensure maximum viewer enjoyment? An obvious starting point is the subjective quality of the mix itself. To judge this reliably and repeatably, a correctly calibrated listening environment should be considered essential, along with your preferred set of monitors. When mixing TV content, general consensus suggests that a level of 79 dB SPL at the mixing position, measured C-weighted with pink noise, is appropriate. A monitoring control unit that allows individual speaker level and EQ adjustment is ideal for this.
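The calibration arithmetic itself is simple and can be sketched in a few lines of Python. This is only an illustration: the speaker names and meter readings are invented, and the function name is hypothetical. It assumes each speaker has been measured individually with pink noise at the mixing position, and returns the trim each one needs to reach the 79 dB SPL reference.

```python
TARGET_SPL_DB = 79.0  # reference level quoted above for TV mixing


def speaker_trims(measured_spl_db, target_db=TARGET_SPL_DB):
    """Return the gain trim in dB each speaker needs to reach the target level.

    measured_spl_db maps a speaker name to its measured SPL in dB.
    A negative trim means the speaker is too loud and must be turned down.
    """
    return {name: round(target_db - spl, 1) for name, spl in measured_spl_db.items()}


# Hypothetical SPL meter readings, one per speaker
readings = {"L": 80.5, "R": 79.0, "C": 77.2}
print(speaker_trims(readings))
```

In practice the trims would be entered into the monitoring controller and the measurement repeated to confirm the result.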

Another element of critical importance these days is the perceived loudness level, which must comply with the relevant local standard. This requires a loudness meter based on ITU-R BS.1770-3, whose built-in K-weighting and relative gating give a reliable reading in LUFS/LKFS. The reading can be used as a visual guide to check that the integrated, or program, loudness value is in line with the target value. Equally important is monitoring the loudness range (LRA) of the mix. Although no specific targets are defined, LRA should be kept at a level appropriate for the final destination of the content. Wall-mounted flat-panel TVs and smaller portable devices in noisy environments will not be able to reproduce large LRA values in the same way that a correctly installed home cinema system could. As a guide, an LRA of 10-15 LU would generally be considered acceptable for TV broadcast. Again, a well-specified monitoring controller ought to provide these features.
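To make the gating step more concrete, the following Python sketch shows how an integrated loudness figure could be derived from a series of per-block loudness values. It is a simplified illustration, not a complete BS.1770 meter: it assumes the K-weighted loudness of each 400 ms block has already been measured elsewhere, and the function names are hypothetical. The two thresholds are the absolute gate at -70 LUFS and the relative gate 10 LU below the loudness of the blocks that pass the absolute gate.

```python
import math

ABSOLUTE_GATE_LUFS = -70.0  # blocks quieter than this are always ignored
RELATIVE_GATE_LU = -10.0    # relative gate, measured from the ungated mean


def _power(loudness_lufs):
    # Invert L = -0.691 + 10*log10(z) to recover the block's mean-square power z
    return 10 ** ((loudness_lufs + 0.691) / 10)


def _loudness(power):
    return -0.691 + 10 * math.log10(power)


def integrated_loudness(block_lufs):
    """Gated program loudness from per-block loudness values (all in LUFS)."""
    # Stage 1: discard blocks below the absolute gate
    stage1 = [l for l in block_lufs if l > ABSOLUTE_GATE_LUFS]
    if not stage1:
        return float("-inf")  # nothing but silence

    # Stage 2: discard blocks more than 10 LU below the stage-1 average
    mean1 = sum(_power(l) for l in stage1) / len(stage1)
    threshold = _loudness(mean1) + RELATIVE_GATE_LU
    stage2 = [l for l in stage1 if l > threshold]

    # Average the surviving blocks in the power domain, then convert back
    mean2 = sum(_power(l) for l in stage2) / len(stage2)
    return _loudness(mean2)


# Hypothetical block readings: two program blocks, one quiet passage, one silence
blocks = [-23.0, -23.0, -50.0, -80.0]
print(round(integrated_loudness(blocks), 2))
```

The quiet passage and the silence are both gated out here, which is why pauses in a program do not drag the integrated value down.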

Although not always the case, it is common practice for the audio portion of HDTV broadcasts to be encoded into one of the Dolby® formats, either AC3 (Dolby Digital®) or E-AC3 (Dolby Digital Plus®), and this introduces another consideration: metadata. When the Dolby bitstream arrives at the end user's decoder, metadata parameters determine some of the decoder's behaviour and how it affects the reproduced audio.

Of all the possible metadata parameters, the prime concerns are the three D's: DRC, DOWNMIX and DIALNORM. DRC is a selectable Dynamic Range Control profile that applies a predetermined amount of compression, which is useful for matching wide dynamic range content to equipment less capable of reproducing the full range. It is also increasingly common for broadcast HD video to be accompanied by 5.1 surround audio, yet many (most?) consumers will be watching on equipment that can only reproduce 2.0 stereo audio. This is where DOWNMIX comes in, instructing the decoder how to distribute the six channels of audio across the available two. Finally, DIALNORM is used to implement a scaling factor in the decoder that normalizes all audio output to a loudness level equivalent to -31 dBFS. When used correctly, this ensures that all programs are perceived as having a consistent loudness level, but it depends on the DIALNORM value accurately reflecting the actual program loudness.

All of the above means that there are multiple opportunities for inaccurate metadata parameters to cause unwanted audio effects for the viewer, so a method of verifying the values before broadcast would be advantageous. A real-time full Dolby® encode and decode sequence would be cumbersome and would introduce significant latency, so a better approach is to use metadata emulation. By this method, the effect of setting and varying metadata values can quickly be auditioned, and the way that the end user's decoder will react can be verified with confidence.
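As a rough illustration of what emulating DIALNORM and DOWNMIX involves, the Python sketch below applies both to a single 5.1 sample frame. It is a simplified sketch under stated assumptions, not a description of how any particular product or decoder is implemented. The function names are hypothetical, the -3 dB center and surround mix levels are example defaults standing in for values that would normally come from the metadata, the LFE channel is assumed to be left out of the stereo downmix, and DRC and any overload protection are not modelled.

```python
def dialnorm_attenuation_db(dialnorm):
    """Attenuation in dB needed to bring dialogue to the -31 dBFS reference."""
    if not -31 <= dialnorm <= -1:
        raise ValueError("dialnorm must lie between -31 and -1")
    return 31 + dialnorm  # e.g. dialnorm -24 gives 7 dB of attenuation


def db_to_gain(db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def downmix_lo_ro(l, r, c, ls, rs, center_db=-3.0, surround_db=-3.0):
    """Fold one 5.1 sample frame down to stereo (Lo/Ro); LFE is omitted."""
    cg = db_to_gain(center_db)    # assumed center mix level
    sg = db_to_gain(surround_db)  # assumed surround mix level
    lo = l + cg * c + sg * ls
    ro = r + cg * c + sg * rs
    return lo, ro


def emulate_stereo_output(frame, dialnorm):
    """Apply the downmix, then the DIALNORM scaling, to one sample frame."""
    lo, ro = downmix_lo_ro(*frame)
    gain = db_to_gain(-dialnorm_attenuation_db(dialnorm))
    return lo * gain, ro * gain


# Hypothetical frame (L, R, C, Ls, Rs) containing center-channel dialogue only
frame = (0.0, 0.0, 0.5, 0.0, 0.0)
print(emulate_stereo_output(frame, dialnorm=-24))
```

Changing the `dialnorm` argument or the mix levels and listening to the result is, in essence, what auditioning metadata by emulation means: the effect is heard immediately, without a full encode and decode cycle.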

D*AP8 MAP Edition

For more information about how the Jünger D*AP8 MAP Edition can be a powerful aid with all of the issues mentioned here, please visit our D*AP8 MAP Edition product page.