5.1 Surround Sound Compatibility Within HD Radio And The Existing FM-Stereo Environment

There are currently four proposed methods for surround broadcasting: the three matrix systems and the MPEG Spatial system. Frank Foti compares the systems and points out critical technical issues that broadcasters must be aware of as they consider surround.


Frank Foti
Omnia Audio
May, 2005

Preface

Surround is the Killer App for HDFM broadcasting. Based upon the tremendous response at NAB2005, it is clear that more and more people are becoming convinced that FM radio has the potential to take a major step forward with this tech.

As of this writing, there are four proposed methods for surround broadcasting. They can be divided into two categories: the three matrix systems and the MPEG Spatial system. This paper will point out some critical technical issues that broadcasters must be aware of as they consider surround, lest they degrade and damage their FM-Stereo service.

FM-Stereo, Revisited

First, a short review of the FM-Stereo multiplex transmission system, or mpx for short. This system has been operating successfully since 1961. The FM-Stereo system was designed to be compatible with the mono-only system it followed. The Left and Right audio channels are summed to create a mono signal that is received on mono radios. The two channels are also subtracted to create a stereo-difference signal, which is broadcast along with the mono sum signal. From these two signals, the original Left and Right audio channels are recovered: adding them gives you the Left and subtracting them gives you the Right. This is expressed in the algebraic equations: 2L = (L+R) + (L-R) and 2R = (L+R) – (L-R).
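The sum-and-difference algebra above can be checked with a few lines of Python. This is only a sketch of the arithmetic for a single sample pair; the function names are hypothetical and no real encoder or decoder is implied.

```python
def encode_sum_diff(left, right):
    """Form the mono sum (L+R) and the stereo difference (L-R)."""
    return left + right, left - right


def decode_sum_diff(total, diff):
    """Recover Left and Right from the sum and difference signals."""
    left = (total + diff) / 2    # 2L = (L+R) + (L-R)
    right = (total - diff) / 2   # 2R = (L+R) - (L-R)
    return left, right


total, diff = encode_sum_diff(0.75, -0.25)
print(total, diff)                   # sum and difference signals
print(decode_sum_diff(total, diff))  # original Left and Right again
```

A mono radio simply uses the sum signal and ignores the difference, which is what makes the scheme backward compatible.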

The L+R signal occupies a 15kHz range from DC to 15kHz. The L-R signal is added using double-sideband suppressed-carrier (DSBSC) modulation, with the carrier centered at 38kHz. Because of the double sidebands, the L-R signal spans 30kHz, which results in spectrum occupancy between 23kHz and 53kHz. The 38kHz suppressed carrier is derived by doubling a 19kHz pilot tone. The pilot also signals to a receiver that a stereo broadcast is present. The block diagram below illustrates the spectrum (after FM demodulation) of the FM-Stereo multiplex system:

[Figure: spectrum of the FM-Stereo multiplex system]
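For readers who prefer code to diagrams, here is a minimal Python sketch of the multiplex signal described above. The 19kHz, 38kHz and 15kHz figures come from the text; the `audio_gain` and `pilot_gain` scale factors are illustrative assumptions, not values taken from this paper, and the function names are hypothetical.

```python
import math

PILOT_HZ = 19_000.0              # pilot tone
SUBCARRIER_HZ = 2 * PILOT_HZ     # suppressed carrier at twice the pilot
AUDIO_BANDWIDTH_HZ = 15_000.0    # bandwidth of the L+R and L-R audio


def subchannel_band_hz():
    """Lower and upper edges of the DSBSC L-R subchannel."""
    return (SUBCARRIER_HZ - AUDIO_BANDWIDTH_HZ,
            SUBCARRIER_HZ + AUDIO_BANDWIDTH_HZ)


def composite_sample(left, right, t, audio_gain=0.45, pilot_gain=0.1):
    """One baseband multiplex sample at time t (seconds).

    audio_gain and pilot_gain are assumed, illustrative scale factors.
    """
    pilot = math.sin(2 * math.pi * PILOT_HZ * t)
    subcarrier = math.sin(2 * math.pi * SUBCARRIER_HZ * t)
    main = audio_gain * (left + right)               # DC to 15 kHz
    sub = audio_gain * (left - right) * subcarrier   # 23 kHz to 53 kHz
    return main + sub + pilot_gain * pilot


print(subchannel_band_hz())
```

Note that when Left equals Right the `sub` term is zero: a mono program puts no energy at all into the 23kHz – 53kHz range.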

The L-R component is the part of the signal most sensitive to transmission impairments. As you’ve probably experienced, multipath can very annoyingly disrupt stereo performance. Whenever L-R content exists in the FM-Stereo transmission, spectrum is generated in the 23kHz – 53kHz range. This frequency range also bears on the modulation index, the ratio of peak deviation to modulating frequency, which determines the number of significant RF sideband pairs created by the frequency modulation process. Raising the modulation level raises the index and generates more sideband pairs; raising the modulating frequency lowers the index. The fundamental technical reason why increasing L-R worsens multipath is that the modulation index of the stereo subchannel (23kHz – 53kHz) is much lower than the modulation index of the main channel. Multipath rejection in an FM system is a function of the receiver's capture ratio, which improves as modulation index increases.
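To put rough numbers on the comparison between main channel and subchannel, the conventional definition of modulation index (peak deviation divided by modulating frequency) can be evaluated directly. The 75kHz peak deviation below is an assumption (the usual 100% modulation figure for FM broadcasting) and does not come from this paper; treating a single tone as if it had the full deviation to itself is also a simplification.

```python
PEAK_DEVIATION_HZ = 75_000.0  # assumed: 100% modulation, not stated in the paper


def modulation_index(peak_deviation_hz, modulating_hz):
    """Modulation index = peak deviation / modulating frequency."""
    return peak_deviation_hz / modulating_hz


for label, freq_hz in [("main channel, top edge", 15_000.0),
                       ("subchannel, bottom edge", 23_000.0),
                       ("subchannel, carrier", 38_000.0),
                       ("subchannel, top edge", 53_000.0)]:
    index = modulation_index(PEAK_DEVIATION_HZ, freq_hz)
    print(f"{label}: {index:.2f}")
```

Even under these generous assumptions, every frequency in the subchannel yields a lower index than any frequency in the main channel.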

Multipath occurs when the FM signal arrives at the receiver via multiple paths - hence the name. FM transmission is line-of-sight, but the signal will bounce off buildings and other tall objects, and this creates the numerous paths and subsequent distortions. A signal following a reflected path will arrive slightly delayed compared to the direct signal, as it will have traveled a longer distance. Here is where some basic physics comes into play. The reception problems caused by multipath result from the vector (amplitude and phase) differences between the RF paths, and not from the audio signals contained within the FM-Stereo system. On account of this, the delayed signal, upon entering the receiver, creates cancellations and attenuations due to phase/time-delay.
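The vector addition of a direct and a reflected RF path can be sketched with a simple two-ray model. This is a textbook simplification, not a model of any particular receiver or site; the carrier frequency, path lengths and reflection strength below are hypothetical values chosen only for illustration.

```python
import cmath
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0


def two_path_gain_db(carrier_hz, extra_path_m, reflection_gain):
    """Level of direct + reflected carrier relative to the direct path alone.

    extra_path_m is how much farther the reflection travels; reflection_gain
    is its amplitude relative to the direct signal (hypothetical inputs).
    """
    delay_s = extra_path_m / SPEED_OF_LIGHT_M_S
    phase = -2 * math.pi * carrier_hz * delay_s
    total = 1 + reflection_gain * cmath.exp(1j * phase)
    magnitude = abs(total)
    return -math.inf if magnitude == 0 else 20 * math.log10(magnitude)


carrier_hz = 98.1e6  # hypothetical FM carrier
for extra_m in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    level = two_path_gain_db(carrier_hz, extra_m, 0.8)
    print(f"extra path {extra_m:3.1f} m: {level:6.1f} dB")
```

Because the result depends on the RF phase, moving the receiver by a meter or so swings the level from reinforcement to deep cancellation, regardless of what audio is being carried.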

The L-R part of the mpx signal is extremely sensitive to any form of disruption created by multipath. When delayed instances of the L-R signal are present, the stereo decoder in a receiver becomes confused because it will not know whether to decode the original or the delayed versions of the signal. The resulting stereo sound field is destroyed and there is the audible distortion we’ve all heard.

The L-R modulation level is critical. Increased L-R level generates additional mpx sub-channel spectrum in the 23kHz – 53kHz domain, and this range of frequencies becomes very fragile during instances of multipath. This is the main reason why manipulation of the L-R signal for stereo enhancement has been problem-prone. Most algorithms used for stereo enhancement generate additional L-R level and this exaggerates the irritating audible effects of multipath.

This is all quite real. These are not theoretical speculations, but rather real-world on-air experiences.

It is worth pointing out that modulation levels in FM transmission are governed by audio dynamics processing. Normal processing yields enough RMS modulation in the L-R signal without exaggerating multipath. Aggressive processing does have the potential to cross the threshold into multipath distortion. But the usual Left/Right processing does not alter the ratio of L-R to L+R, and it is an increase in that ratio that is the primary cause of multipath-related problems.

Matrix Surround Methods and Multipath

Now to our central topic. Methods that use matrixing for surround are not new. Most of the quad systems of the 1970s employed some form of matrix method. The difference today is that digital implementations offer more flexibility for the encode/decode process. They are capable of moderate surround performance on some types of content, but lack consistency with regard to all content.

A matrix surround encoder can accept a 5.1 multichannel input and produce a stereo Left-total (Lt) and Right-total (Rt) output. The Lt and Rt are a downmix of the surround channels with embedded position cues. This process is based on phase and level changes applied to the multichannel audio signals. Each matrix system applies this technique a bit differently, but the key element is that each of them alters the phase relationships between the channels as a means to identify the individual 5.1 channels within the 2-channel stereo signal.

Due to the phase modifications, the resulting stereo downmix now contains altered levels within the L+R and L-R signals. In many cases the L-R RMS level is significantly increased when compared to the artistic stereo mix (the independent stereo mix offered by the artist/producer) of the same content. When matrix-generated downmixes are broadcast in FM-Stereo, L-R modulation level is significantly increased. And as we’ve seen, increased L-R modulation makes for increased multipath!
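The direction of this effect can be illustrated with a deliberately crude toy model. The sketch below is not the algorithm of the Neural encoder or of any other real matrix system: `toy_downmix` is a hypothetical fold-down that adds one surround signal to both outputs, either in the same polarity or in opposite polarity, and the test tones and gains are arbitrary assumptions.

```python
import math


def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def diff_to_sum_ratio(left, right):
    """RMS of L-R divided by RMS of L+R (linear ratio, not dB)."""
    total = rms([l + r for l, r in zip(left, right)])
    diff = rms([l - r for l, r in zip(left, right)])
    return math.inf if total == 0 else diff / total


def toy_downmix(front_l, front_r, surround, right_polarity):
    """Hypothetical fold-down: surround goes to Lt as-is and to Rt with the
    given polarity (+1 = in phase, -1 = opposite phase)."""
    k = math.sqrt(0.5)
    lt = [l + k * s for l, s in zip(front_l, surround)]
    rt = [r + right_polarity * k * s for r, s in zip(front_r, surround)]
    return lt, rt


def tone(freq_hz, count=4800, rate_hz=48_000):
    return [math.sin(2 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


front_l = tone(440.0)
front_r = [0.8 * v for v in front_l]   # nearly mono front image
surround = tone(660.0)

in_phase = diff_to_sum_ratio(*toy_downmix(front_l, front_r, surround, +1))
anti_phase = diff_to_sum_ratio(*toy_downmix(front_l, front_r, surround, -1))
print(f"in-phase fold-down: {in_phase:.2f}  opposite-phase fold-down: {anti_phase:.2f}")
```

In this toy case the surround content lands in L+R when folded down in phase and in L-R when folded down in opposite phase, so the L-R to L+R ratio rises sharply in the second case.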

Proof Positive

The following X-Y plots were taken of music content that illustrates the exaggeration of the L-R signal by a matrix method. The well-known song “Wouldn’t It Be Nice” by The Beach Boys was used for the demonstration. A recently released stereo mix from the CD “Pet Sounds” was used for the artistic stereo version. The DVD-Audio disc of “Pet Sounds” contains a 5.1 Surround mix, which was encoded through a Neural Audio 5225 downmix unit. (This particular unit was provided to me personally by the CEO of Neural Audio. Presumably, there is no defect in the unit’s performance.)

These X-Y plots were gathered using a digital oscilloscope that has storage capability. The Left channel is connected to the (X) input and the Right channel to the (Y) input. The scope is set to measure the two signals in X-Y mode. This yields a pattern that is commonly used to measure phase differences between two signals. An in-phase signal will show a straight line at 45 degrees, the top to the right. Likewise, an out-of-phase signal yields a straight line at minus 45 degrees, the top to the left.
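The way an X-Y trace leans can also be computed rather than eyeballed. The sketch below estimates the angle of the trace's principal axis from its second moments; it assumes zero-mean signals and is merely a numerical stand-in for reading the scope display, not the measurement method used for the plots in this paper. The function name is hypothetical.

```python
import math


def xy_axis_angle_deg(x_samples, y_samples):
    """Angle of the principal axis of an X-Y trace, in degrees from the X axis.

    Assumes zero-mean signals. +45 means in phase, -45 means out of phase.
    """
    sxx = sum(x * x for x in x_samples)
    syy = sum(y * y for y in y_samples)
    sxy = sum(x * y for x, y in zip(x_samples, y_samples))
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))


left = [math.sin(2 * math.pi * 1000.0 * n / 48_000) for n in range(480)]
in_phase_right = list(left)
out_of_phase_right = [-v for v in left]

print(xy_axis_angle_deg(left, in_phase_right))       # leans to the right
print(xy_axis_angle_deg(left, out_of_phase_right))   # leans to the left
```

Real program material produces a cloud rather than a line; the closer the cloud's long axis sits to minus 45 degrees, the more the content is dominated by L-R.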

The following illustrates the test setup:

The first plot is of the artistic stereo mix:


X-Y Plot: “Wouldn’t It Be Nice” (Artistic Stereo)

This signal appears normal. The content is predominantly in the L+R domain, with a moderate amount of L-R. Now, here is the same segment of the song that was downmixed via the Neural 5225 Surround encoder:


X-Y Plot: “Wouldn’t It Be Nice” (Neural 5225 Downmixed Stereo)

Not only is this significantly different, but the extreme level of L-R content indicates that this will also sound noticeably quieter in mono, as the amount of 180-degree out-of-phase information is very high. This would generate added multipath, and to make matters worse, mono is compromised so much that the perceived mono audio level is down by 3dB or more, a huge loudness loss! The Beach Boys’ piece is not an isolated case; most music material we tested showed a significant L-R increase when downmixed with this matrix encoder.
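A figure on the order of 3dB is what simple arithmetic predicts once Left and Right stop being correlated. The sketch below compares the RMS level of the mono sum with the best case, in which the two channels are perfectly in phase; it is an illustrative calculation under that assumption, not the method used to obtain the figure quoted above, and the function names are hypothetical.

```python
import math


def rms(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mono_sum_loss_db(left, right):
    """Level of L+R relative to the in-phase best case, in dB (0 or negative)."""
    best_case = rms(left) + rms(right)
    if best_case == 0:
        raise ValueError("both channels are silent")
    actual = rms([l + r for l, r in zip(left, right)])
    return -math.inf if actual == 0 else 20 * math.log10(actual / best_case)


n = range(480)  # ten full cycles of 1 kHz at 48 kHz
sine = [math.sin(2 * math.pi * 1000.0 * i / 48_000) for i in n]
cosine = [math.cos(2 * math.pi * 1000.0 * i / 48_000) for i in n]
inverted = [-v for v in sine]

print(mono_sum_loss_db(sine, sine))      # identical channels: no loss
print(mono_sum_loss_db(sine, cosine))    # 90 degrees apart: about 3 dB down
print(mono_sum_loss_db(sine, inverted))  # fully out of phase: total cancellation
```

Anything that pushes the channels past 90 degrees toward full opposition costs more than 3dB in mono, ending in complete cancellation at 180 degrees.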

While this points up a significant problem specifically in the Neural system, chances are that the other matrix methods will have similar issues. Probably the Neural algorithm could be modified to reduce L-R level, but then there would be poorer surround performance.

Multipath aside, there is often quite serious degradation of the stereo for other reasons inherent to the matrixing schemes: 1) The 5.1 mix must be mechanically downmixed to 2 channels. Producers make 5.1 versions without regard for downmix, and some music does not survive this process well. 2) Matrixing requires phase shifts between the front and rear channels as they are combined to make the “compatible” stereo output. This dulls aural impact and often just sounds “weird.” All matrix methods compromise surround and stereo/mono performance in some fashion. There is no free lunch with the matrix methodology. Even the folks at Sansui now admit to that!

Conclusion

It is vital that any proposed method for surround broadcasting not compromise the transmission performance of stereo or mono. Likewise, there must not be anomalies that exaggerate multipath. The matrix systems fail in both of these areas.

Contrast matrix with the superior performance of the MPEG Spatial System [1]. Taking full advantage of HD Radio’s capabilities, the surround spatial information is transmitted in a separate digital side channel and the original artistic stereo version is broadcast without modification. There is absolutely zero aggravation of multipath because the L-R level is unchanged. Further, MPEG Spatial offers uncompromised surround sound with full separation.

If you are to believe the matrix proponents, their method is a simple solution that can be readily integrated into existing airchains. But there are real problems. To date, there is not a single mainstream station using a matrix system for routine programming. The few matrix broadcasts have been special demonstration program segments on public stations. The tests have been with classical or jazz concerts, which are produced with mostly ambience audio in the surround channels. Pop music production is very different, with elements dramatically positioned around the listener – a case much more difficult for matrix to handle. Let’s see how a matrix system performs on this kind of music being aired on an aggressively processed CHR station in New York City or Los Angeles! Until that happens, take the matrix claims with a big grain of salt.


[1] The Reference Model Architecture for MPEG Spatial Audio Coding. J. Herre, H. Purnhagen, J. Breebaart, C. Faller, S. Disch, K. Kjörling, E. Schuijers, J. Hilpert, F. Myburg; 118th AES Convention, Barcelona, Spain, May 2005.
