Your Listening Room is Double Tracking Your Sweetspot

“Double tracking” is a recording technique where a singer records a song to get an original track, then re-records the same song to get a second or double track. The engineer mixes the original with the double track to get a double tracked sound, which usually sounds a lot better and thicker than the original.

The same thing happens in hi-fi playback rooms.

The “original track” is the direct sound that travels from the speaker to the listening position. The second or “double track” is the sound of the speaker after it has been multiply reflected off the surfaces in the listening room. At the listening position our ear-brain system mixes these two simultaneous tracks together into a double tracked sound, which usually sounds a lot worse, with lumpy tonality and blurred dynamics compared to the original sound.

Sometimes people think the goal of room acoustics is to get rid of that double tracking so only the original sound is heard. That would require a room without reflections, an anechoic chamber (literally, without echoes), which is also a well-known acoustically hostile environment. We don’t want to completely eliminate the double tracking effect of room acoustics, but we do want to make it more acceptable or friendly to hi-fi playback. We probably want the double tracking room acoustic to have tonality and dynamics similar to those of the original track, and most likely to be turned down in volume a bit.

The goal of room acoustic upgrades is to re-voice the sound of the room so that it more faithfully replicates the direct signal, as opposed to being a loud and wild party of sonic chaos.

How to Listen to the Room Acoustic Double Track

There is a technique by which we can actually hear the double track. Run a stereo pair off the line level out of the speaker preamp into a mixer and mix them into a mono track. Run the mono track into a time delay set to about 7 ms, reverse the absolute phase (invert the polarity) of this signal and run it into the left input of a second mixer.

In the meantime, set up an omni mic at the listening position. It will simultaneously record an acoustic mix into mono of two tracks, the first being the direct signal or original track and the other being the room acoustic or double track. Run this mic signal into a mic pre and then into the right channel of the second mixer. Switch the mixer output to mono so the left and right signals are combined at the headphone output, which presents the room acoustic double track we wanted to hear.

By adding the time-delayed, phase-reversed version of the signal being sent to the speaker to the acoustic signal sensed by the mic, the original direct signal is subtracted (cancelled) from the doubled signal, direct plus room acoustic, leaving only the double track, the sound track of the room acoustic. This ensemble of reflections, the double track, is derived from the original direct signal being transmitted into the room. The direct plus the reflections, that is, the original plus the double track, are acoustically mixed together at the mic, the listening position. Built into this exercise is the generally realized assumption that the direct signal transmitted by the speakers to the listening position is independent of any room acoustic artifact and is an accurate, linear version of the signal being sent to the speaker.
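The cancellation described here can be illustrated with a toy simulation. This is only a sketch under idealized assumptions: the direct path is a pure delay with a known gain, each reflection is a delayed and scaled copy of the source, and everything is in whole samples. The function names (`delay`, `mic_signal`, `room_only`) and the reflection values are hypothetical, chosen for illustration.

```python
def delay(signal, n):
    """Delay a signal by n samples, zero-padding the front and keeping the length."""
    n = min(n, len(signal))
    return [0.0] * n + list(signal[:len(signal) - n])

def mic_signal(source, direct_delay, reflections):
    """Acoustic mix at the mic: direct sound plus (delay_samples, gain) reflections."""
    out = delay(source, direct_delay)
    for d, g in reflections:
        out = [a + g * b for a, b in zip(out, delay(source, d))]
    return out

def room_only(mic, source, direct_delay, direct_gain=1.0):
    """Add the delayed, polarity-inverted source to the mic signal."""
    cancel = [-direct_gain * s for s in delay(source, direct_delay)]
    return [m + c for m, c in zip(mic, cancel)]

# A single tick as the test signal (12 samples long).
tick = [1.0] + [0.0] * 11
# Assumed room: direct arrival at sample 3, two reflections later on.
mic = mic_signal(tick, 3, [(5, 0.5), (8, -0.25)])
double_track = room_only(mic, tick, 3)
print(double_track)  # the direct tick at index 3 is gone, the reflections remain
```

In a real room the direct path is not a perfect delayed copy of the line-level signal, so the cancellation is only partial, which is why the text leans on the assumption that the speaker delivers an accurate, linear version of its input.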

If we want to double check the time delay on the direct or original signal, measure the distance between the front of the speaker baffle and the center of the microphone in inches and divide by 13.54 (sound travels about 13.54 inches per millisecond) to get the delay in milliseconds (ms). An 8 foot setback from the speaker would be 8 x 12 / 13.54 = 7.09 ms of delay. This calculation gives the rough setting for the delay between the direct signal leaving the speaker and arriving at the microphone.
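The arithmetic above is easy to script. The following is a small sketch that uses the same 13.54 inches per millisecond figure; the function names and the 48 kHz sample rate (for anyone setting a digital delay in samples) are assumptions, not part of the original procedure.

```python
SPEED_OF_SOUND_IN_PER_MS = 13.54  # inches per millisecond, the figure used in the text

def direct_delay_ms(distance_inches):
    """Travel time of the direct sound from speaker baffle to mic, in milliseconds."""
    return distance_inches / SPEED_OF_SOUND_IN_PER_MS

def delay_in_samples(distance_inches, sample_rate_hz=48000):
    """Same delay as a whole number of samples (sample rate is an assumed value)."""
    return round(direct_delay_ms(distance_inches) * sample_rate_hz / 1000.0)

# The 8 foot setback from the text.
print(round(direct_delay_ms(8 * 12), 2))  # 7.09
print(delay_in_samples(8 * 12))           # 340
```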

Make a live sound check. Listen over headphones while playing a series of short ticks into the speakers. Adjust the volume of the delayed speaker signal until the tick seems to be minimized. Then make minor adjustments to the time delay until the tick again seems to be minimized. Go back and readjust the gain of the delayed speaker signal to see if the sound of the direct signal can be minimized even further. At this point the direct signal, the original track, has been phase cancelled as much as possible. The only sound left in the headphones is the double track, the reflected version of the original track.
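The by-ear trimming of gain and delay can be mimicked numerically. The sketch below is a stand-in, not the procedure itself: instead of listening, it tries each whole-sample delay near the rough estimate, computes the least-squares gain for that delay, and keeps the pair that leaves the least energy. The function names, the search width and the example room are all hypothetical.

```python
def shift(signal, n):
    """Delay a signal by n samples, zero-padding the front and keeping the length."""
    n = min(n, len(signal))
    return [0.0] * n + list(signal[:len(signal) - n])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def best_cancellation(mic, source, rough_delay, search=3):
    """Return (delay, gain, residual) that minimizes the leftover energy."""
    best = None
    for d in range(max(0, rough_delay - search), rough_delay + search + 1):
        ref = shift(source, d)
        ref_energy = dot(ref, ref)
        if ref_energy == 0.0:
            continue
        gain = dot(mic, ref) / ref_energy  # least-squares gain for this delay
        residual = [m - gain * r for m, r in zip(mic, ref)]
        leftover = dot(residual, residual)
        if best is None or leftover < best[0]:
            best = (leftover, d, gain, residual)
    return best[1], best[2], best[3]

# Assumed mic capture of one tick: direct at sample 10, reflections at 12 and 17.
tick = [1.0] + [0.0] * 31
mic = [0.0] * 32
mic[10], mic[12], mic[17] = 0.8, 0.5, -0.3

found_delay, found_gain, residual = best_cancellation(mic, tick, rough_delay=9)
print(found_delay, found_gain)
```

Because the direct arrival is the strongest component in this example, minimizing the leftover energy locks onto it rather than onto a reflection. With real signals, fractional-sample delays and a band-limited tick would make the minimum broader and less exact.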

Listening Room Signal to Noise Ratio, S/N (dB)

The sound system is always located in some sort of listening room. What we end up listening to is essentially a doubling effect where the direct sound from the speaker is nearly simultaneously combined with hundreds of reflections. Often the reflected part of the sound (also known as room gain) is within a few dB of the direct sound.

We expect that the direct sound is nearly perfect and that the sum total of the reflected sound we are hearing is less than perfect, more likely bordering on sonic chaos, also known as noise. This leads us to the idea of a signal to noise ratio, S/N (dB). The signal is the direct sound from the speakers, while the noise is the simultaneously occurring sound that has been reflected off the surfaces of the room.
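The text does not give a formula for this ratio. One conventional way to put a number on it, assuming the direct and reflected parts have already been separated (for example by the cancellation technique described earlier), is the energy ratio expressed in decibels. The function names and sample values below are hypothetical.

```python
import math

def energy(samples):
    """Sum of squared sample values."""
    return sum(x * x for x in samples)

def room_snr_db(direct, reflected):
    """Direct-to-reflected energy ratio in dB (assumed definition of room S/N)."""
    return 10.0 * math.log10(energy(direct) / energy(reflected))

# Reflected sound at half the amplitude of the direct sound.
direct = [1.0, -1.0, 1.0, -1.0]
reflected = [0.5, -0.5, 0.5, -0.5]
print(round(room_snr_db(direct, reflected), 2))  # 6.02
```

A result of about 6 dB fits the earlier remark that the reflected sound is often within a few dB of the direct sound.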

Notice that this noise floor is not steady, like the noise from an air conditioner. It fluctuates in proportion to the direct signal and follows it. In large concert halls the reflections, echoes and reverberant noise floor occur well after the direct signal has passed by and are perceived as something separate from the direct signal: the ambiance of the hall.

In small listening rooms, however, room reflections are so quick that they are not heard as a separate sound and seem to occur simultaneously with the direct signal. This small-room noise floor is referred to as the running reverb. For the most part we can’t separate it from the direct signal and we don’t really know what it sounds like.

When we consider room acoustics from a signal to noise ratio perspective, we want a better S/N ratio. To improve the S/N ratio, the noise part of the ratio has to be reduced. This particularly applies to early reflections that distort the image and the stage. But there is an old saying, “Don’t throw out the baby with the bath water”. We don’t want to eliminate all the reflections in the room because we’d end up in an intolerable anechoic chamber. Since it’s a given that we must keep some reflections, we need to determine which reflections we keep and how to separate them from the ones we don’t want to keep.

Because some of the room reflections contribute a double acoustic track to the direct original track, not all of the room reflections are noise. If we eliminate the “noisy” reflections, hopefully we will be left with a clean room acoustic double track which supports the direct signal, the original track. Noise that blocks our ability to perceive certain sounds is referred to as sound masking. Noisy reflections are those that block our ability to perceive musical detail.

Sound Masking

What reflections produce sound masking? The most effective sound masking reflection is one that has the same general tones and tempo as the direct sound but is a phase scrambled version of the original sound. For example, compare someone yodeling a tune to someone gargling the same tune. The time-averaged sound spectrum will be the same in both cases, but the gargled version of the tune has lost its quality of musical clarity. What tone is being played, and when, may be the same, but the separation in timing and the difference in loudness between the tones will not be the same.

If a recording of yodeling is mixed with a recording of gargling the same song, the yodeled version of the song is masked by the gargled version.

A reflection produces a sound that has the same waveform as the original sound. It might be quieter because of diffusion, or have a different spectral balance because of absorption, but whatever parts of the waveform remain in step will amplify, not mask, the perception of the sound. For example, a wall covered with treble-absorbing sound panels will reflect the bass part of the sound wave just fine but absorb the treble. The result of a direct sound plus a bass heavy reflection is something that sounds like the original with the bass turned up, and it is still perfectly understandable.

Music or speech is a series of sets of short sound bursts separated by pauses. Sound masking interferes with the perception of the short tone bursts as well as the timing, the pace and the pauses between the short sets of bursts. When a wavefront is multiply reflected, the extended repetitive sound of that reflection pattern distracts from the perception of the initial sound. When a wave train impacts a sound resonator, its energy is not reflected but captured by the resonator. The energy in the resonator builds until the wave train ends, whereupon the resonator discharges. The waveform of the resonator discharge has nothing to do with the incident or original sound.

And so the “noisy” part of the “noise” in a room is the sound masking part of the room acoustics: flutter echoes and resonances. When we take room acoustics as a given and revise our perception of it to be a double track, our interest in room acoustics changes from “killing the noise of room acoustics” to shaping and clarifying the sound of the room’s acoustic double track so that it supports the original track, without adding tonal or dynamic emphasis or blurring.