History of Sound Fusion Recording
In the early days of recording, it was all about making live mono recordings of acoustic bands, with a bunch of mics wired direct to tape. Next came the multitrack, and recording evolved into click tracks, isobooths, post processing and mixdown sessions. Then the digital age showed up. Acoustic recording was out, sampling and DI were in, and everybody’s cousin had a home studio.
But times keep changing. The home studios have evolved into more sophisticated studios and the bands themselves have evolved. Nowadays there are more mic companies than ever. That’s because studios are doing more micing and less direct. Studios and bands want to do live, ensemble recording. Even rapping is going in this direction. Today’s studio is actually recording more real music, air breathing acoustic sound, than ever before.
But some things never change. Open up a mic and we get two kinds of sound. The first is exactly the sound we wanted to get, the direct signal from the talent. The second is exactly the sound we didn’t want to get, the sound from the room. We usually end up needing a cleaner signal at the mic and so our goal, as recording engineers, is to figure out ways to boost the direct signal and cut the room signal.
By following in the footsteps of the last few decades of recording, we try to get an acoustically dry signal, as close to an acoustic DI as possible and then perform the familiar post processing on it to get it into the mix. To do this, we have to build acoustically dead spaces. To do that, we have to kill reflections, all reflections.
But, in so doing, we are also throwing the baby out with the bath water. Who’d have guessed that some of those hated, hunted and hammered room reflections actually help make real sound, sound real? Well, in the olden days, recording was full of this type of “real sound” and today, by implementing a few acoustic tricks, recording can once again sound real.
The Mic is a 2 Channel Acoustic Premix
In any room, the mic acts like a two track acoustic premix: the direct signal is acoustically mixed right in with the room signature, meaning reflections, echoes, reverb and general room noise. And most of the time, the room signature track is too loud. Our job is to boost the Signal to Noise Ratio (SNR). To do that we need to either boost the direct signal or fade the room noise, but usually it is some combination of both. What we want is an acoustic fader but air faders, like air guitars, don’t do much for sound.
We boost the direct by getting the talent to eat the mic and reduce the mic gain back down to zero VU. But now our talent sounds like a radio DJ and that just might not be the sound effect the producer wants. In addition, we lose control on dynamics, plosive and proximity effects. To regain control we add the wind ball, dial in EQ, compression and limiters and hope for the best.
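How much direct signal does eating the mic actually buy? Here is a minimal sketch, assuming the talent behaves like a point source (inverse-square spreading) and that the room level at the mic stays put when the talent moves. The function name and the distances are illustrative, not measurements.

```python
import math

def direct_gain_db(old_distance_m: float, new_distance_m: float) -> float:
    """Change in direct-signal level (dB) when the source moves from
    old_distance_m to new_distance_m, assuming inverse-square spreading."""
    if old_distance_m <= 0 or new_distance_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(old_distance_m / new_distance_m)

# Moving the talent from 60 cm to 15 cm is two halvings of distance,
# so roughly 6 dB + 6 dB more direct signal at the mic.
# Under the assumption that the room level is unchanged, the
# direct-to-room ratio improves by the same number of dB.
print(round(direct_gain_db(0.60, 0.15), 1), "dB")
```

The same arithmetic explains why the mic gain has to come back down after the talent moves in: the direct level rose, the room did not.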
Another way to increase the direct to room signal strength ratio is to change the mic pattern. Start closing it down, narrow the focus pattern of the mic, stopping somewhere between cardioid and shotgun. But the tighter the pattern, the more colored the voice, like singing into a mail tube, and we go back to EQ, compression and limiters to try to doctor the track into a semi-real sound. Also, moving the mic around the room can turn up some spots that don’t sound as bad as the others. But they usually are nowhere near the studio window.
And all this time, the talent is locked in a head halo, with the producer saying sweetly: “That was great, but this time, a little more emotion and a little less movement”. Despite the best of everyone’s efforts and tricks of the trade applied to boost the SNR at the mic, all too often the desired effect for the song is either lost or destroyed.
The other way to get a better SNR at the mic is to just dump the room. Kill the room and get pure sound flowing into the mic. Forget EQ, compression and limiters. Just set the mic up in a soundproof anechoic chamber and one would think we have the ultimate recording space, essentially it’s acoustic DI, all direct signal with a -80 dB noise floor. Later, this very dry signal can be revived by post processing, add some warmth and depth with a little delay reverb and some sparkle with a spank from an exciter.
Dry Recording Rooms
When working in dry rooms, any reflection is audible and sounds bad. All it takes is one reflection and the sound we are trying to get picks up a hollow effect. It’s the Comb Filter effect. This is when the desired signal is combined with a lower level and time delayed signal, in other words, an early reflection. The combination imposes a harmonic set of cancels and adds onto the original signal spectrum which sounds like the direct signal was recorded at the bottom of a drinking glass. Dry acoustic recording is very sensitive to the presence of early reflections (comb filter effect), late reflections (echo), fast repeating reflections (flutter echo), boundary loading, mode coupling and finally reverberation. Still, dry recording seems to be the primary tool for today’s recording engineer.
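The comb filter described above can be put into numbers. This is a minimal sketch, assuming one direct signal plus a single reflection that is a clean, delayed and attenuated copy of it. The function names are illustrative.

```python
import math

def comb_notches_hz(delay_s: float, f_max_hz: float = 20000.0) -> list:
    """Frequencies where direct + one delayed copy cancel.
    Cancels fall at odd multiples of 1 / (2 * delay)."""
    if delay_s <= 0:
        raise ValueError("delay must be positive")
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > f_max_hz:
            break
        notches.append(f)
        k += 1
    return notches

def comb_ripple_db(reflection_level_db: float) -> tuple:
    """Peak and notch levels (dB relative to the direct signal alone)
    for one reflection at the given level relative to the direct."""
    a = 10.0 ** (reflection_level_db / 20.0)  # amplitude ratio
    peak = 20.0 * math.log10(1.0 + a)
    notch = float("-inf") if a == 1.0 else 20.0 * math.log10(abs(1.0 - a))
    return peak, notch

# A reflection arriving 1 ms late (about 34 cm of extra path) puts
# cancels at evenly spaced frequencies up the spectrum.
print(comb_notches_hz(0.001, f_max_hz=3000.0))
# A reflection 6 dB below the direct gives a few dB of boost at the
# adds and a noticeably deeper cut at the cancels.
print(comb_ripple_db(-6.0))
```

The evenly spaced "harmonic set of cancels and adds" is what gives a single early reflection its hollow, drinking-glass character.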
The rule of thumb in a dry recording studio is “the best room is a dead room.” Engineers are trained in AE schools and the school of hard knocks to hate reflections. Engineers hunt them down and kill them (reflections) with fervor, whenever and wherever possible. One might say that recording engineers suffer from a mental condition called reflecto-phobia. It started somewhere in the 60’s when multitracking and post processing became available. Highly infectious, this impaired judgment condition reached epidemic levels in the mid 70’s and spread to nearly every recording engineer, producer and audio instructor in the industry.
Today, reflecto-phobia is rampant. Music is proudly recorded in acoustically sterile environments. Fueled by fears of comb filter coloration, every single reflection, near or far, that might ever hit a mic has been systematically exterminated over the last 30 years in recording studios. Most of today’s so called “live rooms” are now completely “reflection free zones”.
With the purge of reflections nearly complete, today’s studio music is now completely composed out of separate, sterile, acoustically dead tracks. Preparing these tracks is not much different than being the makeup artist for a sonic funeral director, where dead tracks are fluffed and stuffed and somewhat brought back to life by the paint and sparkle tools found in the FX rack.
When people listen to sound, in contrast to microphones, they generally just listen to what they want to hear and pretty much dial out the rest. People can be located pretty far from the talent, compared to a mic, and not even notice the sound of sound in the room. They just hear the talent. People are able to naturally tune the room out and focus in on the talent. The engineer with a mic has to work hard to tune the room out and focus in on the talent.
A person (as well as other critters) is a biological signal processor, not an electronic one. We use a different mechanism to hear than what is built into microphones. A by-product of our hearing system is that we automatically mix all early reflections right into the direct signal and end up hearing one composite “direct” sound. Early reflections are those that arrive within about 1/30 second following the direct signal. It doesn’t matter where those early reflections come from, they just add together (correlation signal detection) in a way that makes the perceived sound be significantly louder than the direct signal. This sound fusion process creates a composite direct signal which has easily more than twice the sound power than the direct signal alone.
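The "more than twice the sound power" figure is easy to check with a back-of-envelope sum. This is a minimal sketch, assuming the ear simply adds the power of every arrival inside the fusion window, which is a simplification of real hearing. The window comes from the text; the reflection levels are made-up examples.

```python
import math

FUSION_WINDOW_S = 1.0 / 30.0  # "about 1/30 second" from the text

def fused_level_db(reflections) -> float:
    """Composite level in dB relative to the direct signal alone.
    reflections: list of (delay_s, level_db relative to direct).
    Only arrivals inside the fusion window are added to the direct."""
    power = 1.0  # the direct signal
    for delay_s, level_db in reflections:
        if 0.0 < delay_s <= FUSION_WINDOW_S:
            power += 10.0 ** (level_db / 10.0)
    return 10.0 * math.log10(power)

# Hypothetical example: four early reflections at -6 dB each,
# plus one late arrival at 50 ms that falls outside the window.
example = [(0.005, -6.0), (0.010, -6.0), (0.015, -6.0), (0.020, -6.0), (0.050, -6.0)]
print(round(fused_level_db(example), 2), "dB re direct")
```

Under this assumption, just four modest reflections already double the power (about +3 dB), and every extra reflection inside the window adds a little more.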
Although it doesn’t matter to the sound fusion process where the early reflections come from, we aren’t confused by where the direct sound comes from because of something called the Precedence effect. We cue in on the direction of where a sound comes from by tracking and locking on where the original sound signal comes from. The process of knowing where a sound comes from is called sound localization.
There is one adjustment to sound localization that has to be mentioned here: the Haas effect. Very early reflections, those arriving within 5 ms of the direct signal, will distract us from knowing exactly where the direct signal is coming from. The perceived direction of the direct signal is somewhere between the location of the direct signal and the location of the very early reflection.
People like early reflections. Just step outside, into the middle of a large grassy field, and we can barely hear ourselves, let alone carry a tune or talk to anyone else. That’s what the DI (direct inject) version of life sounds like. Go back inside the house and everything sounds fine and you can carry a tune or a conversation. We’re made to hear direct + early reflections, and to mix them together into one “direct” sound. And this process helps us hear more easily what is going on.
The traditional, studio-dead sound tracks lack life, the quality of sound that makes sound seem to sound real, natural. Yes, there’s always the “fix it in the mix” perspective to dry recording. That means lots of time and money gets spent trying to bring back to life, dead sounding tracks. But studio recording was not always done like this.
In the early days of recording, the luxury of dead studios didn’t exist. Engineers had to record live, entire acoustic bands. They made good records in those days too. Part of their recording process inadvertently included the sound fusion process. Their mono mixes were chock full of delayed early reflection type signals and therein lies the reason they sound so whole, so much like a real recording of a real sonic event.
The Early Years
In the early days of live band recordings, 1950’s, they had one, maybe two takes and then the session was over. The idea was to use a number of mics distributed throughout the group, adjust their position and gain and get a live, hard wired mix down direct to tape on a mono track. Their goal was to capture enough signal to recreate the sound that was heard when sitting in the room. Those days are far from the idea of recording separate tracks in isobooths at various times and in various parts of the country and then mixing them together a few months later.
A good example of the tail end of the early days recording technique was in the RCA Victor StudioB in Nashville back in the 50’s and early 60’s. This topic came up during an AES Sectional presentation on the Quick Sound Field (QSF) recording technique, held there in 2003. The QSF is a modern way to acoustically capture sound fusion at a mic. StudioB had finally been renovated but it wasn’t open to the public yet. The room was full of engineers, a lot of new ones who hadn’t even been in the studio since it closed and some ole timers who worked there when they were young. After the QSF presentation was over, the question and discussion time quickly led back to the recording techniques that used to go on in that room.
StudioB is a shrine. It’s enough to just stand there, inside that room and wonder upon all those hallowed vibrations. The ones that hit the floor tiles and bounced off and those that lie buried still in the wall and ceiling tiles. So many early greats worked and played there. Elvis and the Jordanaires, Roy Orbison, the Everly Brothers, Chet Atkins and many more recorded in this old RCA Studio B.
The QSF lecture reminded the ole timers about recording in this room. They talked about the mic setups and how the band played all together, at one time, one song from start to finish, direct to tape. And that was how they made records.
This was all well before multi tracking and mixing capability became available in recording. When multi tracking came, in the 70’s, StudioB accommodated the growing interest in this “new sound” of music. The room was deadened and hosted a small village of iso sound shacks lining the walls. Eventually Nashville was overrun with recording studios and StudioB closed. Now it has been renovated back to the glory of its former years. All the sound shacks are gone now and the room has been returned to its original, one big recording room, configuration.
Back in the early days, the room had a 3-mic gain and mix to tape Ampex. Later, more mics were added. There were no isobooths. At best, there were gobos. In this environment, each mic got signal from every instrument. For example, if there were 12 mics and 6 talent sources, there would be at least one direct signal from each talent source arriving at each mic. That means that there were at least 12 different signal path versions of each talent source after mixdown. And then the early reflections have to be added in: floor bounce, glass bounce, other instruments and what not.
The net result after mono mixdown would be that each talent source would have at least 12 direct signals, with time delays ranging from 4 ms out to 25 ms, and levels ranging from zero VU down to -16 dB on the track. And then there would be the reflections, off instruments, floor, glass and what not, filling in the mix with even more random time offset signals. In a 12 mic setup there would actually be captured up to 30 or 40 distinct time delayed signal paths for each talent source. That qualifies as a Sound Fusion effect recording.
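The delay and level spread quoted above can be sketched from nothing more than the distance from one player to each open mic. This is a minimal sketch, assuming equal mic gains, inverse-distance level loss and a speed of sound of 343 m/s. The twelve distances are hypothetical, picked to span the room.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly room temperature

def path_table(distances_m):
    """For one talent source, return a (delay_ms, level_db) pair per mic.
    Level is relative to the closest mic, assuming equal mic gains."""
    nearest = min(distances_m)
    rows = []
    for d in distances_m:
        delay_ms = 1000.0 * d / SPEED_OF_SOUND_M_S
        level_db = 20.0 * math.log10(nearest / d)
        rows.append((delay_ms, level_db))
    return rows

# Hypothetical distances (meters) from one player to each of 12 mics.
distances = [1.4, 2.0, 2.5, 3.1, 3.8, 4.4, 5.0, 5.9, 6.6, 7.3, 8.0, 8.6]
for delay_ms, level_db in path_table(distances):
    print(f"{delay_ms:5.1f} ms  {level_db:6.1f} dB")
```

Under these assumptions the two ranges in the text hang together: a 4 ms path and a 25 ms path differ in length by a factor of about 6, which works out to roughly 16 dB of level difference.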
The QSF Takes Shape
It all started in 1983, shortly after the TubeTrap was invented. A cylinder shaped bass trap was designed to stand in the corners of rooms. It came with a built-in treble range diffusion panel covering the front half to keep the room brightness up. Rotating the Trap acts like a treble control and changes the brightness of the corner. Bass traps are usually located in corners because that’s where the bass builds up the most. The TubeTrap was the first factory built, UPS shippable bass trap.
It didn’t take long before the bigger studios across the country, always looking to try something new, started buying pallet loads. The seasoned engineers in those days didn’t ask questions. Their ears would tell the truth, if these tubes worked or not. These big studios already had lots of built-in bass traps and didn’t really need much in the way of tuning up, except in a few iso or drum booths.
Curiously, it was the treble range panel that caught their ear. The engineers fooled around with these acoustic cylinders and eventually set them up in a semi-circle pattern, a Stonehenge, with the dead side of the Tube facing in. They got what they expected, the room dialed out and inside, that all too familiar, Studio-Dead. Then they rotated the traps and set the bright side in, and all of a sudden, they got a sound they didn’t expect: Studio-Live. The room disappeared and the spotlight hit the talent.
The engineers called the factory to report their discovery and each engineer discovered the same thing. They discovered an acoustic space they hadn’t heard before and that sounded good, very good. They dropped a mic inside and it still sounded good, very good. Eventually the factory replicated these set ups, measured and analyzed what was going on and reported it in a series of AES papers. The Quick Sound Field was born.
The first QSF I saw was in a local studio located near the original TubeTrap factory in Eugene, Oregon, run by relocated LA engineer, Steve Diamond. We had about 30 TubeTraps and were busy tuning his live room when I noticed Stonehenge in front of the window and Steve saying “Check, check, testing one, two…”. He bolted those Tubes down right then and they stayed there till the city tore the building down, some 1000 BiMart commercials later.
A short time later at Pierce Arrow Recorders, in Chicago, engineer/owner Sam Lynn Halonen was experimenting with his first load of TubeTraps. He called in about how he could get great horn sounds. Later, he got more instruments mic’d, including drums, inside the Stonehenge pattern. More recently Sam used the QSF to remic a dry studio recording of an opera singer to add life and dimension.
Reports keep coming in, describing new ways to use the QSF effect to get good sound. The QSF setup created a Haas saturated track. It created a boost in the direct and produced a great signal to work with. It cut room so effectively that it wiped out the need for room acoustics. The QSF produces the acoustic gain adjustment needed at the mic without destroying the desired effect for the song. In fact, the effect for the song is enhanced and can be dialed in. The QSF seems to be a natural for any engineer who has the chance to work with it.
Finally a cure to lifeless sound has been found. The inoculation process requires that tracks be recorded as a Haas Saturated signal, the exact opposite of a Haas Sterile signal. With some 30 to 60 random time offset specular reflections accompanying each direct signal, there is no comb filter effect and the track is completely full of acoustical life, i.e., music. Formerly dead mixes can be remixed through an acoustic process of sweetening by playing the dry mix through an acoustic package that creates a plethora of early reflections. Caution: the RT-60 of the early reflection package needs to be in the range of 1/10 second, and the very early time gap is recommended to be set at about 3 ms. This cure was discovered when big studio recording engineers started fooling around with TubeTraps in the mid 80’s, endorsed early on by Pete Townshend (Eel Pie Sampling Room) and for the last 10 years with StudioTraps by Bruce “You’ve got to hear this” Swedien.
Variations on Sound Fusion Effect Recording
During this early period, the ASC TubeTrap factory got a few calls from engineers who heard about the QSF sound. One had been doing radio for many years. He said he developed a magic black box that was his trade secret. It gave him a voice edge over everybody. He put a whole bunch of amplitude adjusted time delays into the box. He fed his mic into one end and got a synthetic QSF sound (direct + a whole lot of random time offset signals) out the other end. The time delays matched exactly the QSF window of about 25 ms. He welcomed us to the club and figured it was time to let the secret out of his magic “voice-box.”
Another engineer contacted the factory and told his story of how he had hooked 30 some mics up over the top of a classic opera singer. Each mic was located at a different distance and angle from the talent. He just added them all together and ran it out to the house sound system. He said the sound was fantastic and used the technique many times. He effectively collected some 30 random time offset signals, all within the 25 ms time window. Each signal was basically the same signal except for the acoustic EQ due to the off axis coloration of the voice. And, as the talent moved around, the sound package didn’t change. The total sound remained the same even though the signal fed into the different mics did change. The listener’s brain can’t tell which reflection is where inside the Sound Fusion effect time window.
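The radio engineer never published what was inside his box, so the following is only a guess at the general idea: a minimal sketch of a multi-tap delay that adds amplitude-scaled, time-offset copies to the direct signal. The tap count, the gain range and the function names are all hypothetical; only the 25 ms window comes from the text.

```python
import random

def random_taps(count: int, window_ms: float, seed: int = 0):
    """Hypothetical tap list: (delay_ms, gain) pairs with random delays
    inside the window and random gains below the direct signal."""
    rng = random.Random(seed)
    return [(rng.uniform(1.0, window_ms), rng.uniform(0.2, 0.6)) for _ in range(count)]

def multitap(signal, taps, sample_rate_hz: float):
    """Return the direct signal plus one scaled, delayed copy per tap.
    The output is longer than the input by the longest delay."""
    offsets = [int(round(d * sample_rate_hz / 1000.0)) for d, _ in taps]
    out = [float(x) for x in signal] + [0.0] * (max(offsets) if offsets else 0)
    for offset, (_, gain) in zip(offsets, taps):
        for n, x in enumerate(signal):
            out[n + offset] += gain * x
    return out

# Feed a single click through 30 random taps inside a 25 ms window.
click = [1.0] + [0.0] * 99
box_out = multitap(click, random_taps(30, 25.0), sample_rate_hz=48000)
print(len(box_out), "samples out")
```

Feeding a click through the box shows its signature directly: one full-level direct arrival followed by a scatter of smaller copies, all inside the window.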
Digital reverb was starting to be affordable. The reverb plate was being replaced with 4 adjustable delay/reverb returns. When ambience was set tight (300 to 500 ms) and the delays set shorter (30 to 100 ms), it produced a synthetic ambience, much like a room. By setting it even tighter and shorter, the Sound Fusion effect could be generated. But the big advantage with the acoustic version, the QSF, is that it controls the presence of natural ambience in the room at the mic while adding a close and natural flush of early reflections into the acoustic mix at the mic position.
Pete Townshend, NED and Sampling Booths
A little while later digital sampling got started. Synclavier was looking for a Sampling Room and they knew it had to be something different than a vocal booth. ASC built a QSF sampling booth and Synclavier loved it. That booth and the Synclavier followed AES around the world more than once.
Pete Townshend (Who) had heard about TubeTraps. He was at a session in LA, ran into Bruce Swedien (everybody) and asked him about them. Bruce had already checked them out and thought they were all right; he liked them. Pete called the factory a little while later and before long, he had outfitted his Boathouse, a small sampling room with non-parallel walls and round windows at Eel Pie Studios, into a world class sampling room.
Pete was blown away by the sound he got in that room. He wrote the factory and told us his story. It went something like this: “The Boathouse was so smooth that no one could hear which fader ran the nearfield or farfield mics. They sounded the same. For the first time ever, I had to put tape alongside the faders, labeling the two mics, so my engineers could remember which fader was the nearfield mic and which was the ambient mic.”
The factory asked if he’d consider endorsing the QSF Sampling Room and he said “normally, no” but in this case, he’d be glad to, because recording engineers needed to know about the QSF. And so, three rolls of Hasselblad negatives later, Pete Townshend became the first star to endorse the QSF recording technique.
Deep Space Vocal Booth
A little later, Rockwell Corp contacted the factory. They were doing voiceprints for training astronauts. They were working with something like a 10 open mic studio talkback system, where everybody could be heard, all at once. Only one problem: There was just one send/receive channel, in order to keep the weight down. They decided to chop and sequence the open mic signals so that one transceiver could carry all signals. Chopping the signal train was not a problem, but reconstruction was. How to recreate someone’s voice when you only have 1/10 sampling of the signal? They needed a hot vocal booth to help them develop a voice reconstruction algorithm.
They chose the QSF system. The ASC factory built, shipped and even set up the booth. This room was made out of alternating half round TubeTraps and Plexiglas strips. The see-through walls created a very open feeling. Rockwell engineers used the QSF Sampling Booth to get the most room-free, information-filled version of a person’s voice that was possible. They chopped it up and figured out an algorithm to reconstruct a person’s voice chop sequence into a reasonable facsimile of the voice. The plan worked and the rest is space history.
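The chop-and-sequence problem can be pictured with a toy time-division scheme. This is a minimal sketch, assuming simple round-robin slots of equal length. The actual slot scheme and the reconstruction algorithm are not described in the story, so the code only shows the chopping and the gaps it leaves behind; all names are illustrative.

```python
def chop_and_sequence(channels, slot_len: int):
    """Round-robin time-division multiplex onto one stream.
    In each time slot only one channel is sent, so every channel
    keeps just 1/len(channels) of its samples."""
    n_ch = len(channels)
    length = min(len(c) for c in channels)
    stream = []
    for start in range(0, length - slot_len + 1, slot_len):
        ch = (start // slot_len) % n_ch
        stream.extend(channels[ch][start:start + slot_len])
    return stream

def unchop(stream, n_channels: int, slot_len: int):
    """Put each slot back on its own channel timeline.
    Samples that were never sent are marked None; filling them in
    is the reconstruction problem, which is not attempted here."""
    channels = [[None] * len(stream) for _ in range(n_channels)]
    for start in range(0, len(stream), slot_len):
        ch = (start // slot_len) % n_channels
        channels[ch][start:start + slot_len] = stream[start:start + slot_len]
    return channels

# Ten open mics, 100 samples each, sent in slots of 5 samples.
mics = [[ch * 1000 + n for n in range(100)] for ch in range(10)]
received = unchop(chop_and_sequence(mics, slot_len=5), n_channels=10, slot_len=5)
kept = sum(1 for s in received[0] if s is not None)
print(kept, "of", len(received[0]), "samples survive for mic 0")
```

With ten mics sharing one channel, each voice arrives as short bursts separated by long holes, which is why a clean, room-free, information-rich capture of the voice mattered so much for working out the reconstruction.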
Evolving the QSF space
These high performance QSF Sampling Rooms worked great for sampling but they were a little too fast for live talent work. A standard iso booth might have an RT-60 of 0.4 seconds and have as few early reflections as possible. These sampling rooms were running an RT-60 in the range of 1/10th second and sported diffusion rates of 1000 random, time delayed, early reflections (distinct specular reflections) per second. This means the room was very dead and at the same time, very bright. The acoustic gain produced by a QSF sampling room was about +10 dB above the direct signal. It was a bright anechoic chamber. QSF vocal or iso booths are still bright but the reverb time is set slower, in the range of ¼ second, so it is a comfortable space to work in.
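The numbers above can be put side by side with a little arithmetic. This is a minimal sketch, assuming an ideal decay that is a straight line in dB (RT-60 being the time to fall 60 dB) and a steady reflection rate. The 25 ms figure is the QSF window quoted earlier; the function names are illustrative.

```python
QSF_WINDOW_S = 0.025  # the "about 25 ms" QSF window from the text

def decay_db(rt60_s: float, elapsed_s: float) -> float:
    """How far (in dB) the room sound has decayed after elapsed_s,
    assuming a constant decay rate of 60 dB per RT-60."""
    return 60.0 * elapsed_s / rt60_s

def reflections_in_window(rate_per_s: float, window_s: float) -> float:
    """Number of reflections arriving inside the window at a steady rate."""
    return rate_per_s * window_s

rooms = {"standard iso booth": 0.4, "QSF vocal booth": 0.25, "QSF sampling room": 0.1}
for name, rt60 in rooms.items():
    print(f"{name}: {decay_db(rt60, QSF_WINDOW_S):.1f} dB down after 25 ms")

print(reflections_in_window(1000.0, QSF_WINDOW_S), "reflections inside the window")
```

Under these assumptions the sampling room loses about 15 dB over the 25 ms window while packing in about 25 reflections, which is the "very dead and very bright" combination in one line. The slower vocal booth setting lets the same window decay only about 6 dB, which fits the description of a more comfortable space.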
The QSF Stonehenge package was becoming popular and to get the price down and usability up, ASC developed the StudioTrap, a small diameter (voice range) TubeTrap mounted on an adjustable shaft with a tripod base, like a mic stand. There have been improvements over the years. A hand synch grip was added along with a quieter internal slide clutch. The diffusing reflector sheet was moved forwards to increase the top end reflectivity from 6k to something above 7k Hz. But overall, the StudioTrap remains pretty much today, as it was originally conceived and is the cornerstone of the QSF effect.
Ed McMahon has probably the best known voice in the country. And he travelled a lot all over the country. In between scheduled public appearances, he had to do commercials.
Bruce Swedien Discovers the QSF sound
During this time we noticed an interview with Bruce Swedien. It was clear that Bruce was so acoustically in tune that he probably watched sound run around the room in slow motion. I wrote him, explained a little about the QSF effect and invited him to audition it. “Why certainly,” he’d love to. He’s always looking for new sounds and ways to get them. He tried it and loved it. After a bit, he volunteered to endorse the QSF. He said and still says: “I wish everybody could hear this” because Bruce wants other engineers to get to know the power of the QSF sound and discover for themselves this new recording technique.
A little later, the factory sent a truck load of StudioTraps to join Bruce at René Moore’s large home in Studio City to help Bruce make his first mic training video. At the end of that tape Bruce and I kibitz a while about the QSF Effect. It’s a great tape and a lot of good QSF techniques are demonstrated.
Bruce continues to ship Red Rocket his original set of 14 StudioTraps, from session to session, back and forth across the US just to be sure he’s always ready to add that little kiss of life, the QSF effect, into his tracks. The Studios got so beat up, shabby actually, over the years that the factory offered to replace them, no charge. But Bruce wouldn’t have it. Nobody touches his tried and true, vintage gear. Eventually, a fork lift changed his mind and we were able to gently repair, update and refinish his set of original StudioTraps and return them, good as new.
Bruce gets to work in the best live rooms in the world. Like a master chef, he adds just a pinch of the QSF effect to the already nearly perfect live room sound. He uses wide spacing of the StudioTraps and randomizes the reflector positions. He dials in a number of specular Haas reflections to bring forward and capture the essence of a live performance.
QSF Comes Home
Most of us don’t get a chance to work in the best live rooms in the world. Most of us are lucky to be working in small, home or barn studios, something less than perfect rooms. Here, we set the QSF pattern not open, but tight. The smaller the room, the tighter the pattern, the more intense the Haas reflections, which boost the live effect and at the same time, the room is blocked even more. Typical small room recording does very well with only 8 StudioTraps in a semicircle setup, 4 to 5’ in diameter. The tightness of the QSF setup is proportional to the strength of the “Haas/direct” to room reverb ratio.
Recording with the QSF is good for everybody in the studio. The engineer gets the desired sound while the talent doesn’t get worn out trying to make it. It’s easy to find sound you want and the sessions go fast. And there’s a bonus. The halo clamp got tossed. No more: “That’s pretty good, but this time, let’s try to emote just a little more and move just a little less.”
While working in QSF, talent is free to groove to the music without causing a shift in the sound at the mic. As the talent moves, all that’s changing is the arrival time of the various Haas reflections. However, the ensemble package of direct + early reflections remains at the same level and sounding the same. As with the early sampling booths, you can’t tell which signals arrived when. Just as long as they are all inside the Haas time package, it’s all just one sound.
This produces a track that needs no riding gain, no limiters, no compressors and no equalization. Just dig out your favorite omni or ribbon mic, back away from the proximity effect and go direct to tape, pretty much no matter what room you are in. And there’s another bonus. What the talent hears inside the QSF field is exactly what they hear later in the control room. There are no surprises when recording with the QSF effect. A QSF track can be processed and mixed just like a regular dry track. And yes, it will amplitude and delay pan very well.
And so, Jennifer Lopez stopped to say thanks on her last album cover to the whole crew at ASC. It was for staying the course and delivering “where’s those round things?” the breath of live sound, the QSF effect, into her vocal tracks on the last 3 albums.
Orriel Smith, coloratura, uses a QSF setup in her home for practice and demo work. She hauls her QSF to the recording studio to make sure she gets the sound she wants.
Being aware that the “voice” is mostly air, I like the idea that StudioTraps create a bright atmosphere to reflect or absorb the air/tone in a way that is controllable and consistent. The standard “recording booth” can seem overly absorbent and airless to me, not to mention claustrophobic. Operatic singing is based on overtones that are large, full with a ringing high presence. The feeling of singing within the StudioTraps is that of being within a “live and active” atmosphere and is extremely predictable. ~ Orriel Smith