(Originally posted on AVS forum, Dec 2013).
Do Audio Measurements Correlate with Sound Quality? Sometimes yes, but usually not, depending on the kind or version of the Audio Measurement being made. My primary experience with audio measurement is in the realm of room acoustics so this will be the context of my response to this question.
For 30 years I’ve been seeing that a very acceptable acoustic upgrade in a listening room project might only measure a ½ dB reduction in the peaks of the frequency response curve, which seems hardly worthy of such an investment. These small reductions in the strength of the peaks lie well below the 1 dB threshold of audible sound level difference, and A/B double blind frequency response listening tests prove inconclusive. If the listener can’t differentiate between the frequency response curve of the room before and after the acoustic upgrade, all the while being able to easily differentiate between the musicality of the room before and after the upgrade, then what the listener values in audio performance can’t really be defined by the frequency response curve in the room.
In other words, if we assume that our golden ear listener is right and the acoustic upgrade did audibly improve the musicality of the sound system in the room, but the audio test being used to prove that the room sounds better gives inconclusive results, then the audiophile is not wrong. Instead, it is the audio measurement that is wrong.
There is an old adage in statistical measurements: Correlation does not imply causation. Just because two things happen together does not mean one causes the other. When neither causes the other, the two things remain correlated, in that they do happen at the same time, but only as a coincidence. For example, at noon it is warmer than at midnight. This is a dependable correlation: when one exists, the other exists. But the temperature of the day does not cause the clock hands to move and the clock hand movement does not cause the temperature to change. These are coincidentally correlated but not causally correlated. Coincidence does not mean causation. The goal in correlation measurements is to sort out the cause and effect correlations from the coincidence correlations, and this is usually neither easy nor obvious.
In audio, we choose and invest in an upgrade and we hope to be able to sit back and enjoy the fruits of our investment. But for some reason, the enjoyment of our upgrade always seems a little sweeter if we can perform an audio test which proves our upgrade can be physically documented. It’s as if we want to hang a plaque on the wall which certifies that our room is better than before, even though we already know it is better than before. If we want good results, those that correlate with our perception of improved sound quality, we’d better choose a test that correlates with our version of sound quality.
Frequency Response Curve (FRC) Testing
Let’s go back to using the ever popular frequency response curve in the listening room as a measure of performance. In science we learn that it is always helpful, when trying to understand the meaning or influence of a physical relationship, to take the relationship to the extreme, because in the extreme we get to see how the relationship behaves at either end of its realm, and this clarifies our understanding of the more subtle variations we see under more normal circumstances.
In small room audio acoustics we have two extremes, a 100% absorptive room and a 100% reflective room. Our actual room is somewhere in between, typically maybe 30 to 50% absorptive and 70 to 50% reflective. We know that our audio gear is linear and we have the published specs to prove it. Linear means the sound level output, the dynamics, is in a fixed proportion to the signal level input. Linear also means that the relative sound levels throughout the frequency range of sound are fixed in this same proportion to the spectrum of the signal input.
Linear means a dynamic and spectral replication of the signal and yet, people continue to only make spectrum measurements and somehow expect spectrum measurements to explain why music sounds musical or not in different rooms or with different equipment setups. A linear audio system will measure 100% linear in both dynamic and spectral tracking if it is played in an anechoic chamber, a 100% absorptive room. The same linear system will measure 0% linear in dynamic response and 100% linear in spectral response in a reverb chamber, a 100% reflective room.
The main thing to notice is that in general, the % dynamic linearity in music dramatically varies in proportion to the absorption in the room while the % spectral linearity does not vary in proportion to the amount of acoustics in the room.
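A rough way to see why the dynamics depend so strongly on absorption: if the reverberant sound in a room is assumed to decay as a simple exponential, then RT60 alone sets how far the level can fall during a short silence between two notes. The sketch below is only that idealised model (no direct sound, no noise floor). The 1/16 second gap and the function name are assumptions chosen for illustration, not part of any standard test.

```python
def gap_decay_db(gap_s, rt60_s):
    """Level drop (dB) of a purely exponential reverberant decay over a silent gap.

    RT60 is the time taken for a 60 dB decay, so the drop scales as 60 * gap / RT60.
    """
    if rt60_s <= 0:
        raise ValueError("rt60_s must be positive")
    return 60.0 * gap_s / rt60_s


GAP_S = 1 / 16  # assumed silent gap between two notes or tone bursts, in seconds

for rt60 in (0.2, 0.4, 0.8, 1.6):
    drop = gap_decay_db(GAP_S, rt60)
    print(f"RT60 {rt60:.1f} s -> level falls {drop:.1f} dB during the gap")
```

As RT60 shrinks toward zero (the anechoic extreme) the gap can empty out almost completely, and as RT60 grows (the reverb chamber extreme) the level barely moves between notes.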
Nevertheless, audio writers, reviewers, manufacturers and enthusiasts continue to try to point to the frequency response curve as a way to prove which room sounds better.
There is a coincidence correlation between flattening room response curves and rooms that sound better, but it is not absolute. We could make EQ adjustments which replicate the flattening of the room response curve that results from adding a serious acoustic upgrade to the room. But does that EQ adjusted room sound as good as, and in the same way as, the room with the acoustic improvement? The answer is always an emphatic NO. Clearly, judging how well a system plays music is not based on the frequency response curve (FRC) in the room. It is not the correct measurement to determine the quality of the music.
Modulation Transfer Function (MTF) Testing
About 25 years ago the MTF type of audio system testing first came to my attention at a SynAudCon meeting. It turns out that adding enough acoustics to a room to produce an inaudible ½ dB improvement in the peaks of the sound spectrum can at the same time produce a +6 dB improvement in the Modulation Transfer Function test of the system. Furthermore, the enthusiasm of the audiophile who purchased the acoustic upgrade corresponds much more closely to what one would expect from a +6 dB improvement compared to ½ dB improvement.
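To put those two figures side by side, a decibel change can be converted into a plain ratio. This is a minimal sketch assuming the standard definitions (20 log10 for sound pressure, 10 log10 for power or energy). The function names are illustrative only.

```python
def db_to_pressure_ratio(db):
    """Sound-pressure (amplitude) ratio for a level change given in dB."""
    return 10 ** (db / 20)


def db_to_power_ratio(db):
    """Power (energy) ratio for a level change given in dB."""
    return 10 ** (db / 10)


for change_db in (0.5, 6.0):
    print(f"{change_db:>4} dB -> pressure x{db_to_pressure_ratio(change_db):.3f}, "
          f"power x{db_to_power_ratio(change_db):.3f}")
```

Half a dB is roughly a 6% change in sound pressure, while 6 dB is close to a doubling.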
MTF testing is well known in radio communications, analog and digital photography, and speech.
It relates to the ability of a system to transmit or receive separate distinct signal events that are very, very close together. It relates to the fine grained sharpness of a photo or the signal to noise ratio of a modulated radio signal. There are two versions of the SNR or signal to noise ratio. One is the signal strength to background noise floor and the other is the sharpness of the line of demarcation between a signal being present and being absent, a signal being on and off, without regard to the steady state background noise floor.
A photo can be blown up to such an extreme that the random features of the photographic noise floor dominate any transition that defines an edge of a distinctive feature. Distinctive features can also blur together because the lens is simply out of focus. We have two types of Signal to Noise Ratios, SNRs, one being a weak signal in a relatively loud background noise floor and the other being a strong signal that lacks focus or ability to deliver differential information. Either way, the ability of the system to render distinct differences is not good. A system with good MTF reproduces the original signal with excellent clarity.
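For a room, one widely used way to put a number on this is the relation usually credited to Schroeder, which takes the modulation transfer at a modulation frequency F as the normalised Fourier transform of the squared impulse response. Below is a minimal sketch that assumes an idealised, noise-free exponential decay standing in for a measured room response. The sample rate, the 8 Hz modulation frequency and the function names are illustrative choices only.

```python
import cmath
import math


def mtf_from_energy_response(h2, fs, f_mod):
    """Modulation transfer at f_mod (Hz) from a sampled squared impulse response.

    m(F) = |sum h2[n] * exp(-j*2*pi*F*n/fs)| / sum h2[n]
    """
    total = sum(h2)
    weighted = sum(e * cmath.exp(-2j * math.pi * f_mod * n / fs)
                   for n, e in enumerate(h2))
    return abs(weighted) / total


def exponential_energy_decay(rt60, fs, duration):
    """Idealised energy decay that falls by 60 dB in rt60 seconds."""
    k = math.log(1e6) / rt60  # 60 dB in energy is a factor of 10**6
    return [math.exp(-k * n / fs) for n in range(int(duration * fs))]


FS = 2000    # sample rate of the energy envelope in Hz (assumed)
F_MOD = 8.0  # modulation frequency in Hz (assumed)

for rt60 in (0.2, 0.5, 1.0):
    h2 = exponential_energy_decay(rt60, FS, duration=2 * rt60)
    m = mtf_from_energy_response(h2, FS, F_MOD)
    print(f"RT60 {rt60:.1f} s -> m({F_MOD:g} Hz) = {m:.2f}")
```

A value of 1 means the on/off pattern arrives intact, and values toward 0 mean the pattern has been smeared by the room. With a real measurement, the squared impulse response at the listening position would replace the synthetic decay.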
In audio, the first instance of MTF testing was for sonic clarity in sound systems back in 1987. The MTF test results correlate directly with Speech Intelligibility. The B&K RASTI system uses two octave-wide noise bands and a variety of gate speeds to replicate the salient features of speech. It measures the MTF of these different gated sounds at the listening position and produces a rating that has excellent correlation to perceived speech intelligibility, which essentially is a measurement of sound quality in speech. Later, around 1990, the MTF technique was expanded to include musical clarity in the form of Musical Articulation Test Tones (MATT) by Acoustic Sciences Corp.
See Musical Articulation Test Tones (MATT)
Musical Quality begins with Musical Clarity
Assuming that the high end audio system reproduces perfectly clear high quality music, and acknowledging that the same system played in a listening room does not deliver perfectly clear high quality music, the only difference between the two received signals is due to the physical properties of the listening room itself. We want to know what changes in the room improve the delivered musical quality. We can agree that there are two fundamental aspects to musical quality, dynamic replication and spectral replication. And further, we can agree that adding acoustics to the listening room enhances the perception of both. At this point we are not discussing the ambience aspects of the listening environment.
Adding acoustics to a room generally improves the EFT (Energy Frequency Time) decay curves by making the RT60 decay rates more uniform. In general, we think we like to have the full spectrum of sound drop into the background noise floor at about the same time, with the bass lingering a little longer than the midrange and upper treble. We especially don’t like to see the low frequency ringers, typically slow decay room modes.
The problem with this type of FFT (Fast Fourier Transform) testing is that although the printouts are very interesting to look at, they also have very little meaning when it comes to translating the test data into some rating of musical satisfaction. When we listen to the sine sweep of the FFT test, we can’t hear anything that tells us about how good music is going to sound.
Angelo Farina, an acoustics professor at the University of Parma in Italy, has been independently working on defining musical quality in reproduced sound systems. He discovered the ASC-MATT test because of how well it directly demonstrates to the listener the value in improved musical performance for a room in which acoustic improvements have been made. Later he wrote the equation that transforms the traditional FFT waterfall measurements into the ASC MATT type of response curves. He found he could make acoustic adjustments based on FFT analysis, have the listener audition the effect of these adjustments by transforming the FFT test signal into the equivalent MTF test signal, and finally be able to demonstrate to the listener, literally, what the improvements in the room sounded like.
He has addressed this question formally and came up with AQT, Acoustic Quality Test.
This is an AES paper on the subject. What is so very interesting is that he was able to transform the FFT test on a room into an MTF test of the same room. When he plays the FFT sine sweep no one is impressed, but when he plays the MTF version of the test data for the listener, everyone is impressed.
Essentially what the listener hears with the ASC-MATT test is Musical Clarity. What gets improved with the addition of acoustics into a listening room is Musical Clarity. Musical Clarity is a well defined term in the evaluation of concert halls. It is essentially a signal to noise ratio of the direct + early reflected energy to late and reverberant energy in the room. It is in effect a signal to noise ratio that describes the sound masking effect which takes place when late reflections are too loud.
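As a sketch of how such an early-to-late ratio can be computed, the snippet below splits a squared impulse response at 80 ms, which is the boundary conventionally used for music in concert hall work (C80). The boundary, the synthetic exponential decay and the function names are assumptions for illustration, not measurements from any real room.

```python
import math


def clarity_db(h2, fs, split_ms=80.0):
    """Early-to-late energy ratio in dB from a sampled squared impulse response."""
    split = int(round(split_ms * 1e-3 * fs))
    early = sum(h2[:split])   # direct sound plus early reflections
    late = sum(h2[split:])    # late and reverberant energy
    return 10 * math.log10(early / late)


def exponential_energy_decay(rt60, fs, duration):
    """Idealised energy decay that falls by 60 dB in rt60 seconds."""
    k = math.log(1e6) / rt60
    return [math.exp(-k * n / fs) for n in range(int(duration * fs))]


FS = 8000  # sample rate in Hz (assumed)

for rt60 in (0.3, 0.6, 1.2):
    h2 = exponential_energy_decay(rt60, FS, duration=2 * rt60)
    print(f"RT60 {rt60:.1f} s -> clarity = {clarity_db(h2, FS):+.1f} dB")
```

In this simple model, shortening the decay moves energy out of the late window and the clarity figure rises, which is the masking effect described above seen from the measurement side.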
Siegfried Linkwitz, a pioneer in audio, has also worked on a variation of the MATT test in his approach to evaluating room acoustics.
We see all sorts of response curves in speaker and room acoustic testing, but in the end, we are still left hanging with the unanswered question… “What does this mean to me?”
I’ve found that with the MATT test, the audiophile listens to the test signal, hits pause, jumps up, pulls out an album, sets the needle down 2/3rds into track 3 and says… “Just listen. That blurred section of the MATT test is the same blur that has been bothering me for years. Here it comes…” The golden ear has already cataloged the defects in his room, but until the MATT test, the only way the room defects could be demo’d was to play short sections of certain musical passages. The audiophile is 10 times more impressed with improved musical clarity, less dynamic blur, than with any amount of uniform RT60s or a flat response curve.
In conclusion, MTF audio measurements directly correlate with audio quality while EFT measurements do not. EFT measurements must be transformed into the equivalent of the MTF measurement, whereupon they do correlate with audio quality. And so it seems that the problem of fitting test data to one’s experience with perceived musical quality is not the test so much as it is how the test data is presented. There are many aspects by which the same data can be looked at. In some versions of test data, the salient issues appear obscured, while in other versions, the test data is abundantly clear on the subject of musical clarity.
Art Noxon, PE
Pres of ASC & TubeTraps