Arthur M. Noxon
Acoustic Sciences Corporation
Eugene, Oregon USA
Presented at the 11th AES International Conference: Test and Measurement
Portland, Oregon, May 1992
Coherent and incoherent reflections are very different, both in physical and psychoacoustic properties. Perception effects such as imaging and musicality are very sensitive to the type and tuning of reflections off nearby surfaces. Coherent reflections can have strong correlation coefficients and add information to the direct signal. Incoherent reflections with random phase signals are weak in correlation and provide strong masking effects.
Diffusion is the process of mixing up sound. In a 100% diffuse sound field there is no sense of acoustic direction; sound comes equally from all directions. Diffusion may at times be a desirable condition for acoustic energy. It is created by a sequence of diffusing reflections. A sound reflection can be either coherent or incoherent. This quality needs to be specified because the coherence of a reflection has a significant sound masking effect on perception.
A device that helps to develop the state of diffusion by increasing the scattering of sound is called a sound diffuser. There are four types of sound diffusion mechanisms.
- Diffraction (sound bends around corners)
- Refraction (turns by changing wave speeds)
- Reflection (changing direction upon impact)
- Resonance (resonant storage and reradiation)
The first three sound turning mechanisms are well known. They change the direction of sound but not the time wise evolution of the waveform itself. The scattered sound has the same sonic signature as the incident sound; the two are highly correlated, and therefore a coherent diffusion process takes place.
The last process, resonance, is not usually considered to be a sound diffuser. Incident sound on a resonator will stimulate the build up and decay of sound in the resonator. Resonant discharges are often practically point sources and so the reradiated sound is well distributed in space. The sound of a ringing, resonant decay has its own time wise evolution. The incident wavetrain will have a pressure vs. time signature that is not followed by the sound of the ensuing resonant decay. Correlation between the incident waveform and the resonant discharge is very low. Resonance forms the basis for an incoherent class of sound reflections.
Time Delayed Perceptions and Reflections
There are distinct time periods that relate to the various properties of perception. Reflections within the first few milliseconds following the direct sound belong to localization, i.e. the perception as to where sound is coming from. Reflections within the next 30 to 50 ms belong to fusion, the development of sound tone recognition. Reflections outside of 60 ms develop the impression of echo and ambience. The coherency of reflections with respect to the direct signal may well affect the quality of perception differently in each of these three time regions. Once this relationship is known, it can be utilized by recording engineers and acoustic designers to better achieve desired performance.
Reflections of sound that follow the direct signal within 50 ms are not distinctly heard but are blended together, fused into a composite sound. If only one reflection is heard, the phase add and cancel comb filter coloration effects will be heard. If there are many reflections, randomly offset in time, the phase add effect averages out to zero and the composite sounds just like the direct signal. Whenever reflections do not sound like the direct signal, the composite also does not sound like the direct signal. The goal of this paper is to introduce and measure coherent and incoherent reflections and then to subjectively evaluate the impact of each when audited within the 50 ms sound fusion time window. It will be shown that incoherent reflections, which may be acceptable in the 60 ms plus time period as ambience or echo signals, are degrading to musical quality if perceived during the 50 ms sound fusion period.
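The comb filter coloration produced by a single reflection can be illustrated numerically. The following is a minimal sketch rather than anything measured in this paper: the 1 ms delay, the reflection gain of 0.7 and the function name `comb_response_db` are assumptions chosen only for illustration. It evaluates the magnitude of the direct signal plus one delayed copy, H(f) = 1 + g·exp(-j2πfτ).

```python
import numpy as np

def comb_response_db(freqs_hz, delay_s, gain):
    """Level (dB) of direct sound plus one delayed, scaled copy of itself."""
    h = 1.0 + gain * np.exp(-2j * np.pi * freqs_hz * delay_s)
    return 20.0 * np.log10(np.abs(h))

delay_s = 1e-3   # assumed: reflection arrives 1 ms after the direct sound
gain = 0.7       # assumed: reflection amplitude relative to the direct sound
freqs = np.array([500.0, 1000.0, 1500.0, 2000.0])
levels = comb_response_db(freqs, delay_s, gain)

for f, db in zip(freqs, levels):
    print(f"{f:6.0f} Hz: {db:+.1f} dB")
```

Under these assumptions the response dips by roughly 10 dB at 500 Hz and 1500 Hz, where the reflection arrives out of phase, and rises by between 4 and 5 dB at 1000 Hz and 2000 Hz, where it arrives in phase. The regular spacing of these peaks and notches is the "comb".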
This summation or coloration of signals smeared together within the 50 ms perception window is a distinct aspect of tone recognition but not the whole picture for listening. It does not account for the consequence of variations in the time ordered detail of the harmonic structure in the attack transient. The accuracy of musical quality belongs to the 20 ms attack transient. It is the only event in which the timing and the phase alignment of the overtones in complex signals are detectable. Over the last few years speaker manufacturers have recognized and accommodated both time and phase alignment in the design of speakers. It is no longer sufficient to know how the sound level of each partial varies with time; we must also have correct time alignment and phase of the partials. One technique that measures in this area of psychoacoustic perception is the correlation test.
Correlation is a measure of how similar one signal is to another. If a direct signal (Figure 1a) causes a simple time delayed reflection (Figure 1b), the correlation factor between the two signals (Figure 1c) is zero everywhere in time except at the time delay, where their correlation is 100%.
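The single delayed reflection case can be reproduced numerically. The following is a minimal sketch, not the analyzer procedure used for the measurements in this paper: white noise stands in for the direct signal, and the 48 kHz sample rate, the 2 ms delay, the reflection gain of 0.5 and the function name `normalized_xcorr` are all assumptions chosen for illustration.

```python
import numpy as np

def normalized_xcorr(direct, reflected, max_lag):
    """Correlation coefficient of direct vs. reflected for lags 0..max_lag samples."""
    n = len(direct) - max_lag
    d = direct[:n] - direct[:n].mean()
    out = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        r = reflected[lag:lag + n]
        r = r - r.mean()
        out[lag] = np.dot(d, r) / np.sqrt(np.dot(d, d) * np.dot(r, r))
    return out

rng = np.random.default_rng(0)
fs = 48000                       # assumed sample rate, Hz
delay = 96                       # assumed delay: 96 samples = 2 ms at 48 kHz
direct = rng.standard_normal(fs)

# Reflection modeled as a delayed, attenuated copy of the direct signal.
reflected = np.zeros_like(direct)
reflected[delay:] = 0.5 * direct[:-delay]

corr = normalized_xcorr(direct, reflected, max_lag=240)   # lags 0 to 5 ms
peak_lag = int(np.argmax(corr))
print(f"peak at {1000.0 * peak_lag / fs:.2f} ms, coefficient {corr[peak_lag]:.3f}")
```

With these assumed values the correlation coefficient is essentially 1 at the 2 ms lag and close to zero at every other lag. The attenuation of the reflection does not lower the normalized coefficient, only the similarity of the waveform matters.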
If the specular reflection is splintered, scattered out in time, correlation will still exist, but spread out over the range of time over which the multiple reflections take place. Two types of splintered reflection systems were tested and both show correlation to exist over a longer time period than that of the single flat wall bounce.
Figure 2a shows a reflection/absorption diffusion grid composed of reflecting surfaces of alternating depth interspersed with sound absorbing segments. In this system, every other reflector is curved to backscatter over a wider angle than the adjacent flat reflecting strips.
Figure 2b shows the ETC for the reflection over the range 400 Hz to 20 kHz. The multiple reflections are spread over a 2 ms time period. The correlation measurement between the direct signal and the reflection (Figure 2c) also shows a 2 ms wide correlation. Each of the time delayed, scattered reflections is specular, a coherent and faithful reproduction of the direct signal.
A different type of diffuser is composed of a set of troughs at various depths, shown in Figure 3a. High frequency sound entering these wells ricochets some number of times depending on the angle of incidence and the well depth. The ETC of Figure 3b shows a spread in time of the reflected signal of about 5 ms. The correlation for this diffuser using 1/3 octave noise at 3 kHz (Figure 3c) is also spread over a 5 ms period of time. This short wavelength reflection is coherent.
Zero correlation occurs when the reflected signal bears little to no resemblance to the direct signal. This can occur when the reflected signal has no amplitude because it was absorbed. Incident sound onto 2″ of medium density fiberglass does not reflect 1/3 octave noise at 3 kHz. Figure 4a shows the ETC of this absorbed “reflection”. The correlation test in Figure 4b shows “zero” because the silent reflection bears no resemblance to the time wise signature of the direct signal.
There is zero correlation if the reflecting signal is not really reflected at all but instead is an independent sound. A whistle tone at 1 kHz was played while the direct 1/3 octave noise at 3 kHz was tested. The tone bears no resemblance to the direct signal and Figure 5 shows zero correlation.
A series of correlation tests was run on each of three types of reflecting surfaces. The signal used was 1/3 octave bandwidth noise on 1/3 octave centers between 125 Hz and 3 kHz. The correlation between the direct signal and the reflection was measured for flat wall bounces, for the absorption/reflection diffuser and for the multidepth trough diffuser. For the higher frequencies the correlation of all three reflectors is strong, with the time window being spread out according to the degree of multi-reflections for each.
The test setup for this sequence (Figure 1) uses an incident angle of 45° and picks up the reflection also at about 45°. There are two data collecting runs. The first one (Figures 6 through 12) ranges in 1/3 octave increments from 125 Hz to 500 Hz using 1/3 octave pink noise. The analyzer steps 0.2 ms, just over 100 times, to draw out the correlation curve. The correlation time window is just over 20 ms and is time delayed sufficiently to catch the leading edge of the reflecting wavefront.
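The relation between step size, step count and window length is simple arithmetic, and it also fixes how many cycles of each test band fit in the window. The sketch below assumes exactly 100 steps (the text says "just over 100") and the nominal 1/3 octave center frequencies between 125 Hz and 500 Hz; the function names are hypothetical.

```python
def window_ms(step_ms, steps):
    """Length of the correlation window covered by a given number of lag steps."""
    return step_ms * steps

def cycles_in_window(freq_hz, window_length_ms):
    """Number of periods of a tone at freq_hz that fit in the window."""
    return freq_hz * window_length_ms / 1000.0

step_ms = 0.2     # lag step quoted in the text
steps = 100       # assumed: the text says "just over 100"
win = window_ms(step_ms, steps)

bands_hz = [125, 160, 200, 250, 315, 400, 500]   # assumed nominal 1/3 octave centers
cycles = {f: cycles_in_window(f, win) for f in bands_hz}

print(f"window: {win:.1f} ms")
for f in bands_hz:
    print(f"{f:4d} Hz: {cycles[f]:.1f} cycles in the window")
```

Under these assumptions the 20 ms window holds about 2.5 cycles at 125 Hz and 10 cycles at 500 Hz, which is why the lowest bands show only a few oscillations across the plotted correlation curve.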
Because narrow band noise is used, the correlation signature will appear as a sine wave of the frequency that is the center frequency of the 1/3 octave noise. The amplitude of the correlation measurement depends on the amplitude of the received signal and how similar it is to the direct signal. The absorption/reflection diffuser should provide some attenuation of the correlation signal due to reduced reflecting signal strength. The random well depth diffuser has no absorption and any loss in correlation amplitude can be related to either an off axis concentration of reflected sound (lobe beaming) or a signal correlation problem.
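The sine-like appearance of a narrow band correlation curve can be checked with synthetic noise. The following is a minimal sketch under stated assumptions, not the measurement procedure of this paper: the noise is band-limited with an ideal FFT brick-wall filter, the "reflection" is a perfect copy of the direct signal, and the 16 kHz sample rate, the 250 Hz band and both function names are chosen only for illustration.

```python
import numpy as np

def third_octave_noise(center_hz, fs, n, rng):
    """White noise limited to one 1/3 octave around center_hz (brick-wall FFT filter)."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    lo, hi = center_hz * 2 ** (-1 / 6), center_hz * 2 ** (1 / 6)
    spectrum[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spectrum, n)

def correlation_curve(x, y, max_lag):
    """Normalized correlation of x against y for lags 0..max_lag samples."""
    n = len(x) - max_lag
    norm = np.sqrt(np.dot(x[:n], x[:n]) * np.dot(y[:n], y[:n]))
    return np.array([np.dot(x[:n], y[lag:lag + n]) for lag in range(max_lag + 1)]) / norm

fs = 16000                 # assumed sample rate, Hz
center_hz = 250.0          # assumed test band
rng = np.random.default_rng(1)
noise = third_octave_noise(center_hz, fs, 4 * fs, rng)

# Correlate the band noise with a perfect copy of itself over a 20 ms window.
curve = correlation_curve(noise, noise, max_lag=int(0.02 * fs))

# The first local maximum after lag 0 marks one period of the oscillation.
peaks = [k for k in range(1, len(curve) - 1)
         if curve[k] > curve[k - 1] and curve[k] >= curve[k + 1]]
estimated_hz = fs / peaks[0]
print(f"oscillation of the correlation curve: about {estimated_hz:.0f} Hz")
```

The curve starts at 1, swings strongly negative half a period later, and repeats at a rate close to the band center frequency, with a slowly decaying envelope set by the bandwidth.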
The 125 Hz through 500 Hz survey shows the absorption/reflection diffuser to mimic the bare wall bounce faithfully except for a full bandwidth amplitude reduction due to absorption. The random well depth diffuser has a thinning of correlation in the 200 Hz 1/3 octave band (Figure 8c) and again at 500 Hz (Figure 12c). The 200 Hz incoherent reflection is attributable to wood panel resonance and the 500 Hz problem belongs to the 1/4 wavelength resonance of the deepest wells.
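The 1/4 wavelength resonance mentioned above ties a frequency to a well depth through f = c / (4d). The sketch below assumes an ideal well closed at the bottom, no end correction, and a sound speed of 343 m/s. The well depths of the tested diffuser are not stated in this paper, so the result is an estimate under those assumptions, and the function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0   # assumed: air at about 20 degrees C

def quarter_wave_depth_m(freq_hz, c=SPEED_OF_SOUND_M_S):
    """Depth of an ideal closed well whose 1/4 wavelength resonance is freq_hz."""
    return c / (4.0 * freq_hz)

def quarter_wave_freq_hz(depth_m, c=SPEED_OF_SOUND_M_S):
    """1/4 wavelength resonance frequency of an ideal closed well of the given depth."""
    return c / (4.0 * depth_m)

depth_500 = quarter_wave_depth_m(500.0)
print(f"500 Hz quarter-wave depth: {100.0 * depth_500:.1f} cm")
print(f"check: {quarter_wave_freq_hz(depth_500):.0f} Hz")
```

Under these assumptions a 500 Hz quarter-wave resonance corresponds to a well roughly 17 cm deep.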
A higher frequency series (Figures 13 through 15) shows the same test except that the step is 50 µs, four times faster than before. The full test window is now 5 ms. The wave form appears to be longer, but only because the time scale is shorter. The weak correlation for random well depth diffusers still exists at 1000 Hz (Figure 13), but by the 2 kHz octave and above both diffusing systems have full and adequate correlation, except that the random depth wells have additional multiple reflections (Figure 15) that stretch out over the 5 ms window.
The ETC for the random depth well diffuser was taken with the mic in the bottom of one of the deep wells for frequencies between 200 Hz and 20 kHz. The only indication of possible resonance effects (Figure 16) is the rapid drop of initial reflections followed by a resurgence of energy discharge between 4 and 7 ms after the initial reflection. The waterfall (Figure 17) was taken to try to identify the resonance. It ranges from 50 to 500 Hz over a time period of almost 100 ms. By using the heavy time averaging window of 40 ms, the structural resonance effects below 250 Hz and the 1/4 wavelength resonance at 375 Hz become evident.
There seems to be low correlation when the reflected signal is involved with resonance even though the energy of the reflection is high. The resonant discharge produces a tone that has its own time wise identity, not a simple time delayed and coherent reflection of the direct signal.
Incoherent Reflections and Perception
There is a subjective aspect to incoherent reflections. The demonstration of this effect was first performed in a respected high-end audio manufacturer’s demo room at the 1988 CES, Las Vegas. The audio playback system had random depth well diffuser panels behind and between the speakers, set up to diffuse the front wall bounce.
A CD track was played that had solo classical guitar work. The perceived musical quality of a plucked low E guitar string was radically affected by the reflections out of the random well depth diffuser. All 15 people in attendance simultaneously could repeatedly witness this effect. Its characteristic was identified as being a “colorless note”. The fundamental string tone was present but its expected rich harmonic structure seemed to be obscured.
When the random depth well diffuser panel was covered over with a blanket and the musical section was replayed, the easily recognized and all too familiar sound of a plucked acoustic guitar string returned. High frequency string sounds did not seem to have this “colorless” quality, only the lower frequencies, those with substantial transient partials in the middle octaves, 250 to 750 Hz.
Attack Transient Fidelity
Correlation of continuous sound is a straight-forward statistical sampling process. Trying to do correlation on the attack transient of a plucked guitar string is more difficult because of the short time period of the attack and the long time period of the sustain. A study of the attack transient wave form itself does show the effects of different types of reflecting surfaces.
The signal out of a plucked electric guitar string is shown (Figure 18) with rapid harmonic detail changes in the first 50 ms for a 125 Hz note. The overall long term spectrum for this pluck (Figure 19) shows strong and regular upper partials. The signal was recorded and played back over a small, average speaker. Its sound was reflected off of the three types of surfaces and captured by a mic and storage scope. The first 40 ms of each bounce shows the evolution of the attack transient into the sustain wave form.
The first pluck series (Figure 20) shows a substantial initial 10 ms transient difference between the random well depth diffuser and the other two. Beyond the attack transient, a wave form change is seen. With the random well depth diffuser there is strong third harmonic detail added to the positive peaks of the fundamental.
A second pluck at 125 Hz was recorded, this time with more harmonic detail due to a shifted finger position. Again the specular/absorptive reflection is very similar (Figure 21) to the wall reflected signal except for reduced amplitude. The random depth well reflector shows again the first 10 ms attack transient distortion. It also shows harmonic distortion in the sustain particularly with accentuated rise times in the positive part of each fundamental peak.
The persistent upper partial distortion in the 300 to 400 Hz region out of the random depth diffuser led to another test, this time at 400 Hz. Again, serious distortion (Figure 22) in the first 50 ms of the attack transient is observed. The sustain also shows more than the simple reduction of levels seen with the specular/absorptive diffuser. Here, every other cycle is louder and sharper peaked while adjoining pulses are quieter and more rounded than with the other two reflecting surfaces.