Author Topic: Examples of Diffuse Field Correlation (DFC) - hear what it's about (Read 4989 times)

Gutbucket · « **on:** August 04, 2017, 11:23:30 AM »

I commonly mention the term diffuse field correlation (DFC) in posts here because it's an important aspect influencing the sound of our live recordings, especially audience perspective recordings which tend to have significant ambient content. Yet I suspect not everyone here understands what I'm talking about when using that term, and some may even roll their eyes when I bring it up as that term is admittedly so very acoustics-geek-tech sounding!

What is it? Well in short, correlation is one of several possible measures of similarity between signals. We need good correlation between the Left and Right channels for the main elements on which the recording is focused in order to achieve good solidity and stereo imaging - these are the direct field components of the sound. In contrast to that, the diffuse field components consist of sound arriving from all directions equally - primarily the ambient and reverberant components of the recording such as the sound of the room, audience reaction, etc. In practical terms this can be thought of as the sound arriving from all other directions outside the direct-sound main window (in Stereo Zoom terms all sounds arriving from outside the SRA), and this diffuse component is optimally captured and reproduced with low correlation between signals.

A good goal is producing recordings which have a highly correlated direct sound component, yet low correlation of the diffuse sound component.

Why and what does that mean in less academic terms? In subjective terms, low DFC is commonly described as sounding "big, wide, open, ambient, spacious, diffuse, airy, voluminous, 3-dimensional," etc. Whereas high correlation might be described as sounding "more narrow, flat, 2-dimensional, but more defined, sharp and precise".

Here's a link to a page at Helmut Wittek's hauptmikrofon website with sound sample examples of the same source recorded using several different microphone setups- http://www.hauptmikrofon.de/audio/diffusefield.html. All of these are coincident microphone setups using Schoeps microphones. You can switch at any time between omni, X/Y mk4, X/Y mk41, Mid/Side (mk4/mk8), and Blumlein (mk8/mk8).

What I suggest is this- Listen through headphones. First listen with only one ear (doesn't matter which), by simply removing the headphone from your other ear. Notice that all the samples sound pretty much the same. Then listen with both ears and notice how different they sound from each other. That difference is primarily due to the differences in diffuse field correlation of each microphone setup.

You can do the same thing listening over speakers, by connecting or disconnecting one speaker. Don't sum the two channels to mono; that's doing something different, introducing other variables and complicating things. You can however send either the left or right channel to both playback channels.
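For anyone who would rather do the routing in software, here is a minimal sketch of the two options just described: sending one channel to both outputs versus summing to mono. It assumes the audio is already loaded as a NumPy array of shape (samples, 2); the function names and the random-noise stand-in for ambience are illustrative only and are not taken from the sample page.

```python
import numpy as np

def one_channel_to_both(stereo, channel=0):
    """Send a single channel (0 = left, 1 = right) to both playback channels."""
    mono = stereo[:, channel]
    return np.column_stack([mono, mono])

def mono_sum(stereo):
    """Average left and right, then send that mix to both playback channels."""
    mono = 0.5 * (stereo[:, 0] + stereo[:, 1])
    return np.column_stack([mono, mono])

def zero_lag_correlation(stereo):
    """Normalized left/right correlation with no time shift applied."""
    left, right = stereo[:, 0], stereo[:, 1]
    return float(np.dot(left, right) / np.sqrt(np.dot(left, left) * np.dot(right, right)))

# Assumption: independent noise in each channel stands in for decorrelated ambience.
rng = np.random.default_rng(0)
stereo = rng.standard_normal((48000, 2))

print("original      :", zero_lag_correlation(stereo))
print("left to both  :", zero_lag_correlation(one_channel_to_both(stereo)))
print("mono sum      :", zero_lag_correlation(mono_sum(stereo)))
```

Both routings end up fully correlated at the outputs, but the mono sum also changes the content itself (uncorrelated material drops in level and anything partly out of phase cancels), which is one way to read the "other variables" mentioned above.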

The sample is a recording of ambient sound in a university hall and pretty much all diffuse ambience with minimal direct sound of significance, making the differences in DFC very clear. By analogy, it's sort of like just the room and audience noise part of the recording, without the direct sound component from the stage or PA.

heathen · « **Reply #1 on:** August 04, 2017, 11:33:20 AM »

Quote from: Gutbucket on August 04, 2017, 11:23:30 AM

A good goal is producing recordings which have a highly correlated direct sound component, yet low correlation of the diffuse sound component.

In some situations (stack taping, a room/arena that sounds bad, etc) we might want to eliminate as much diffuse sound as possible, right? In that instance, will diffuse sound be more or less noticeable when it has low correlation?

I guess what my question amounts to is whether high or low correlation of diffuse sound can have an effect on the degree to which we perceive it in relation to the direct sound?

Gutbucket · « **Reply #2 on:** August 04, 2017, 11:53:06 AM »

A good balance between direct and diffuse is best. The most appropriate balance is somewhat subjective and there is leeway, but in many cases we don't really have as much control over the direct/ambient balance as we might like. We can really only control it by recording position - how far we are from the source. And to a much lesser extent by mic pickup pattern and angle, but only in a much more limited way. Mostly it's proximity to the source.

No diffuse sound at all is like a straight, dry SBD with no 'verb at all on it. That has good clarity (which is a big part of the battle when we are often swamped in diffuse sound), but no sense of space, location, liveness, there-ness. The closest we get to that is a stack tape close to a very loud PA. Even stack tapes close to moderately loud PAs can have significant ambient diffuse sound to them. Anything from further away than right up at the stack will have significant diffuse sound component, regardless of how we record, and that's good. We need to manage that as best we can in a crappy sounding place, but there are tricks to making it sound better than it might live.

Not so much a question of noticeable or not perhaps as much as acceptable and listenable or not. What is there sounds better or worse. Keeping the direct sound correlated and the diffuse stuff with low correlation allows us to more easily mentally separate the direct from the diffuse sound, so you can hear the direct sound and ignore the diffuse stuff even though it's still there at the same level. That's related to the "cocktail party effect" and how we can focus attention selectively in the real world or in a sufficient stereo reproduction of it.

bombdiggity · « **Reply #3 on:** August 04, 2017, 07:35:49 PM »

I think I had some recent experience with this aspect though it's still slightly a mystery what happened that night.

It was in one of my favorite rooms for sound and clarity and I have a usual spot about 5 rows back (which is actually also up since it has a steep slope).

Almost everything I've recorded there has been really nice but I actually think the last one I should have been a little further back than usual. Usually I run smaller groups there and have been very happy with the results but the last one was an 18 piece big band and that one seems a little dry (even to me). Maybe they were just running a little too much of the PA, which usually isn't needed or a factor, that night, but it has a different feel than all the others from there. I've really liked the other stuff I ran there (from duo to about 10 piece).

It's a narrow room which may be the problem. I ran the same 18 piece much closer in a very wide room and really liked that but it had a much more directional quality and I was ahead of the PA so it was a different feel.

Live and learn I think. The opener was a small group and that has more than enough diffuse sound.

There's definitely an interaction between direct and diffuse and apparently that arc moves a bit from show to show, even in the same room. It feels like I got caught a bit somewhere just around no man's land that night where the direct is satisfying but the diffuse not. OTOH the marimba and piano are really clear and up front which was not always the case. The horn sections usually rule the sound in this group.

Gutbucket · « **Reply #4 on:** August 07, 2017, 10:19:26 AM »

Quote from: bombdiggity on August 04, 2017, 07:35:49 PM

OTOH the marimba and piano are really clear and up front which was not always the case. The horn sections usually rule the sound in this group.

All speculation on my part, but clear and upfront piano would seem to indicate strong PA reinforcement. Horns are highly directional and loud, so the sound guy may have been balancing the PA mix against the sound of the horns in the room, pushing the levels of everything else and making compromises through the PA. Kind of like the PA issues balancing with loud guitar amps in a small club. The mixer has less control and the room often gets overdriven.

Gutbucket · « **Reply #5 on:** November 10, 2017, 03:21:55 PM »

Link to some more recent discussion on Correlation / Decorrelation starting at this post in the Oddball Microphone Techniques thread- https://taperssection.com/index.php?topic=96009.msg2244172#msg2244172, including a link to a technical paper on the topic.

It's somewhat related to direct/diffuse balance, yet different- more about how the direct and diffuse components of a recording are perceived. I now realize that as recordists we have more direct control over these aspects than I realized in the past, via microphone technique and post-recording mixing methods.

wforwumbo · « **Reply #6 on:** November 29, 2017, 01:09:11 PM »

This is a fascinating thread. I'd like to add a few notes, if you don't mind - both in the theory behind correlation, and in practice for how that translates to the end product on tape.

A quick preface: I did an undergraduate degree in electrical engineering, specializing in signal processing and communications. I also did a masters and am currently not far off from defending my dissertation in acoustics. I have studied room acoustics, binaural perception, and - notably, for this thread - digital room modeling using filter design.

There are three forms of correlation that should be under consideration. The most generic form, and the one that is often referred to when using the term "correlation," is sort of a running window; its output is a function of time. If we have two stationary signals x(t) and y(t), their correlation is computed by taking one signal, sliding it from negative infinity to infinity over a tertiary dummy-time axis (often notated as tau), and for each moment in tau where the sliding signal is moved computing the area overlapping between the two signals and dividing by the total energy contained in both signals. To describe this process, if we have two signals that are perfectly overlapped, their correlation will be 1. If we have two signals that have zero overlap, their correlation will be zero. And if we have two signals that are perfectly out of phase with one another, their correlation will be -1. It's worth noting that this is again instantaneous as we slide one function over another.

While this is convolution (which is what happens when you filter a signal, for example), which is the time-reverse of correlation, I always liked using this graphic when introducing the concept of correlation/convolution: https://en.wikipedia.org/wiki/Convolution#/media/File:Convolution_of_spiky_function_with_box2.gif. This is probably a more concrete explanation of what I mean by the sliding-tau axis.
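A minimal sketch of the sliding-tau calculation described above, for sampled signals. It assumes finite-length arrays and normalizes by the geometric mean of the two signal energies so the result stays between -1 and +1; the function name is made up for illustration.

```python
import numpy as np

def normalized_cross_correlation(x, y):
    """Return (lags, c) with c[k] = sum_n x[n + k] * y[n], scaled to lie in [-1, 1]."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    norm = np.sqrt(np.dot(x, x) * np.dot(y, y))
    if norm == 0.0:
        raise ValueError("both signals need non-zero energy")
    # mode="full" slides one signal across the other over every possible shift.
    full = np.correlate(x, y, mode="full")
    lags = np.arange(-(len(y) - 1), len(x))
    return lags, full / norm

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)

lags, c_same = normalized_cross_correlation(x, x)      # perfectly overlapped
lags, c_flip = normalized_cross_correlation(x, -x)     # polarity inverted
print("identical, zero shift:", c_same[lags == 0][0])
print("inverted,  zero shift:", c_flip[lags == 0][0])
```

The two printed values correspond to the "perfectly overlapped" and "perfectly out of phase" cases in the description; everything in between shows up as smaller values along the lag axis.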

The second form of correlation is more concretely known as the correlation coefficient. This is a single number, and is more likely what you are used to seeing on audio gear (i.e. a meter of some sort that slides between -1 and +1 over time). This is simply the maximum value of the above correlation function. Note that it is not an all-inclusive metric that tells the whole story; however, in practice in the field more often than not the correlation coefficient can still be useful when you don't need all of the details given by the full correlation function (for example, lots of radar operates on the correlation coefficient of bit sequences to perform signal extraction in noisy environments).
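A rough sketch of how a single-number reading could be computed from two channels. Two variants are shown because they can disagree: the peak of the normalized correlation function (as described above), and the value at zero shift taken over short windows, which is my assumption about how a typical hardware or plugin phase meter behaves. All names and window sizes here are illustrative.

```python
import numpy as np

def zero_lag_coefficient(left, right):
    """Normalized correlation of two equal-length blocks with no time shift."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    denom = np.sqrt(np.dot(left, left) * np.dot(right, right))
    return float(np.dot(left, right) / denom) if denom > 0 else 0.0

def peak_coefficient(left, right):
    """Largest-magnitude value of the normalized correlation function over all shifts."""
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    denom = np.sqrt(np.dot(left, left) * np.dot(right, right))
    full = np.correlate(left, right, mode="full")
    return float(full[np.argmax(np.abs(full))] / denom)

def running_coefficient(left, right, window=500, hop=250):
    """Zero-lag coefficient over successive windows, like a meter moving over time."""
    starts = range(0, len(left) - window + 1, hop)
    return np.array([zero_lag_coefficient(left[s:s + window], right[s:s + window])
                     for s in starts])

rng = np.random.default_rng(0)
left = rng.standard_normal(2000)
right = np.concatenate([np.zeros(10), left[:-10]])  # same signal, 10 samples late

print("zero-lag:", zero_lag_coefficient(left, right))
print("peak    :", peak_coefficient(left, right))
```

For a noise signal and a slightly delayed copy of it, the zero-shift reading sits near 0 while the peak reading sits near 1, which is a concrete instance of the single number not telling the whole story.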

And lastly, a bit more complex... is the interaural correlation (IAC) and its corresponding coefficient (IACC). The IAC function is similar to the above correlation, however it is usually taken on binaural signals and the sliding window goes only from -80 ms to +80 ms. This shortened window correlates with the latest early reflections in most traditional concert halls, and as such it's used to judge music. For the world of rock music and live rock taping, it's mostly unimportant as the metric is focused almost solely on the perception of unamplified music. I'm mostly going to ignore this one, as it is less useful in demonstrating how DFC is useful to us.

There are two forms of correlating one or a group of signals that give us useful information: cross correlation, and autocorrelation. Cross correlation of two signals is when you perform the correlation calculation on two signals that are unique (for example, a stereo signal). Autocorrelation is when you correlate a signal to itself. Let's take a look at some simple examples to help illustrate the point:

The autocorrelation function of an additive white gaussian noise (AWGN) source is zero for all points, except at tau=0, where the correlation is one. This one's important, I'll talk a bit more about this in a second.
A sine wave correlated with itself is a cosine wave (at tau = zero shift, the signal is perfectly correlated to itself; at 90 degrees out of phase, it's 0-correlated; at 180 degrees out of phase... you get the idea).
A true AWGN source correlated with anything that isn't also a noise source, is theoretically zero - except at tau = 0, where the correlation is 1. In practice we measure some correlation because no noise source is truly AWGN (as one of my professors always noted in class, "I can't wait a thousand years for you to generate true white noise!"). What the theory means is that for a true white noise source, sliding it over itself, there will be zero overlap (not even out of phase) at all points in time, except for when the signal is literally copied right on top of itself.
Any periodic signal with complex harmonics will have spikes in its autocorrelation at a period equal to its fundamental. Likewise, crossing two seemingly uncorrelated signals can show you a common fundamental; this is particularly useful for things such as modulation, or for example estimating the timing of reflections, or for another example estimating the fundamental pitch of recorded sound (this is how "de-glitched" digital pitch shifters work).
The correlation of a left/right stereo signal will change depending on how "direct" or "diffuse" (more on these terms later, too) the recorded sound is. For more direct sound, the correlation peak will be tighter and more highly-peaked (closer to a dirac delta, or a Gaussian with very low variance), like a sharp mountain with high prominence; for more diffuse sound, it will look closer to a small hill, with a lower peak and the tails on either side being more spread out.
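A few of the examples above can be checked numerically. The sketch below is only an illustration under simple assumptions: short synthetic signals at an 8 kHz sample rate, pseudo-random noise standing in for AWGN, and a naive pick-the-biggest-peak pitch estimate that is much cruder than what a real pitch shifter would use. All function names are made up for this example.

```python
import numpy as np

def autocorrelation(x):
    """Normalized autocorrelation for non-negative shifts (equals 1 at zero shift)."""
    x = np.asarray(x, dtype=float)
    full = np.correlate(x, x, mode="full")
    acf = full[len(x) - 1:]          # keep shifts 0, 1, 2, ...
    return acf / acf[0]

def estimate_fundamental(x, sample_rate, fmin=50.0, fmax=1000.0):
    """Crude pitch estimate: strongest autocorrelation peak between 1/fmax and 1/fmin."""
    acf = autocorrelation(x)
    lag_min = int(sample_rate / fmax)
    lag_max = int(sample_rate / fmin)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag

sample_rate = 8000
t = np.arange(4000) / sample_rate
rng = np.random.default_rng(1)

noise = rng.standard_normal(4000)                       # stand-in for white noise
sine = np.sin(2 * np.pi * 200 * t)                      # 200 Hz, period = 40 samples
harmonic = sum(np.sin(2 * np.pi * 200 * h * t) / h for h in (1, 2, 3))

noise_acf = autocorrelation(noise)
sine_acf = autocorrelation(sine)

print("noise: largest value away from zero shift:", np.max(np.abs(noise_acf[1:])))
print("sine : quarter, half, full period:", sine_acf[10], sine_acf[20], sine_acf[40])
print("harmonic tone, estimated fundamental (Hz):", estimate_fundamental(harmonic, sample_rate))
```

With a finite block of noise the values away from zero shift are small rather than exactly zero, which lines up with the remark about never measuring a perfect result in practice.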

Another quick note I want to make, regarding diffusion and room reflections. There are two major types of reflections in a room: specular and diffuse. Specular reflections are those that follow Snell's Law (angle of incidence = angle of reflection) when a wave is incident on a boundary. This will usually happen if the incident sound has lots of energy, and the boundary material is hard and smooth. Diffuse reflections will scatter according to a cosine law that I don't remember precisely off the top of my head (some function similar to (1-cos(theta))/2, or something of that nature), and happen when sound has less energy and/or the surface is very rough and/or soft and/or porous. If you've ever seen a time plot of a room impulse response, you can think of the early reflections as specular and the late reflections as diffuse. A sound field is considered diffuse when after excitation, you are equally probable to find energy at any point in time and space across the room. Both contribute to our perception of an auditory space: early reflections give us a sense of geometry of the room - the timing of reflections are a direct measure of mean-free (or, for simplicity's sake though this isn't precise, "average-shortest-path") distance that sound has to travel from a source, off of one or two walls, and eventually reach an ear; this gives us auditory information about how far away side walls and ceilings are from us. Diffuse reflections help to provide a sense of envelopment and arguably the size of a room. I should note, everything contained in this paragraph is VERY rough, up for debate in academic and sound engineering communities, and is not necessarily the word of law - rather, it's my understandings and beliefs of the terms in academics and in practice.

Yet another feature making this all the more complicated, is binaural perception. In hearing science, we VERY carefully measure human responses to stimuli with isolated reflections. This has led to the theory of the Precedence Effect, summing localization, echo thresholds, and so on. In short and taken with a HEAVY grain of salt, we SOMETIMES fuse the information from a reflection with its direct sound, if we receive that reflected information within a certain time window; this window changes depending on the source (speech, opera, classical music, rock music, etc; all have different thresholds). Outside of that window, we perceive the reflected sound as separable information and must process the direct and reflected sound separately. This only holds true for specular reflections; diffuse reflections are interpreted as confusing information and will lower the overall correlation. It's why I have a slight problem with your experiment of one ear vs both ears on headphones example above (though that by NO means invalidates most of your other points, many of which I agree with). There's really WAY too much literature out there regarding binaural perception of reflections and the precedence effect, so I'm mostly going to leave that on the table and if anyone has questions or wants to discuss further I'm happy to play ball.

I'm probably going to stop rambling here, as I was gearing up for a discussion about narrowband ITDs and ILDs yet recordings are broadband... it's too much to type out in addition to everything else I've mentioned above - maybe another day.

But the major point I want to add to this thread, is how it all relates to taping. Basically, from what I've seen and done on live tapes I've worked on, is that for live rock music it's best behind the board to point microphones at the stacks. Yes this reduces DFC, but mostly because you're maximizing direct sound and rejecting ALL reflections, not just diffuse sounds. This increases the correlation, not just for the diffuse field but also for early reflections.

EmRR · « **Reply #7 on:** November 30, 2017, 10:33:17 AM »

Quote from: Gutbucket on August 04, 2017, 11:23:30 AM

Here's a link to a page at Helmut Wittek's hauptmikrofon website with sound sample examples of the same source recorded using several different microphone setups- http://www.hauptmikrofon.de/audio/diffusefield.html.

Thanks for that link, rare to see good comparative samples like this. Mic and Room and Stereo Ambience are good technique examples also.

I mostly live in a recording studio (the occupants of which virtually never think about diffuse field recording), but also do live remotes which are typically authorized multitrack affairs, so mainly stage mic feeds plus whatever ambience I can add. The last small club recording I did added XY KM140's at FOH, and mid-side Beyerdynamic M130/M160 ribbons as drum OH's, which typically aren't needed in that particular small room situation. M130/160 like that allow a lot of choices for ambient control in the stage blend. My thoughts on concert taping are stone-age and catching up. Even with awareness of the equipment available nowadays, I still think of taping in the 80's with cassette decks, or the minidisc recorder I carried in the late '90's; no updated portable battery rig, and the MD just keeps on working in a pinch. Attempting stealth audience recordings in the last 20 years, I eventually dropped back to mono. I was never happy with the decorrelated sound of the two DPA 4060's I was wearing at collar, and was always fighting that in post, especially with body movement accounted for. I've done a few authorized acoustic house party remotes using a Samar MF65 ribbon set and a DPA 4060 as a horizontal surround B-format array placed up close, still well in the learning curve there. I've added a pair of MKH 30's recently that haven't seen action yet, looking forward to working with those.

Anyway, long rambling first post, thanks again for that link.

Gutbucket · « **Reply #8 on:** November 30, 2017, 11:56:18 AM »

Thanks for posting! Great contribution to the thread. It's really good to get feedback from someone well versed in the mathematical and technical aspects, areas in which I have little more than a layman's understanding.

Quote from: wforwumbo on November 29, 2017, 01:09:11 PM

The correlation of a left/right stereo signal will change depending on how "direct" or "diffuse" (more on these terms later, too) the recorded sound is. For more direct sound, the correlation peak will be tighter and more highly-peaked (closer to a dirac delta, or a Gaussian with very low variance), like a sharp mountain with high prominence; for more diffuse sound, it will look closer to a small hill, with a lower peak and the tails on either side being more spread out.

^
This aspect is perhaps the most relevant to tapers, and mostly what I've been focusing on.

Quote

Another quick note I want to make, regarding diffusion and room reflections. There are two major types of reflections in a room: specular and diffuse. Specular reflections are those that follow Snell's Law (angle of incidence = angle of reflection) when a wave is incident on a boundary. This will usually happen if the incident sound has lots of energy, and the boundary material is hard and smooth. Diffuse reflections will scatter according to a cosine law that I don't remember precisely off the top of my head (some function similar to (1-cos(theta))/2, or something of that nature), and happen when sound has less energy and/or the surface is very rough and/or soft and/or porous. If you've ever seen a time plot of a room impulse response, you can think of the early reflections as specular and the late reflections as diffuse. A sound field is considered diffuse when after excitation, you are equally probable to find energy at any point in time and space across the room. Both contribute to our perception of an auditory space: early reflections give us a sense of geometry of the room - the timing of reflections are a direct measure of mean-free (or, for simplicity's sake though this isn't precise, "average-shortest-path") distance that sound has to travel from a source, off of one or two walls, and eventually reach an ear; this gives us auditory information about how far away side walls and ceilings are from us. Diffuse reflections help to provide a sense of envelopment and arguably the size of a room. I should note, everything contained in this paragraph is VERY rough, up for debate in academic and sound engineering communities, and is not necessarily the word of law - rather, it's my understandings and beliefs of the terms in academics and in practice.

Not rigorously defined perhaps, yet essential elements applying all this to the perceptions generated from listening to an audio recording.

Quote

Yet another feature making this all the more complicated, is binaural perception. In hearing science, we VERY carefully measure human responses to stimuli with isolated reflections. This has led to the theory of the Precedence Effect, summing localization, echo thresholds, and so on. In short and taken with a HEAVY grain of salt, we SOMETIMES fuse the information from a reflection with its direct sound, if we receive that reflected information within a certain time window; this window changes depending on the source (speech, opera, classical music, rock music, etc; all have different thresholds). Outside of that window, we perceive the reflected sound as separable information and must process the direct and reflected sound separately. This only holds true for specular reflections; diffuse reflections are interpreted as confusing information and will lower the overall correlation. It's why I have a slight problem with your experiment of one ear vs both ears on headphones example above (though that by NO means invalidates most of your other points, many of which I agree with). There's really WAY too much literature out there regarding binaural perception of reflections and the precedence effect, so I'm mostly going to leave that on the table and if anyone has questions or wants to discuss further I'm happy to play ball.

Binaural perception is the critical second-half of the recording equation (playback), which of course is tied to the particulars of how we setup to make the recording as well.

A better way of doing the "one ear vs both ears with headphones" DFC demonstration would be to listen with both ears, summing the independent left and right channels to mono for the DFC=1 case. I suggested the "listening via one or both headphone cups method" simply because it is easy for anyone to do so as to get a practical sense of what all this is about, and unfortunately most software media players don't have easily accessible switching for that sort of thing. Note that this method would also be imperfect in that it would fully correlate all direct sound arrivals as well as the diffuse content, including early specular reflections. Fortunately those example files were chosen so as to consist predominantly of diffuse sound pickup, with very little direct sound or early arriving specular reflections, and most everything heard in them is diffuse. So yes an imperfect example, and your comment about the importance of binaural perception is astute, yet I think the example illustrates what we're talking about for folks who might otherwise be baffled by all this technical talk.

Quote

But the major point I want to add to this thread, is how it all relates to taping. Basically, from what I've seen and done on live tapes I've worked on, is that for live rock music it's best behind the board to point microphones at the stacks. Yes this ~~reduces DFC~~ [my edit and comment here in bold- no, it increases DFC (diffuse pickup becomes more correlated not less as the angle between microphones is reduced) - a typo here I think], but mostly because you're maximizing direct sound and rejecting ALL reflections, not just diffuse sounds. This increases the correlation, not just for the diffuse field but also for early reflections.

Yes, from further back it is advantageous to point at stacks to maximize the direct/reverberant ratio as much as possible*. Yet we needn't accept overly increased DFC at the same time. We can maintain low DFC by increasing the spacing between microphones as the angle between them is reduced. In that way, correlation of diffuse pickup (averaged over the entire sphere) is reduced, as is the correlation of all sources arriving from locations away from the central medial plane. Granted, diffuse sound arriving anywhere along the approximate medial plane will not have that reduced correlation, in the same way that direct sound arriving along the medial plane will not, but that's only a fraction of the entire diffuse energy being picked up.
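To put rough numbers on the angle and spacing trade-off, here is a sketch using two textbook idealizations, neither of which comes from this thread: (1) two coincident first-order microphones in a perfectly isotropic diffuse field, each with sensitivity a + (1 - a)·cos(theta), and (2) two spaced omnis in the same idealized field, where the correlation at a given frequency follows sin(kd)/(kd). Real rooms and real microphones will deviate, and the combined case (directional mics that are both angled and spaced) is not covered. Function names are illustrative.

```python
import numpy as np

def coincident_dfc(pattern_a, angle_between_deg):
    """Ideal diffuse-field correlation of two coincident first-order mics.

    pattern_a is the omni fraction of the pattern a + (1 - a) * cos(theta):
    1.0 = omni, 0.5 = cardioid, 0.0 = figure-8.
    """
    a = pattern_a
    b = 1.0 - a
    cos_phi = np.cos(np.radians(angle_between_deg))
    return (a * a + b * b * cos_phi / 3.0) / (a * a + b * b / 3.0)

def spaced_omni_correlation(spacing_m, freq_hz, speed_of_sound=343.0):
    """Ideal diffuse-field correlation of two spaced omnis at one frequency."""
    kd = 2.0 * np.pi * freq_hz * spacing_m / speed_of_sound
    return np.sinc(kd / np.pi)       # np.sinc(x) is sin(pi*x)/(pi*x)

# Narrowing the angle between coincident cardioids raises diffuse correlation.
for angle in (120, 90, 60, 30, 0):
    print(f"cardioid pair at {angle:3d} deg:", coincident_dfc(0.5, angle))

# Spacing lowers it again, more so at higher frequencies.
for spacing in (0.0, 0.17, 0.5, 1.0):
    print(f"omnis {spacing:4.2f} m apart at 500 Hz:", spaced_omni_correlation(spacing, 500.0))
```

Under these assumptions a coincident figure-8 pair at 90 degrees comes out at zero diffuse correlation and any coincident omni pair at one, which matches the general ordering of the setups discussed earlier in the thread.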

Good direct/reverberant ratio is the most important thing to my way of thinking. In that way it is of more fundamental importance than keeping DFC low, but I'd argue that low DFC is not far behind in the hierarchy of what makes for a good subjective listening experience with live music recordings.

*I try to avoid statements such as "rejecting ALL reflections", as first order microphones are simply not that directional when used at a significant distance from the source, and it's a common misconception to think of microphone pickup pattern as somewhat analogous to the field of view of a camera, cropping off everything outside of the frame. That's even more the case for a distant recording position, in which case a predominant portion of direct reflections arrive from directions not significantly different than the direct sounds themselves (from within the poorly analogous "cropped window").

wforwumbo · « **Reply #9 on:** December 01, 2017, 01:06:36 PM »

A semantic error on my part - I meant that pointing at stacks will be minimizing both specular and diffuse reflections. Though your point here (and most of your other points) still stand.

Either way, I'm excited this thread exists and will be contributing more in the near future as I keep chewing over your thoughts in the back of my head. Also more than happy to explain any concepts to people that have questions. I can also generate plots and short animations in matlab for anyone that wants to more directly see these concepts in a more concrete and easily-understood fashion.

Gutbucket · « **Reply #10 on:** December 05, 2017, 05:51:06 PM »

Quote from: wforwumbo on December 01, 2017, 01:06:36 PM

your point here (and most of your other points) still stand.

BTW, I very much welcome discussion of any of those which don't stand! In this thread and elsewhere on the site. We're all figuring this stuff out as we go and learning from each other here, which is partly what makes it a good community.
