
Author Topic: Machine Learning (AI) to quiet talkers  (Read 754 times)


Offline randy3732

  • (0)
  • Taperssection Newbie
  • *
  • Posts: 2
Machine Learning (AI) to quiet talkers
« on: May 25, 2021, 10:42:23 PM »
Has anyone used or heard of any program(s) or online services to quiet talkers (people talking) that ruin live recordings?

I've successfully used Machine Learning video noise reduction and resolution enhancement and was thinking Machine Learning could be used to suppress the voices of people talking while a band is playing. There are a few programs and online services that claim to separate drums, bass, guitars, and vocals, but none I found focus on keeping all the music and quieting non-musical sounds such as talkers.

Things I'd be concerned with:
Keeping the music and vocals the same.
Keeping the ambiance the same.
Not sounding processed or robotic.
Only quieting the people talking.

I have at least 50 recordings I'd like to release where, despite my best efforts recording in front of a PA, some people nearby started talking loudly and ended up on the recording.

I would be willing to pay $100 for every dB of talker noise reduction. Surely there's a smart group of designers that could make it possible.

Randy3732

Offline opsopcopolis

  • (3)
  • Taperssection All-Star
  • ****
  • Posts: 1779
Re: Machine Learning (AI) to quiet talkers
« Reply #1 on: May 26, 2021, 03:23:58 PM »
I'm sure somebody much smarter than me will have a better answer, but my understanding is that what you're asking for would be very difficult for a few reasons. From what I understand of the tech used in those AI instrument separators, they use a combination of frequency and spatial separation to capture a sort of 'image' of each instrument, which is then used to pull that instrument out of the mix. General noise reduction uses a similar process to capture an 'image' of the noise, and often needs a decent stretch of relative silence to capture it successfully. In both cases you need fairly consistent, clear 'images' of the sound you're trying to separate out, which isn't really available in this scenario.
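To make the noise-'image' idea concrete, here's a minimal NumPy sketch of classic spectral subtraction (not any particular product's algorithm; the frame sizes and floor value are arbitrary illustrations): average a magnitude spectrum from a noise-only stretch, then subtract it from each frame of the full recording.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Naive STFT: Hann-windowed frames -> complex half-spectra."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.array(frames), axis=1)

def spectral_subtract(noisy, noise_profile, floor=0.05):
    """Subtract an averaged noise-magnitude 'image' from every frame,
    keeping the original phase and clamping to a spectral floor to
    limit musical-noise artifacts."""
    mag, phase = np.abs(noisy), np.angle(noisy)
    clean_mag = np.maximum(mag - noise_profile, floor * mag)
    return clean_mag * np.exp(1j * phase)

# Build the noise 'image' from a stretch of relative silence, then
# apply it to the whole recording (synthetic stand-ins here).
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(48000)          # stand-in for crowd din
signal = np.sin(2 * np.pi * 440 * np.arange(48000) / 48000) + noise

noise_profile = np.abs(stft(noise)).mean(axis=0)  # averaged magnitude
cleaned = spectral_subtract(stft(signal), noise_profile)
```

The catch is exactly what the post says: this works when the noise spectrum is stationary enough to average into one profile, and intermittent talking is anything but.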
Mics: Berliner CM-33, CA-14 card, CA-11 card & omni, AT-853, Sony ECM-907
Recorders: Tascam DR-60D, Tascam DR-05, Sony Hi-MD

Offline hoserama

  • (1)
  • Taperssection Member
  • ***
  • Posts: 351
  • Gender: Male
Re: Machine Learning (AI) to quiet talkers
« Reply #2 on: May 26, 2021, 04:56:18 PM »
You could do a mix of AI plus good old-fashioned spectral repair.

Split the audio into the stems and then do spectral repair. I imagine the vocals/drums/bass algorithms won't pick up the chatter (although the vocals might). Then you do spectral repair on the remainder, then reintegrate together.

That way, the spectral repair leaves the untouched components alone. Transients like drums would remain relatively unaffected.
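The split-repair-reintegrate flow above can be sketched numerically like this (hedged heavily: the stems here are synthetic stand-ins for what a separator such as Spleeter or Demucs would return, and a simple gain drop stands in for real spectral repair on the residual):

```python
import numpy as np

# Stand-in stems; in a real workflow these come from an AI separator.
rng = np.random.default_rng(1)
n = 48000
stems = {name: 0.1 * rng.standard_normal(n)
         for name in ("vocals", "drums", "bass", "other")}

def attenuate_db(x, db):
    """Stand-in for spectral repair: drop the level of the residual
    stem, which is where audience chatter usually ends up."""
    return x * 10 ** (-db / 20)

repaired = dict(stems)
repaired["other"] = attenuate_db(stems["other"], 12.0)

# Reintegrate: untouched stems (and their transients) pass through
# unchanged; only the repaired residual is altered.
mix = sum(repaired.values())
```

The point of the sketch is the routing, not the math: repair only touches the stem the chatter landed in, and everything else is summed back exactly as separated.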
Audio: Countryman B3 + AT853(hypers/cards/subcards) + SBD feeds
Wireless Receivers: Lots of those
Antennas: Lots of those
Cables: Lots of those
Recorders: Cymatic Utrack24, Cymatic LR16, RME Multiface, Zoom F8, (3) Tascam 680, (2) Tascam 2D, Zoom H6, Zoom H4n, and a graveyard of irivers/nomads/minidiscs.

Offline Gutbucket

  • record > listen > revise technique
  • (15)
  • Needs to get out more...
  • *****
  • Posts: 14228
  • Gender: Male
  • Gunther Theile nailed it!
Re: Machine Learning (AI) to quiet talkers
« Reply #3 on: May 26, 2021, 07:14:58 PM »
Moore's law marches ever onward and many compromised recordings await. Mark my words. We'll get there, just not quite yet. Look at what we can do now that was unimaginable a couple decades back. Yet even once we do, it's generally good advice to not buy the first model year of any new car.

musical volition > vibrations > voltages > numeric values > voltages > vibrations> virtual teleportation time-machine experience
Better recording made easy - >>Improved PAS table<< | Made excellent- >>click here to download the Oddball Microphone Technique illustrated PDF booklet<< (note: This is a 1st draft, now several years old and in need of revision!  Stay tuned)

Offline Numpy

  • + pace, amore e felicita +
  • (0)
  • Needs to get out more...
  • *****
  • Posts: 13870
  • Gender: Male
  • I support the Sierra Club, but don't represent it.
    • Oceana North America
Re: Machine Learning (AI) to quiet talkers
« Reply #4 on: May 27, 2021, 07:29:22 PM »
Tasers are illegal in NJ ...
Shock collars, however....     8)
"Peace is for everyone"
        - Norah Jones

"Music is the drug that won't kill you"
         - Fran Lebowitz

Offline hoserama

  • (1)
  • Taperssection Member
  • ***
  • Posts: 351
  • Gender: Male
Re: Machine Learning (AI) to quiet talkers
« Reply #5 on: May 28, 2021, 10:29:21 AM »
Tasers are illegal in NJ ...
Shock collars, however....     8)

The high pitched YIP when you shock somebody is easy to remove via Izotope RX, but the broadband bzzzzzzz as the electricity is going through them is a bit more challenging.
Audio: Countryman B3 + AT853(hypers/cards/subcards) + SBD feeds
Wireless Receivers: Lots of those
Antennas: Lots of those
Cables: Lots of those
Recorders: Cymatic Utrack24, Cymatic LR16, RME Multiface, Zoom F8, (3) Tascam 680, (2) Tascam 2D, Zoom H6, Zoom H4n, and a graveyard of irivers/nomads/minidiscs.

Offline checht

  • (5)
  • Taperssection Member
  • ***
  • Posts: 412
  • Gender: Male
  • Old and in the Way
Re: Machine Learning (AI) to quiet talkers
« Reply #6 on: May 29, 2021, 03:08:09 PM »
You could do a mix of AI plus good old-fashioned spectral repair.

Split the audio into the stems and then do spectral repair. I imagine the vocals/drums/bass algorithms won't pick up the chatter (although the vocals might). Then you do spectral repair on the remainder, then reintegrate together.

That way, the spectral repair leaves the untouched components alone. Transients like drums would remain relatively unaffected.

This is my current workflow using RX-8. Split out vocal stem, use spectral repair on it, mix back in. Easy to remove yells and whistles, but talking is really tough.

Additional benefit is that it makes it easy to bring up the vocal on a recording that sounds distant. My 80's km84i > D5 recordings are improved by a bit of this.

At the same time, best way to quiet talkers is to not let talking get on the recording. Currently mastering The Band 12/31/83 opener for the Dead, recorded from the taper section, and there's not one talk/yell/whistle on the whole thing. Really hitting me how different things were back then. Sigh.
Schoeps MK41s > nbob KCY >
Naiant PFA 60v > Sound Devices MP-6 II  or  Naiant IPA > Roland R-07
Recordings at LMA: https://archive.org/search.php?query=subject%3A%22Chris+Hecht%22&sort=-date

Offline randy3732

  • (0)
  • Taperssection Newbie
  • *
  • Posts: 2
Re: Machine Learning (AI) to quiet talkers
« Reply #7 on: May 30, 2021, 04:08:00 AM »
Spectral repair works well to quiet feedback and the occasional too-loud "YEA!!!". But on the recordings where I've tried it on talkers, I can't even see the talking in the spectrogram.

Thank you all for the comments. Hopefully I've planted a seed for someone really smart to figure out an AI solution.

Offline wforwumbo

  • (6)
  • Taperssection Regular
  • **
  • Posts: 141
Re: Machine Learning (AI) to quiet talkers
« Reply #8 on: June 03, 2021, 04:24:56 PM »
I did my doctoral thesis on a related topic - AI for reflection identification, treating the reflections as noise.

It’s a complex problem. Even if you try to remove things in the spectral domain (common with iZotope), you need some decent model of the noise (in this case, a talker) and noise is highly varied.

Video and image processing is a different domain altogether. For one thing, it generally has far more funding thrown at it and a more detailed body of study. For another, light and sound operate very differently, especially in the digital domain. I've spoken extensively about some of these differences here; I really wish the answer were as simple as "the same concepts in vision apply to audio," but they really do not.

The tech may perhaps get there someday, but I’m under 30 and it’s unlikely to happen in my lifetime.
North Jersey native, Upstate veteran, Bay Area resident

2x Schoeps mk2
2x Schoeps mk21
2x Schoeps mk4
1x Schoeps ccm8

2x Schoeps cmc5
Nbob KCY
Naiant PFA

Sound Devices Mixpre-6

Offline Gutbucket

  • record > listen > revise technique
  • (15)
  • Needs to get out more...
  • *****
  • Posts: 14228
  • Gender: Male
  • Gunther Theile nailed it!
Re: Machine Learning (AI) to quiet talkers
« Reply #9 on: June 03, 2021, 06:08:20 PM »
I think a reasonable approach will be similar to a judicious application of noise reduction.  If we expect to achieve absolute elimination of the problem we're just setting ourselves up for disappointment.  But I suspect a multi-pronged approach will lead to reasonably beneficial results and incremental improvements over time.

It may happen partly through a focus on identifying the noise so as to isolate and reduce it, as problematic as that is for a hard-to-define, variable noise signal, as wforwumbo describes from a high level of expertise.  And partly, probably more fruitfully in the near term, by identifying, extracting, and amplifying the desired signal, using an approach like the Music Rebalance function in iZotope RX 8 that checht describes using to isolate and enhance vocals.  Take something like that and do a multiple-stem extraction of all the desirable elements, perhaps including a baseline isolated ambient/reverberant stem.  That could all be super useful yet somewhat over-isolated and artificial sounding, so use the original recording, talking noise included, as a bed and seed it with the extracted stem elements, balancing it all to best effect to achieve an increase in desired signal over unwanted noise.
I think a significantly useful reduction in chatter could be achieved that way, especially as such Rebalance-like signal-extraction tools continue to improve.

It's just the conceptual shift from identifying the problematic noise signal to identifying and amplifying all the desired signals, if the latter turns out to be easier.

With regards to identification and isolation of the talking noise, I see specific talkers (relatively easily identified as such by a human listener as having diction from a specific location) as being a different, although related issue to the general conversational din and murmur of massed talking, which effectively arrives diffusely from all directions.
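At bottom, the bed-and-stems blend described above is gain staging; here's a toy NumPy sketch (the signals and gain values are arbitrary illustrations, not a recommended recipe):

```python
import numpy as np

def rebalance(bed, stems, bed_gain_db=-6.0, stem_gain_db=0.0):
    """Use the original recording, chatter and all, as a bed ducked a
    few dB, and seed it with the extracted stems at full level.  The
    bed masks over-isolated artifacts in the stems and keeps the
    ambience; the net effect is a lift in music-to-chatter ratio."""
    g_bed = 10 ** (bed_gain_db / 20)
    g_stem = 10 ** (stem_gain_db / 20)
    out = g_bed * bed.astype(float)
    for stem in stems:
        out = out + g_stem * stem
    return out

# Toy signals: the 'bed' is music plus chatter; the stem is music only.
t = np.arange(48000) / 48000
music = np.sin(2 * np.pi * 220 * t)
chatter = 0.3 * np.sin(2 * np.pi * 180 * t + 0.5)
bed = music + chatter
blended = rebalance(bed, [music])
```

Because the bed still carries the chatter at reduced level, this never fully removes it; it only shifts the balance, which matches the incremental-improvement framing above.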
musical volition > vibrations > voltages > numeric values > voltages > vibrations> virtual teleportation time-machine experience
Better recording made easy - >>Improved PAS table<< | Made excellent- >>click here to download the Oddball Microphone Technique illustrated PDF booklet<< (note: This is a 1st draft, now several years old and in need of revision!  Stay tuned)

 

© 2002-2021 Taperssection.com