
Author Topic: A.I. audio stem software  (Read 3527 times)


Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
A.I. audio stem software
« on: November 20, 2024, 09:44:47 AM »
 A.I. audio stem software.

There are several audio programs that take an audio wav file and create 4 stems - vocals, bass, percussion and other instruments - such as iZotope RX Music Rebalance and Ultimate Vocal Remover.
SpectraLayers takes an audio wav file and creates 7 stems - vocals, bass, percussion, brass, piano, guitar and other instruments.

This thread is for all A.I. audio stem software.


iZotope RX Music Rebalance (RX MR) - You need to buy RX 11 Standard, which normally costs $399 and is now on sale for $200 at Sweetwater.

Sweetwater has some upgrade versions on sale.


Ultimate Vocal Remover (UVRT) - UVRT is free, though it really needs a high-end graphics card with lots of RAM to run fast.


SpectraLayers (SL) - Regular cost $299, on sale 11/20/2024 for $179.

SL has many more unmix options than RX or UVRT, such as "unmix components", which takes the drum layer and creates more stems for bass drum, snare and cymbals, and "unmix crowd noise", which takes crowd noise out of a recording. You can create layers in SL and even take material from another layer and add it to the layer you are working in. Very powerful.

YouTube video of Beat Deconstruction Unmixing: (Video shows some of the power of SL)


From Music Radar: "iZotope RX 11 vs Steinberg SpectraLayers 11: which is the best spectral editor?"


Have used these stem programs for about a year now. Was mainly using RX Music Rebalance
as a tool to make a matrix recording of both aud mics and SBD feed. Would split the SBD feed
into vocals, bass, percussion and other instruments stems. Load the SBD feed, then the aud mics, then
each stem into an audio editor. You have 6 tracks to work with now instead of two. If vocals are
low in the aud mics, you can use the SBD vocal stem and increase the vocal volume. Or if the bass is
not there, you can use the SBD bass stem and increase its volume. The possibilities here are
almost endless. You almost have to change the way you think about processing your audio files.
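
The matrix idea above comes down to giving each track its own gain and summing the results. Here is a minimal sketch of that arithmetic in plain Python. The track names, sample values and gain settings are hypothetical, and a real mix would be done in an audio editor or DAW rather than on Python lists.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def mix_tracks(tracks, gains_db):
    """Sum equal-length mono tracks after applying a per-track gain in dB.

    Tracks without an entry in gains_db are mixed at 0 dB (unchanged).
    """
    lengths = {len(samples) for samples in tracks.values()}
    if len(lengths) != 1:
        raise ValueError("all tracks must have the same number of samples")
    mix = [0.0] * lengths.pop()
    for name, samples in tracks.items():
        gain = db_to_gain(gains_db.get(name, 0.0))
        for i, sample in enumerate(samples):
            mix[i] += sample * gain
    return mix


# Hypothetical toy data: three samples per track, in the range -1.0..1.0.
tracks = {
    "aud": [0.10, 0.20, -0.10],
    "sbd_vocals": [0.05, 0.00, -0.05],
    "sbd_bass": [0.02, 0.02, 0.02],
}
# Bring the vocal stem up 6 dB and the bass stem down 3 dB.
gains_db = {"aud": 0.0, "sbd_vocals": 6.0, "sbd_bass": -3.0}
mix = mix_tracks(tracks, gains_db)
```

Summing tracks can push peaks past full scale, so the level of the combined mix still needs checking before export.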

RX MR seems to split an SBD feed better than an audio mic source. It does
not do a bad job on the audio mic source, though the stems are not as clean as the SBD feed stems.


Does it work?

It sure does. Is it perfect? No. Vocals and bass seem to come out the best in RX MR,
and drums can come out well, though some of the other instruments bleed into the
drum stem. The "Other instruments" stem is where all the rest of the music is. If you have a
guitar you want to bring up, there is no separate stem for it and you have to play with the
"Other instruments" stem to get it to come out. Some people think there is a metallic sound
in the vocal stem from RX MR.

Ultimate Vocal Remover (UVRT) separates the vocals better than RX MR. There are models in
UVRT that will split the audio file into 4 stems. You can even separate vocals into lead vocals
and backup vocals. One disadvantage of UVRT is that it outputs 16 bit audio stems; I was not able to select
24 bit audio stem files. In RX MR you can get 24 bit audio stem files.


What kind of PC do you need to run this software?

A high end PC.

To process one 60 minute SBD feed with RX MR on my old Windows 7 DAW took 1 hour 17 minutes.
3-4 years ago bought a Windows 11 i7 laptop with lots of RAM, a fast CPU and a built-in graphics card
with 2 GB of memory, which took the 60 minute SBD feed down to 18 minutes. Just bought a Win 11 i7 PC with 32 GB of RAM, a very
fast CPU and a good graphics card with 8 GB of memory. Now RX MR processes the 60 minute SBD feed in 8 minutes.

If I record two 70 minute sets, I have 4 audio files per source. So I have to run each of the 4 SBD files
through RX MR to get stems for the whole show. Processing all 4 SBD files on my new PC now takes
less than 30 minutes. Have to do this before I master the show, so it means more work and processing.
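
As a rough check on the timings above, the sketch below scales the reported figures for a 60 minute file to a show of two 70 minute sets. It assumes processing time grows in proportion to audio length, which is an assumption and not something measured in this post. The machine labels are shorthand for the three PCs described above.

```python
# Reported minutes needed for RX MR to process one 60 minute SBD file.
reported_minutes = {
    "old Windows 7 DAW": 77,
    "laptop, 2 GB graphics memory": 18,
    "new PC, 8 GB graphics memory": 8,
}


def estimate_minutes(audio_minutes, reference_minutes, reference_audio=60):
    """Scale a reference timing linearly to a different amount of audio."""
    return audio_minutes * reference_minutes / reference_audio


show_audio = 2 * 70  # two 70 minute sets
for machine, minutes in reported_minutes.items():
    print(f"{machine}: about {estimate_minutes(show_audio, minutes):.1f} minutes")
```

Under that assumption the new PC comes out at roughly 19 minutes for the whole show, which fits the "less than 30 minutes" figure once loading and exporting four separate files is added on top.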

With UVRT you really need a graphics card with lots of memory or the software runs very slowly. On
the new PC, UVRT can process a 60 minute SBD feed in 8 minutes, though the output is only 16 bit audio files,
as mentioned above.

SL - unmix song - 6 layers: piano, bass, guitar, drum, vocal and other.
20 min song 24/96 wav file  - 6 stems  - 35 minutes processing time.


Where does this leave us with this new software?

If this software can really split a file into 4 or 7 stems, then the possibilities are endless. You could even take an old SBD and add whatever is missing. This software will get better over time too.

Does anyone else have experience with this or would like to chime in about it?


Link to article about different stem tools: 
« Last Edit: November 20, 2024, 04:09:53 PM by phil_er_up »
Everyday is a gift. Enjoy each one!
Forward motion bring positive results.

Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
Re: A.I. audio stem software
« Reply #1 on: November 20, 2024, 09:44:57 AM »
Some observations about SL benchmarks and capabilities.

unmix song - 6 layers: piano, bass, guitar, drum, vocal and other.

20 min song 24/96 wav file - 6 stems 2 GB 24 bit each - 35 minutes to process

Piano layer - works well for electric, not sure about acoustic
Bass layer - works well, full sounding layer with a good mix from the SBD feed
Guitar layer - is there, though it seems to be cut off a little at certain frequencies
Drums layer - kick, bass and hi-hat mostly covered. Cymbals cut off at very high end in other instrument layer
Other layer - weird artifacts from the other layers.
Voice layer - works well, maybe some frequencies cut off from other layers.

The "other layer" contains some data from other layers and would be needed for the song to sound complete.


Unmix components:

60 min drum track 24/96 wav file - 50 minutes to process

Creates 3 layers - tonal, transient and noise

kick Drum layer
tonal layer- snare drum
transient layer - cymbals

The bass drum layer picks up most of the kick, though it is not completely clean, kick-only audio. The snare layer
is done pretty well, though it has artifacts from other layers. The cymbals layer does a good job. Then you can
even use "unmix components" again and separate the cymbals into hi-hat and the rest of the cymbals in another layer.


unmix crowd noise:

60 min SBD set 24/96 wav - 1h19m to process with unmix crowd noise.

The unmix crowd noise did work, though it does pick up some of the instruments in the
crowd layer. In between songs it did a good job and took out most if not all
of the crowd noise. Some screams and clapping were taken out, though music
went with them too. A hand drum was mistaken for crowd noise. Not sure this works
well enough to be effective without editing to put music back in or take out screams/whistles.


unmix chorus:

60 min SBD set 24/96 wav - 53m to process with unmix chorus.

Took a previous vocal stem with lead singer and harmony by the band and
ran it through unmix chorus. Output layers lead and backup were created.
Lead vocal came out pretty well, though sometimes it would confuse
backup and lead and one would bleed into the other. Not sure this works well
enough to be effective without doing more editing to add whatever is missing
to another layer or the layer you are working in.

unmix song

9 min song 16/44 wav fareed - 2 acoustic guitars - 5 min to process

Only did 2 layers, guitar and other. The guitar layer did not separate the 2
acoustic guitars, and the other layer contained a fair amount of guitar when
both guitars were playing loud.


unmix song

19 min song 16/44 wav flex band - 3 piece jazz combo - piano, bass and drums stems - took 20 minutes to process.

Output 4 layers - other, piano, bass and drums. Bass layer had almost nothing
in it music wise. Drums were pretty well represented, though the high end seemed to be
cut off. Acoustic piano was well done, though it would bleed into the "other" layer.
Other layer contained music bleed from piano and drums.

spectralayers 11 commands

open wav file

module > unmix song

file > Export > Layers (creates a .wav file for each layer opened by the unmix option; then import the layers into your DAW)
« Last Edit: November 20, 2024, 11:34:26 AM by phil_er_up »

Offline mccordo

  • Trade Count: (14)
  • Taperssection Member
  • ***
  • Posts: 645
  • Gender: Male
  • Area Man
Re: A.I. audio stem software
« Reply #2 on: November 20, 2024, 11:05:25 AM »
Thanks for posting this. I've been considering a software upgrade and this was just the type of info I needed to help make my decision. Looks like Izotope RX is the answer for me.
Mics: DPA 2012 Cards, DPA 4061-CORE Omnis, AKG ck63 Hypercards, AKG ck61 Cards, 2x AKG nBob Actives, 2x AKG C460B,MJE-384K Roadster (Michael Joly modded caps), Audix M1280 Hypercards
Pres: Grace Design Lunatec V2, SoundDevices MixPre, Edirol UA-5, Church Audio CA-9200, Naiant PIPsqueak
Recorders: SoundDevices MixPre-6, SoundDevices MixPre-3, 2x Tascam DR-100mkII, Zoom F3, Sony PCM-M10, Sony PCM-A10, Deity PR-2

Offline checht

  • Site Supporter
  • Trade Count: (7)
  • Taperssection Member
  • *
  • Posts: 840
  • Let's meet at alternate foods at the break
Re: A.I. audio stem software
« Reply #3 on: December 04, 2024, 03:45:56 PM »
OK, moved my post from another thread here:

I separate out vocals on most every recording. For SBD vocal feeds, it's a great way to debleed.
For aud 2 track recordings, I separate out vocals and mix them back in in parallel to add presence.
Really cures that 'distant' sound that plagues aud recordings. I find it especially helpful on recordings made with Neumann kmi84s from back in the day.

I've been using RX, currently on v9. Today I tried Ultimate Vocal Remover and I'm shocked by how much better it works. Output is much cleaner in terms of not cutting off the beginning of phrases, and not including sax or guitar that would make it through RX. Also, when listening to it solo, the UVR track is drastically more musical and natural sounding. To me, RX has always sounded kinda alien on its own.

uvr runs much faster than rx on my mac mini m3. And it's free.

What do others use stem separation for, what software do you use, and what are your thoughts?
Anyone else compare outputs?

Just uploaded samples to dropbox. There's the original sbd vocals feed, full of stage bleed. Then 2 configs of uvr, then rx music rebalance.
MK41s, MK22s; Vanguard V1s matched pair
Schoeps kcy5, nbob actives
Naiant PFA 60v, PFA 48v, IPA
Sound Devices MP-6II; Sony PCM-A10

Recordings at LMA

Online nulldogmas

  • Trade Count: (6)
  • Taperssection All-Star
  • ****
  • Posts: 1821
    • How I Escaped My Uncertain Fate
Re: A.I. audio stem software
« Reply #4 on: December 04, 2024, 06:02:55 PM »
What do others use stem separation for, what software do you use, and what are your thoughts?

I use stem separation mostly to rebalance different instruments/vocals, or very occasionally to apply EQ to one but not the others. (Say, if a kick drum is too loud but I don't want to reduce the bass guitar in the same frequency range.) I don't use it on most recordings, but maybe 20-30% of them?

I've tried both RX and UVR and agree that UVR does a better job isolating vocals, though they're both very good for most uses. (I use the MDX23C-InstVoc HQ setting, at Rob G's suggestion.) I haven't tried UVR for multi-stem separation yet — still waiting to hear reports on which settings people find work best.

In either case, I usually export the stems and then remix them in Audacity, where it's easier to play with the sliders on the fly.

Online jefflester

  • Trade Count: (2)
  • Taperssection All-Star
  • ****
  • Posts: 1681
  • Gender: Male
Re: A.I. audio stem software
« Reply #5 on: December 04, 2024, 06:28:50 PM »
I did a show this weekend with my band, individual instruments > F8, only output available from the board was an FX send and I got a lot more piano and acoustic guitars than vocals so I want to try UVR to pull out just the vocals. What should I use for "Overlap"? DLing the "MDX23C-InstVoc HQ" just now.
DPA4061 HEB -> R-09 / AT943 -> CA-UGLY -> R-09
AKG CK63 -> nBob actives -> Baby NBox -> R-09/DR2d
AKG CK63 -> AKG C460B -> Zoom F8/DR-680MKII
Line Audio CM4/Superlux S502/Samson C02/iSK Little Gem/Sennheiser E609/Shure SM57 -> Zoom F8/DR-680MKII (multitracked band recordings)

Offline robgronotte

  • Trade Count: (0)
  • Taperssection Member
  • ***
  • Posts: 369
Re: A.I. audio stem software
« Reply #6 on: December 05, 2024, 04:08:48 AM »
I use UVR5 often, mostly to remove crowd noise from instrumental portions of songs.  I clip out the portion I want to clean up, run it through UVR5, and then patch back in the clean option.  As noted above, the best one I have found is "MDX23C-InstVoc HQ", which has to be downloaded separately (also free and very easy).
I don't know what the "Segment Size" and "Overlap" options refer to, so I have just left them at the default, which was 256 and 8.  If anyone knows more about options on UVR5 I would love to understand it better.
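
The clip, process and patch-back workflow described above can be sketched in a few lines of Python. This is only an illustration on plain lists of samples. The function name, the sample-index arguments and the optional linear crossfade at the patch edges are all assumptions for the example, not anything UVR5 does itself.

```python
def patch_segment(original, processed, start, fade=0):
    """Return a copy of `original` with `processed` patched in at `start`.

    `fade` samples at each edge of the patch are linearly crossfaded
    with the original so the splice does not click.
    """
    end = start + len(processed)
    if start < 0 or end > len(original):
        raise ValueError("patch does not fit inside the original")
    if 2 * fade > len(processed):
        raise ValueError("fade is too long for this patch")
    out = list(original)
    n = len(processed)
    for i, sample in enumerate(processed):
        weight = 1.0
        if fade:
            # Ramps up over the first `fade` samples, down over the last ones.
            weight = min(1.0, (i + 1) / (fade + 1), (n - i) / (fade + 1))
        out[start + i] = weight * sample + (1.0 - weight) * original[start + i]
    return out


# Toy example: a cleaned 6-sample clip patched in at sample 2.
original = [1.0] * 10
cleaned = [0.0] * 6
patched = patch_segment(original, cleaned, start=2, fade=2)
```

With `fade=0` the clip simply replaces the original samples. In practice the same splice is usually done by hand in an audio editor.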

Actually I almost always run in "Ensemble Mode" [with Max Spec/Min Spec setting] which runs the file through several different filters at the same time.  In addition to the one mentioned above, I use "VR Arc1_HP-UVR" and "Demucs v4|htdemucs" (both included in main download).  It barely takes any longer than using the MDX23C alone, as that one is very slow compared with most of the algorithms.   I get 4 outputs - one for each of the 3 processes, plus a combined version.  If the MDX23C version doesn't sound perfect to me, I will listen to the others and possibly use one of those instead.
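
For anyone wondering what a "Max Spec/Min Spec" style of combining could look like, here is a small sketch. It assumes the names mean keeping, for each time/frequency bin, the largest or the smallest magnitude across the model outputs. That reading is an assumption, this is not UVR5's code, and the toy numbers are made up.

```python
def combine_magnitudes(spectrograms, mode="max"):
    """Combine equally shaped magnitude spectrograms bin by bin.

    Each spectrogram is a list of frames, and each frame is a list of
    magnitudes. `mode` is "max" or "min".
    """
    if mode not in ("max", "min"):
        raise ValueError("mode must be 'max' or 'min'")
    pick = max if mode == "max" else min
    return [
        [pick(bins) for bins in zip(*frames)]
        for frames in zip(*spectrograms)
    ]


# Hypothetical outputs from two models: 2 frames of 2 frequency bins each.
model_a = [[1.0, 0.2], [0.0, 0.5]]
model_b = [[0.5, 0.4], [0.1, 0.3]]
max_spec = combine_magnitudes([model_a, model_b], mode="max")
min_spec = combine_magnitudes([model_a, model_b], mode="min")
```

Taking the maximum keeps anything that any one model let through, while taking the minimum keeps only what all models agree on.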

If anyone else wants to use UVR5 for similar results, feel free to ask any questions and I could give some more detailed info about what I do and the results I get.

Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
Re: A.I. audio stem software
« Reply #7 on: December 05, 2024, 06:07:41 PM »
Was not sure if this thread would gain any traction so have not posted any additional info. No expert in this, just posting observations/thoughts. Will not say anything about the below files till others listen and give their opinion.
(Links good for a week)



1) Took 2 songs (24/96 SBD wav files) and created 4 stems - vocal, bass, drum and "other instruments" - in RX, UVR and Steinberg SpectraLayers (SL).

2) Linked original 24/96 wav file.

3) Created 6 stems of the same original file in SL (6 stems Vocal, bass, drum, other, piano and guitar) for others to compare how it splits the piano and guitar into separate stems.

Did no processing except converting the 16 bit UVR output to 24/96 files so I could compare them in my DAW. Created FLACs from the wavs to conserve file space.
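
For reference, going from 16 bit to 24 bit is just padding: each sample value is shifted up by 8 bits and the new low byte is zero, so the level stays the same and no detail is added. Below is a minimal sketch of the bit-depth part only. The function name is made up for the example, and the sample rate change to 96 kHz needs a resampler in an audio editor, which is not shown.

```python
def pcm16_to_pcm24(samples):
    """Pack signed 16 bit sample values as little-endian signed 24 bit PCM."""
    out = bytearray()
    for sample in samples:
        if not -32768 <= sample <= 32767:
            raise ValueError("sample outside the 16 bit range")
        # Shift left by 8 bits: same level relative to full scale,
        # with the added low byte set to zero.
        out += (sample << 8).to_bytes(3, "little", signed=True)
    return bytes(out)


# Five 16 bit values, including both extremes, become 15 bytes of 24 bit PCM.
packed = pcm16_to_pcm24([0, 1, -1, 32767, -32768])
```

Because the padded bits carry no information, a 24 bit file made this way holds exactly the same audio as the 16 bit original.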


 2 songs - 24/96 sbd wav files and created 4 stems - vocal, bass, drum and "other instruments" in RX, UVR and Steinberg Spectral Layers (SL)

Original 24/96 wav file -

RX - 4 stems - vocal, bass, drum and "other instruments" -3 min 9 seconds to create 4 stems

UVR - 4 stems - vocal, bass, drum and "other instruments" - 2 min 52 seconds to create 4 stems (Used "htdemucs" model to process the 4 stems)

SL - 4 stems - vocal, bass, drum and "other instruments" - 8 minutes to create 4 stems


SL - 6 stems Vocal, bass, drum, other, piano and guitar. - 8 minutes to create 6 stems


Any comments?
« Last Edit: December 06, 2024, 11:18:06 AM by phil_er_up »

Offline robgronotte

  • Trade Count: (0)
  • Taperssection Member
  • ***
  • Posts: 369
Re: A.I. audio stem software
« Reply #8 on: December 05, 2024, 06:58:18 PM »
Phil, have you tried the Spectralayers function of separation of different vocals?

I tried it once but didn't understand how I was supposed to "train" it to learn the main vocalist.
Was hoping it could cut out audience chat over the singing.

Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
Re: A.I. audio stem software
« Reply #9 on: December 06, 2024, 06:35:40 AM »
If I understand what you want to do correctly...then suggest doing the following procedure in SL.


Create vocal stem and open it in SL.

Select "unmix Multiple Voices" in the modules.

Use the cursor to highlight a 20 second piece or so of the main vocalist in the vocal stem.
Then in the "unmix Multiple Voices" window click "Register Voice" - SL creates "Voice 1" in the "unmix Multiple Voices" window.

Now you can select the second vocalist and do the same as above, and this will create a "Voice 2" for the second vocalist.

Now click "apply" in the  "unmix Multiple Voices" window and SL creates layers for "Voice 1", "Voice 2", "Non_voice" and  "Non-Un-mixed".

File > Export > Layers

Hope you can understand what I wrote.


SpectraLayers 11 vs Ultimate Vocal Remover:
« Last Edit: December 06, 2024, 11:23:37 AM by phil_er_up »

Offline robgronotte

  • Trade Count: (0)
  • Taperssection Member
  • ***
  • Posts: 369
Re: A.I. audio stem software
« Reply #10 on: December 06, 2024, 12:58:11 PM »
So you need to have a portion with each vocalist singing alone in order to use the function?
That seems not very useful, as you would rarely have that available for the backing vocals.

Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
Re: A.I. audio stem software
« Reply #11 on: December 06, 2024, 01:06:18 PM »
Yes you do.

There is "unmix chorus" too. That splits main singer from chorus.

Offline robgronotte

  • Trade Count: (0)
  • Taperssection Member
  • ***
  • Posts: 369
Re: A.I. audio stem software
« Reply #12 on: December 06, 2024, 02:26:54 PM »
How does that work?

Offline phil_er_up

  • Trade Count: (9)
  • Taperssection All-Star
  • ****
  • Posts: 1301
Re: A.I. audio stem software
« Reply #13 on: December 07, 2024, 07:18:09 AM »
From what I have seen so far:

unmix Multiple Voices - separates 2 or more vocalists so there is one vocal layer for each singer.

Unmix chorus - Output layers lead and backup are created. Seems this would be a good choice if you had one main singer and then harmony from the rest of the band.


unmix chorus:

60 min SBD set 24/96 wav - 53m to process with unmix chorus.

Took a previous vocal stem with lead singer and harmony by the band and
ran it through unmix chorus. Output layers lead and backup were created.

Lead vocal came out pretty well, though sometimes it would confuse
backup and lead and one would bleed into the other.


The wetransfer links above end tomorrow. Hope someone DL'ed them and has some comments. Gives a good idea of what this software can do.

« Last Edit: December 07, 2024, 07:26:14 AM by phil_er_up »

Offline ballerusk

  • Trade Count: (3)
  • Taperssection Member
  • ***
  • Posts: 257
  • Soft spot for the sweet spot
Re: A.I. audio stem software
« Reply #14 on: December 18, 2024, 04:06:15 PM »
Hope they have a new year sale or something on Spectralayers. That unmix crowd feature seems golden for audience mumbling chatter.
Schoeps MK41s > Schoeps CMRs > Naiant Tinybox > Sony PCM-M10

