Taperssection.com
Gear / Technical Help => Recording Gear => Topic started by: TheJez on September 23, 2024, 05:37:30 AM
-
Summary: Could the Roland R-07 be a multi-ADC 32-bit fp recorder in disguise?
Sorry guys, I must admit: The title of this message is a bit of a click bait. At first I wanted to put a question mark at the end, but then I thought: “Anybody seeing that will think it’s a post from someone too lazy to read the spec sheet of the R-07 and won’t even bother to further read this post.” I am hoping to get interest from people knowing/owning the Roland R-07 and people interested in 32-bit fp recording…
Some background: For a little while now I have been looking for a replacement for my Edirol R-09HR. After going through all the information on TS, I concluded that 32-bit fp multi-ADC is the way to go for my needs. So when someone on TS very kindly offered me a Roland R-07 for a very reasonable price, I still turned him down “because I want to go 32bit”… However, this offer made me think…
A couple of years ago I read about the R-07 and about its safety track, recorded at a lower volume, as a backup without distortion if the record level was set too high. That sounded like a very useful feature! On November 29, 2019, someone wrote on TS about this device: “I think the dual recording levels is far more useful than remote app for stealth conditions. I now go in with the levels pre-set and never look at the device, or the app, or worry about levels at all.” And also: “With the dual recording I just set it to 48 and then forget” Hmmm, sounds familiar? I’ve often seen very similar remarks while reading up on 32-bit multi-adc recording. Set-and-forget, how great would that be!
And now I started wondering about the signal path of this safety track on the R-07. So I went through all 39 pages on TS about the R-07 to see if there’s any info about it. All I found was someone claiming on December 19, 2018: “Metering and dual recording happen post-ADC.” Would it really be true that dual recording is completely post-ADC? Personally, I find that hard to believe when looking at the possible sources of clipping distortion due to too high a record level in this recorder. I guess there are two potential places where clipping could occur:
1) At the analog input stage ((amplified) input signal too hot to handle by the analog components)
2) In the ADC (signal coming into the ADC too hot, resulting in clipping samples)
(I can’t think of a probable, realistic cause of clipping post-adc in this device)
Now if the signal is clipping, either due to 1) or 2), then reducing the level post-ADC by e.g. 20dB and then storing that as a safety track will not solve the clipping! So it means there must be two ADCs: one for the normal track, one for the safety track. And possibly, there will even be two analog stages! This would result in the following signal paths:
(see attached picture)
There would be two identical paths, with only a difference in the applied gain in the input stage. For the normal track, the gain 'x' as set by the user is applied. As we know, in 2xWAV mode, this is a value between 42 and 100. For the safety track, a gain of ‘x-42’ is applied. (If I’m correct, in the latest firmware, the offset of the safety track can be selected from -6/-12/-20dB. I guess ‘-42’ would then match a 20dB reduction…)
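The argument that a post-ADC pad cannot rescue a clipped signal can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the converter is modelled as an ideal hard clipper (case 2 above only, no analog-stage overload), levels are normalized so that full scale is 1.0, and the names `adc` and `db_to_gain` are made up for this example.

```python
import math

FULL_SCALE = 1.0  # normalized converter full scale (0 dBFS), an assumption


def adc(x):
    """Idealized converter: anything beyond full scale is hard-clipped."""
    return max(-FULL_SCALE, min(FULL_SCALE, x))


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


# One cycle of a sine peaking 6 dB above full scale: too hot for the main path.
signal = [db_to_gain(6.0) * math.sin(2 * math.pi * n / 32) for n in range(32)]

# (a) Single ADC, safety track derived digitally AFTER conversion.
post_adc_safety = [adc(x) * db_to_gain(-20.0) for x in signal]

# (b) Separate path: pad applied BEFORE conversion, then its own ADC.
pre_adc_safety = [adc(x * db_to_gain(-20.0)) for x in signal]

peak_a = max(abs(s) for s in post_adc_safety)
peak_b = max(abs(s) for s in pre_adc_safety)
print(round(20 * math.log10(peak_a), 1))  # -20.0: the flat top is merely scaled down
print(round(20 * math.log10(peak_b), 1))  # -14.0: the true peak survives
```

In variant (a) the chopped-off tops are still chopped off, just 20 dB quieter. Only variant (b), with the pad ahead of its own converter, keeps the real waveform.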
Now if these assumptions about the signal path are correct, then we have a device very similar to the current 32-bit devices. The only thing missing is a piece of logic/signal processing that takes the samples coming out of the two 24-bit ADCs and combines them into a 32-bit float sample. I know this part is the real ‘heart’ of a 32-bit float multi-ADC recorder, but… is this something that really has to be done in real time within the device?? I’d say no, not necessarily… As long as you store the output of the two ADCs, you could combine these into 32-bit float anytime you want!
This missing piece of logic could very well be implemented as a software program running on a PC. The required algorithm doesn’t seem extremely complex, imho…
So what do you think? Could this potentially work? Where did I go wrong in my thinking process? Hope to hear your thoughts…
-
The 32-bit floating point recorders split the incoming signal into two (or more) gain-ranged streams which have different gains applied before conversion.
-
Exactly! But isn’t the R-07 doing the exact same thing, but then to get the normal file and the safety file?
-
No. The R07 doesn't split the signal into two parts. It sends the whole signal down both paths.
-
Then I think that, even after reading all the threads about 32 bit multi-adc recording, I still seem to miss some details about the technology behind it. Wouldn’t splitting the signal as you describe introduce all kinds of frequency components that are not part of the original signal? Isn’t the full signal fed into two adc’s (although at different levels) and isn’t then the output of the adc’s combined by taking e.g. a weighted average of the two samples, where the applied weight depends on the actual signal strength?
Hopefully you can elaborate a bit on the splitting mechanism you mentioned to help me understand…
-
The R07 doesn't have the 32 bit float gimmick, but the safety track is in the real world the same thing. 20db is a LOT of free headroom. And it doesn't have the theoretical problem people were complaining about last year, where there's noise at the transition point between the 2 adcs in the 32 bit recorders.
-
The R-07 is my go to for all stealth recordings now, and I think it might be my quote about “set and forget” which is what I do and have never had a failed recording with this unit.
It’s small, made of plastic, so easy to get into any venue whatever the security. I have the safety track set at -12dB. I have had the main tracks clipping on a couple of occasions, but found the safety tracks to be good.
I don’t even use a battery box these days, just run my CA mics straight in. I’ve just checked the PIP voltage and it’s 3.1v.
I had high hopes for the Deity PR-2 as a set and forget unit, but since it isn’t 32-bit stereo as promised the R-07 seems to be the best solution out there for me.
-
The R07 doesn't have the 32 bit float gimmick, but the safety track is in the real world the same thing. 20db is a LOT of free headroom. And it doesn't have the theoretical problem people were complaining about last year, where there's noise at the transition point between the 2 adcs in the 32 bit recorders.
Emphasis on theoretical.
-
The R07 doesn't have the 32 bit float gimmick, but the safety track is in the real world the same thing. 20db is a LOT of free headroom. And it doesn't have the theoretical problem people were complaining about last year, where there's noise at the transition point between the 2 adcs in the 32 bit recorders.
You might call it a gimmick, but for people who record shows where you get one shot at the recording, no re-do's, and often have no ability to set levels in advance, I'd call it an improvement or upgrade. You are entitled to your opinion though. It is not the same as a safety track at all, the safety track often has more self noise. I consider the Roland R05 and R07 2 excellent decks for recording, but they are not the same in any way as a 32Bit deck. I have not had any issues ever with transition noise on a 32Bit deck. I have been using the MixPre 6 II 32 Bit deck since 2019, no transition noise issues at all, and the 32Bit float market has grown from there. Necessary sometimes, and useful most of the time. It is in no way a step backwards or sideways. I still often record at 24Bit depending on the deck, venue and such, but I think slowly that will happen less and less as more new decks hit the market. I was happy to retire my 16 bit Nomad Jukebox II 15 or 20 years ago for my 24Bit Tascam HDP-2, also still a great deck and that in no way means I could not get great recordings with either of those decks. However 24Bit was an improvement and 24 Bit decks over time got smaller and easier to use. I started recording on a Concord cassette deck with an included pencil mic in 1971, so I have seen some change which I am always happy to embrace.
-
You might call it a gimmick, but for people who record shows where you get one shot at the recording, no re-do's, and often have no ability to set levels in advance, I'd call it an improvement or upgrade. You are entitled to your opinion though. It is not the same as a safety track at all, the safety track often has more self noise.
That's been my experience as well. The Roland r07's safety track was often very noisy, lots of hiss. It also couldn't handle line-level inputs without an attenuator. Give me a Zoom F3 with 32-bit any day.
-
Maybe I'm wrong, but it feels a bit easy to bash the safety track without giving any context. Was the record level set rather low in the first place? If so, then obviously the safety track will be noisier. And indeed, if the input signal is too hot for the analog input part, then both normal and safety track will likely distort. And yes, the F3 analog input stage can handle a hotter signal. But that's not my point.
I guess R07-owners will set the record levels in such a way that the main track will likely not clip. And if it does after all, they can use the safety track instead. Or maybe they start mixing the main track and the safety track, so only the clipping parts will be replaced with the safety track parts. Anyway, this sounds like a tedious job, and will result in different noise floors for the main track and safety track parts.
In my proposal, it doesn't matter much if the main track clips. It is even encouraged to make it clip a bit to fully profit from the dual-ADC of the R07! Eventually, for these clipping main track samples, we can use the safety track samples. Maybe the attached picture will help understanding my intention.
The R07 has two 24-bit ADCs. Each has a theoretical dynamic range of 144dB. Let's assume the safety track ADC will process the signal attenuated by 20dB. Then 'my algorithm' will look like this:
- Read a 24 bit sample from the main track file and the safety track file
- Convert the samples from both files to 32bit floating point. This is done to extend the possible dynamic range and to prevent introducing errors further down the line while doing math on the samples
- Amplify the safety track sample by 20dB, so it becomes 'equal' to the main track sample.
- If the main track sample < -124dB, the output sample will be the main track sample (Area A)
- Else if the main track sample is between 0dB and -124dB (Area B), we'll calculate a weighted output sample. E.g. when the main track sample is at -124dB, the output sample is (100% of main track sample + 0% of the amplified safety track sample). When the main track sample is just below 0dB (so just not clipped), then the output sample is (0% of main track sample + 100% of the amplified safety track sample). And halfway, when the main track sample is at -62dB, the output sample is (50% of main track sample + 50% of the amplified safety track sample), etc.
- Else (so the main track sample is clipped and the amplified safety track sample is >= 0dB (Area D)), use the amplified safety track sample as output
This way, the problem of manually combining main and safety tracks, and dealing with different noise floors for the two tracks is mitigated. And also we get plenty more headroom and don't have to worry too much about setting the record level right. Let it clip, I would say!
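The steps listed above could be sketched roughly as follows. This is a minimal illustration, not a tested merger: samples are assumed to be already converted to floats with full scale at 1.0, the pad is assumed to be exactly 20 dB, and the clip threshold `CLIP_LEVEL` and the function names are invented for the example.

```python
import math

SAFETY_PAD_DB = 20.0  # assumed pad of the safety track
LOWER_DB = -124.0     # below this level only the main track is used (Area A)
CLIP_LEVEL = 0.999    # assumed threshold for treating a main sample as clipped


def to_db(x):
    """Level of one sample in dB relative to full scale (1.0)."""
    return 20.0 * math.log10(abs(x)) if x != 0.0 else -math.inf


def merge_sample(main, safety):
    """Merge one main/safety sample pair into one float output sample."""
    restored = safety * 10.0 ** (SAFETY_PAD_DB / 20.0)  # undo the pad
    if abs(main) >= CLIP_LEVEL:
        return restored  # main track clipped: safety track only
    level = to_db(main)
    if level < LOWER_DB:
        return main  # Area A: main track only
    # Area B: weight grows linearly in dB, 0 at -124 dB up to 1 at 0 dB.
    w = (level - LOWER_DB) / (0.0 - LOWER_DB)
    return (1.0 - w) * main + w * restored


def merge(main_track, safety_track):
    """Merge two equally long tracks sample by sample."""
    return [merge_sample(m, s) for m, s in zip(main_track, safety_track)]
```

Note that the output can exceed 1.0 wherever the main track clipped, which is why the result has to be stored as floating point rather than 24-bit integer.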
I guess there could be some optimizations, e.g. automatically calculating the true level difference between the two tracks (as it won't be exactly 20.0000000 dB), or changing the size of the area where the weighted value is calculated to more realistic boundaries instead of the theoretical boundaries. E.g. I can imagine that using the recorder noise floor as lower boundary of the 'Area B' would make more sense.
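One way the true level difference might be estimated is a least-squares ratio over the samples where the main track is neither clipped nor buried in noise. The sketch below assumes both tracks are sample-aligned, share the same polarity and have no DC offset; the thresholds and the name `estimate_pad_db` are placeholders chosen for the example.

```python
import math


def estimate_pad_db(main, safety, clip_level=0.98, floor=0.01):
    """Estimate the main/safety level difference in dB.

    Only samples where the main track lies between `floor` and `clip_level`
    (full scale = 1.0) are used, so clipped and very quiet samples do not
    skew the estimate.
    """
    num = den = 0.0
    for m, s in zip(main, safety):
        if floor <= abs(m) < clip_level:
            num += m * s
            den += s * s
    if den == 0.0 or num <= 0.0:
        raise ValueError("no usable samples, or tracks have opposite polarity")
    # Least-squares gain g that minimizes sum((m - g * s) ** 2).
    return 20.0 * math.log10(num / den)
```

For a nominal -20 dB safety setting the result should come out close to 20, and the measured value could then replace the fixed 20 dB when amplifying the safety track.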
Is this still a stupid idea, or is it making sense? I guess the F3 and friends are using similar algorithms to combine the output of the ADC's, but they do it in real time instead of 'offline'...
-
Maybe I'm wrong, but it feels a bit easy to bash the safety track without giving any context.
Likewise, I think it's easy to praise a hypothetical while ignoring real world experience. The main track clipped in situations that had high dynamic range, so I'd default to the safety track, which was full of hiss. It was a mess of a recording that needed a lot of post work to get listenable. Glad to have the safety track, but I'd rather skip all that.
-
I didn't mean to disrespect anybody, and I'm sorry if I came across like that. High dynamic situations are a pain to deal with when it comes to setting the levels right. I have similar experiences with high dynamic shows (and no safety track feature on the recorder) where I was so scared to set the levels too high that I ended up with a recording with lots of noise in the quiet parts. I think in such a situation you could have benefited from a little piece of software as described that could automatically merge the two tracks into one in a smart way, instead of manually merging or having to opt for the safety track that is noisy in the quiet parts.
-
I think your idea is interesting, but it hinges on two points:
(1) Are the R-07 preamps as good as the Zoom F3's? Your idea is all about optimising gain staging with the R-07, but your results will still be worse if the F3 provides cleaner gain. You could use a cleaner external pre-amp like most of us have done in the past, but then you add to bulk.
(2) How much work would it take to write a piece of software that accomplished what you're proposing automatically? Because doing it all manually would be a huge pain - not worth it at all unless you absolutely had to. If you are willing to actually write the software, though, it's a good idea!
-
For me, I enjoy that we often have varying opinions on just about everything. That is good, it helps us learn and grow and figure our own way.
This entire thread is interesting. I think the R-07 (and R05 too) are some of the best portable 24Bit decks on the market. Personally, I only used the safety track once, and I did not care for it, but it's a great feature for many in many instances.
I also will still record at 24 bit when appropriate for me. Some >:D situations especially. That is happening less and less, because more small 32bit decks are hitting the market and I really like not worrying about levels. When I tape live shows, I usually do not know what the sound level will be until the first notes are played. If I am off, can I fix it? Yes, but it is very nice to not even have to worry.
I am not a "spec" guy, I am an "ears" guy. I like what I like and pretty much respect anyone's opinion, but I will continue to evolve at my own pace. I started in 1971, and still learn all the time.
As I said, this is an interesting thread that shows how we differ and how we are alike.
Do I think an R-07 can become 32Bit? I do not, but I would be happy to be proved wrong.
-
(1) Are the R-07 preamps as good as the Zoom F3's? Your idea is all about optimising gain staging with the R-07, but your results will still be worse if the F3 provides cleaner gain. You could use a cleaner external pre-amp like most of us have done in the past, but then you add to bulk.
Obviously the F3 preamps are substantially better. And no matter what processing you do after the ADCs: it will not improve the preamps. However, what I think the software program would do is help you get the best out of your R07. Until now the 'safety track' has been considered merely as a backup in case you set the levels too high (probably helped by the name that was put on this track by the Roland marketing department). My idea is to consider it as 'just a 2nd ADC' like the one implemented on 32bit recorders such as the F3. Let the main track clip and unlock/profit from the extra 20dB headroom that's in your R07! It allows you to set the level higher and therefore beat the noise (I hope)! :)
(2) How much work would it take to write a piece of software that accomplished what you're proposing automatically? Because doing it all manually would be a huge pain - not worth it at all unless you absolutely had to. If you are willing to actually write the software, though, it's a good idea!
It doesn't seem like rocket science and the heavy work of reading/writing audio files can be outsourced to available libraries like libsndfile. It depends a bit on how fancy and fool-proof it has to be, and if it should run on Linux/Windows/Mac, command line or graphical user interface etc, but I think a couple of hours work would be enough to get a working prototype... It would be fun to make, I just started a bit already.
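As an illustration of the file-reading part, here is a sketch that uses Python's standard `wave` module instead of libsndfile, purely so the example has no external dependencies. It assumes the recorder writes plain 24-bit PCM WAV files, which I have not verified for the R-07; the function name `read_wav_24bit` is made up.

```python
import wave


def read_wav_24bit(path):
    """Read a 24-bit PCM WAV file.

    Returns (sample_rate, channels), where channels is one list of floats
    per channel, scaled so that full scale is 1.0.
    """
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 3:
            raise ValueError("expected 24-bit PCM")
        n_ch = wf.getnchannels()
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    channels = [[] for _ in range(n_ch)]
    # Samples are interleaved, 3 bytes each, little-endian, signed.
    for i in range(0, len(raw), 3):
        value = int.from_bytes(raw[i:i + 3], "little", signed=True)
        channels[(i // 3) % n_ch].append(value / 8388608.0)  # 2**23
    return rate, channels
```

Reading the main file and the safety file this way gives two sets of float channels that can then be compared or merged sample by sample.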
The availability of real world example input files would be very helpful for testing! The recording described by Roffels seems like the perfect example: High dynamics and a clipping main track.
@Roffels: Do you still happen to have the main and safety track of the recording you described? And if so, would you be willing to share these with me for testing? That would be awesome!
EDIT: Roffels replied that he unfortunately doesn't have these tracks anymore. So if there is any other R07 owner that has or can create a clipping main track and the accompanying safety track, please let me know. Even a clipping recording by your R07 in 2xWAV mode of your home stereo set would be sufficient!
-
The Roland R-07 deck allegedly has an "innovative Hybrid Limiter" function which is supposed to automatically combine a higher and lower level recording.
"Hybrid Limiter" is mentioned in most of their marketing material, but there is no official documentation as to what exactly it does and how it does it - at least not that I have encountered.
There is no mention of Hybrid Limiter in the owner's manual. The Roland FAQ quoted below refers to more information in the Reference Guide p. 16, but that does not mention anything about the Hybrid Limiter.
A MusicRadar.com review states the following:
Though the R-07 is a stereo recorder, you can set it to create a WAV and MP3 file simultaneously (Wave+MP3 mode), or use 2xWAV mode to record a second WAV file padded down by 20 dB. If there’s any clipping in the louder file, you can edit between them or, using Hybrid Limiting mode, have the R-07 switch between them automatically.
The R-07 online FAQ tells us when it operates, but not what it does:
The Hybrid Limiter function on the R-07 operates under the following conditions.
"Limiter": ON
"Input Level": 60 or higher
* The Hybrid Limiter function does not operate when Rec Mode is set to "2xWAV (Dual Level Recording)."
* When the Limiter is ON and the Input Level is 59 or lower, or whenever the Dual Level Recording is being engaged, the normal (non-Hybrid) limiter will be activated.
* For more details about the Limiter function and its settings, refer to the Reference Guide (p. 16), "Using Limiter or AGC."
I have not tried it out to see how it performs. I was curious about the function when I got my R-07, but when I learned from the FAQ that it is only active when input level was set to 60 or more, it became irrelevant as my deck is always set lower when recording music.
-
I'm back in town after a month away and just came across this thread. Interesting stuff.
@TheJez-
My take on all this is that you are spot on. You describe accurately how the new "32-bit recorders" operate, which might be more accurately described as those which "automatically switch between multiple ADCs" regardless of what output format they are instructed to write (32bit in most but not all cases). The most significant operational difference with the R-07 is that it simply stores the two resulting files that were recorded through parallel signal paths with different gain settings and does not do the "automatic switching" and combining part. As you speculate, the automatic switching and choice of file output format could be done afterward on the computer as a post-processing step. It would of course need the appropriate software routine to do that automatically to a similar standard of quality as achieved in the "32bit recorders". The "magic" of the process is probably rooted in the details of the switching comparator routine.
As I understand it. Zoom and Deity "32 bit recorders" auto-switch between two ADCs, so the scheme you describe using the two output files from the R-07 would be most similar to their approach. SoundDevices apparently switches between 3 separate ADCs. Same idea, different implementation particulars.
Note- Some folks mentioned the noise floor of the 'safety track'.. there will also be a more elevated noise floor in one or two of the multiple ADC paths through any "32 bit recorder". We just don't notice it because the switching between those paths is done automatically. When the SPL is low the channel with a higher noise floor is not in use, when SPL is high it masks the noise.
@Niels-
I also was unable to find any additional information on the "hybrid limiter" function of the R-07. Although it refers to p16 of the Reference Guide for more details, neither that page nor any other in the reference guide, user manual, or system update summary has any mention of the "hybrid limiter" or its method of operation.
However, this gives us some hint-
* The Hybrid Limiter function does not operate when Rec Mode is set to "2xWAV (Dual Level Recording)."
That makes it sound like it might be using both parallel signal paths (when they are not already being employed by Dual Level Recording), and automatically switching between them as required. That is indeed quite similar to what the "32bit recorders" that automatically switch between multiple ADCs are doing. It may represent an early version of something very similar. I suspect the difference is in the sophistication of the automatic switching comparator.
20 years ago I schemed about a method of recording automatic markers that noted when gain was manually adjusted during a recording and by how much. That information could then be used afterward to automatically find and "undo" those manual gain changes as a post process in the DAW. Additionally, that functionality could be combined with AGC to provide an end result similar to what the 32bit float recorders are doing today. Rather than using multiple signal paths through the recorder and combining them in real time or afterwards as a post process, it would instead mark and record all the automatic gain changes, so as to undo them later. It requires immensely less data storage that way, as the only additional thing being stored is the marker positions and gain values. Conceptually similar to what a compander does, or at a more fundamental and abstract level, how the representation of SACD data values works.
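To make the marker idea concrete, here is a small hypothetical sketch of the "undo" step. The marker format, a list of (sample index, new gain in dB) pairs, is invented for this example, and gain changes are treated as instantaneous, which a real recorder would not guarantee.

```python
def undo_gain_changes(samples, markers, initial_gain_db=0.0):
    """Remove recorded gain changes from a track.

    markers: list of (sample_index, new_gain_db) pairs sorted by index,
    each meaning "from this sample on, the recorder gain was new_gain_db".
    Returns the samples rescaled as if the gain had stayed at 0 dB.
    """
    out = []
    gain_db = initial_gain_db
    pending = list(markers)
    for i, x in enumerate(samples):
        # Apply every marker that takes effect at or before this sample.
        while pending and pending[0][0] <= i:
            gain_db = pending.pop(0)[1]
        out.append(x * 10.0 ** (-gain_db / 20.0))
    return out


# Gain was dropped by 20 dB at sample 2; undoing it restores a steady level.
print(undo_gain_changes([0.5, 0.5, 0.05, 0.05], [(2, -20.0)]))
```

As with the merged 2xWAV idea, the restored samples can exceed full scale, so the output would need a floating point format.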
-
@gutbucket
Sounds plausible.
I have been mildly curious about this Hybrid Limiter since I bought my R-07.
If I didn’t already have plans of getting a Tascam FR-AV2, I’d probably be more motivated to explore what Hybrid Limiter does in practical use.
-
Some interesting theories here.
I will have to do some testing with my R07. It’s still my goto device for stealth recording, with 3.1v of PIP I have always found it okay with my CA mics without a battery box. With the Bluetooth and safety track it’s the ultimate stealth recorder imho.
Following the discussion here I need to test this device with the limiter enabled rather than the safety track. Maybe it will make upgrading to a small 32-bit recorder unnecessary. I already have that in my Mixpre-3 if needed.
I’ve only had to use the safety track once, and because it was a louder than expected show (obviously) I didn’t notice any noise. I will have to check back to see if I still have the files for TheJez.
-
That would be great, thanks! In fact I would already be very happy if you maybe could record in 2xWAV mode your home stereo playing some music for a couple of minutes, and the recording level set so high it will clip the main track considerably. Internal mics would be fine... :headphones:
-
@gutbucket
Sounds plausible.
I have been mildly curious about this Hybrid Limiter since I bought my R-07.
If I didn’t already have plans of getting a Tascam FR-AV2, I’d probably be more motivated to explore what Hybrid Limiter does in practical use.
The term "hybrid" is pretty vague and can mean different things. FYI, the Zoom F8 [original version, not sure about the later N, and current N-Pro (32bit) version] includes options for standard limiter and "Hybrid Limiter", and also has a safety track option, which when engaged cuts total channel count by half as expected. In the F8's case the hybrid limiter doesn't switch between primary and safety tracks, but rather implements a scheme using a variable compression ratio and look ahead. Rather than setting an activation threshold above which the limiter is engaged, the user instead sets a "not to exceed" level and a pad is engaged to increase available headroom. As the signal approaches the set value the limiter starts to engage about 10db lower than the target value and ratio increases as the signal approaches the target value.
Although I prefer setting levels and not using limiters, the output of my mics can sometimes clip the mic-level inputs of the F8 when recording on stage near the drum kit. For that reason I started employing its Hybrid Limiter option. I was surprised to find it very transparent sounding with no obvious limiting artifacts, and also found that the pad did not cause an audible increase in the noise-floor, so I now leave it on all the time and either set levels to target peaking just below the presumed activation threshold, or turn gain all the way down for high SPL dynamic situations and just let it do its thing to keep peaks from exceeding 0dbFS.
I could leave the gain all the way down and just run it like that for all amplified shows - essentially the same way one runs a 32-bit recorder. I really only adjust gain out of habit, to achieve a good nominal starting level for editing, and as a back-door way to manage Mid/Side ratio, due to a playback quirk of the F8 not allowing ratio adjustment during playback. Not sure if I could do the same for classical material where the ambient noise-floor is considerably lower, but I generally use a different rig for that anyway.
I relate all this here because it is effectively yet another way of running without having to adjust gain.
Edit- I just realized Zoom actually terms the above limiter option "Advanced Limiter" rather than "Hybrid Limiter".. so, different name than Roland's, however its mode of operation and how I use the recorder is really what I wanted to share.
-
@gutbucket
Sounds plausible.
Plausible indeed. However, if the R07 is going to use both signal paths with this 'hybrid limiter' to combine them into a single 24 bit output, it has to lower the amplitude of the main track by 20dB to make some headroom to add data from the 'safety track path', as we can't go above 0dBFS with 24 bit. Or something like that... The fact that this feature is so poorly documented and hardly marketed kind of hints that the quality of the algorithms used may be subpar...
It seems Roland was on the right track towards multi-ADC recording, so it's a pity they didn't push forward with this (yet).
-
Just been testing my R07 at home and think I can hear evidence for some of the theories mentioned here.
I always have my R07 set at 2xWAV-24bit with the safety track set at -12dB. The limiter is set to ON.
With my input level set at 60 I recorded some music from my hi-fi with the volume turned up to achieve clipping. I then recorded the same music at the same volume, but with the safety track turned off. The R07 displays the clipping lights in both instances.
On listening back to the files I can hear the limiter kicking in on the first recording, but on the second recording, the one with no safety track it sounds flawless.
It could be placebo effect, but I don’t think so. Could this be the mythical “hybrid limiter” at work?
-
This hybrid limiter is interesting indeed, but would you dare to use it on a recording, knowing you'd be just fine using the tried 2xWAV method?
Anyway, back to the original subject: Trying to auto-merge the main and safety tracks of a 2xWAV recording...
Thanks to Adrian I now have an R-07 recording consisting of a clipping main track and the safety track that I can use to analyze and test my program.
It turns out that my initial ideas were a bit naive. The clipping behaves in a more complicated way than I expected, as already hinted in the SD patent description. It seems that very slight clipping makes a single sample or a few samples reach the max/min value of 24-bit samples, and that's it. These seem rather easy to recover from the safety track. However, more severe clipping results in a rather long period (e.g. ~120ms) of mutilated samples. The simple idea 'replace samples > clipping threshold with an amplified sample from the safety track' will not work here. This requires far more intelligent signal processing to accurately 'detect & patch' those periods.
Also I'm having trouble detecting the gain difference between the left/right main & safety channels accurately enough, and finding out what the DC component is and how much gain and DC vary for each channel. (I know, a varying DC component seems weird by definition, but it is not unrealistic due to temperature variations...)
The attached picture shows a part of such a 'major clipping' episode. Notice the chopped-off tops of the waves, which are not flat and are not at 0dB. I guess in these cases the pre-amp or the analog part of the ADC was overloaded and needs some time to recover.
I'm afraid making a good enough merger requires skills and time beyond what I can offer. It has been fun to work on this and I learned a lot about 32bit float, dual ADC etc. and ended up buying the nice new Tascam FR-AV2 :yack:
-
Good explanation of the problem, thanks
Practically, I would simply lower the .wav 2dB
Run RX8 declip
Normalize back to 0dB and call it done
You sweat too much.. ;D
-
@TheJez- Thanks for your exploration into the particulars of what it would take to make it work as a post-production process in a practical sense.
Just been testing my R07 at home and think I can hear evidence for some of the theories mentioned here.
I always have my R07 set at 2xWAV-24bit with the safety track set at -12dB. The limiter is set to ON.
With my input level set at 60 I recorded some music from my hi-fi with the volume turned up to achieve clipping. I then recorded the same music at the same volume, but with the safety track turned off. The R07 displays the clipping lights in both instances.
On listening back to the files I can hear the limiter kicking in on the first recording, but on the second recording, the one with no safety track it sounds flawless.
It could be placebo effect, but I don’t think so. Could this be the mythical “hybrid limiter” at work?
Seems that's the case. And seems to indicate the "hybrid limiter" does a better job than running a safety track, as it similarly accommodated the clipping yet without any perceivable limiting, and without the extra stored tracks to deal with or any additional work. That makes running the R-07 in that way as your regular mode of operation attractive.
At the very least, that test seems to indicate a good reason not to engage the standard limiter AND the safety track at the same time. If recording the safety track, I see no reason to also get limiting effects that use of the safety track will avoid along with outright clipping. It just remains to be seen whether the hybrid limiter alone cleanly accommodates the more severe clipping that a safety track WITHOUT the standard limiter engaged might be able to handle.
-
I have a gig on Saturday. I wasn’t going to record it but will now use it as a chance to experiment. I’m going to go in with my R07, no safety track and input level slightly higher than usual, and see what happens.
-
Just got back from the gig I mentioned above. DPA4061s > Battery Box> R07.
It was a loud gig, and I was close to the front.
I normally set the input level at 50 and use a -12dB safety track. Only once have I had to use the safety track.
On this occasion I set the input level at 60 to deliberately force clipping. First half of the show I employed the -12dB safety track with limiter, second half I disabled the safety track.
The main track in the first half exhibits signs of clipping, but I have to say the limiter does quite a good job, better than my M10, and the recording is very listenable. The safety track is the one I would work with though.
The second half recording is quite extraordinary though, flawless in fact. The limiter is definitely doing a better job when the safety track is not employed.
In future I think I will set the input level at 55 and disable the safety track. I’m buying the Tascam FR-AV2 but think the R07 is still the best stealth recorder.
-
No downside in keeping the safety track active is there? I would think just for peace of mind, if anything.
-
No downside in keeping the safety track active is there? I would think just for peace of mind, if anything.
The theory being that the R07 uses the second track for the “hybrid limiter” which is much better than the regular limiter. On the evidence of my gig last night I would be happy to rely on the improved limiter, rather than safety track, for most shows because it’s easier and the safety track does introduce some noise.
That said if it was a show I was desperate to get right I would probably revert to the safety track, or use the 32-bit float Tascam I am buying when it becomes available in the UK.
The R07 is just such a neat little device I can't imagine I will want the Tascam in my pocket too often.
-
Just a quick update regarding the initial intention of this topic and the developments so far:
My initial thought was to use a weighted average of the samples of the main track and the safety track, where a louder sample would get a higher percentage of the safety track sample and a softer sample would get a higher percentage of the main track sample.
I had quickly abandoned this idea, as it would mean that each output sample would get 'polluted' with data from the noisier safety track, even when not necessary at all.
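To make the drawback of that abandoned idea concrete, here is a minimal sketch of a per-sample weighted average. It assumes samples are floats normalised to [-1, 1] and that the gain between the tracks is already known; `blend_weighted` and its weighting rule are hypothetical illustrations, not the code used in the program.

```python
def blend_weighted(main, safety, gain):
    """Per-sample weighted average of main and amplified safety track.

    The weight of the safety track grows with loudness (hypothetical rule:
    weight = |amplified safety sample|, capped at 1.0).
    """
    out = []
    for m, s in zip(main, safety):
        s_amp = s * gain              # bring safety sample up to main level
        w = min(abs(s_amp), 1.0)      # 0.0 = only main, 1.0 = only safety
        out.append((1.0 - w) * m + w * s_amp)
    return out
```

Note that `w` is non-zero for every non-silent sample, so even quiet passages pick up a share of the noisier safety track, which is exactly the objection raised above.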
Being inspired by the SD patent I couldn't put this fully to rest and continued to work on a 'safety track merger'. My method is a bit different than the method described in the patent but addresses the same issues. I've made quite some progress so far. The program currently works in two steps: The 'Preparation Phase' and the 'Merge Phase'.
In the 'preparation phase'
- The program reads the main and safety tracks, converts them to 32bit floating point, throws away all info below 10Hz to get rid of any DC component (including the 'varying DC' also addressed in the SD patent).
- It counts the number of clipping samples in the main track (for left and right channel). If there are no clipping samples, there is no need to merge. A list of clipping samples (sample number and timestamp) can be exported to the clipboard for evaluation purposes.
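A rough sketch of what such a preparation phase could look like for one channel. The post does not say which filter is used, so the first-order high-pass below is an assumption (it rolls off gently around 10 Hz rather than cutting everything below it), and the function names are made up for illustration.

```python
import math

def find_clipped(samples, bits=24):
    """Indices of integer PCM samples sitting at the min or max code."""
    hi = 2 ** (bits - 1) - 1
    lo = -(2 ** (bits - 1))
    return [i for i, x in enumerate(samples) if x >= hi or x <= lo]

def to_float(samples, bits=24):
    """Scale integer PCM samples to floats in roughly [-1.0, 1.0)."""
    scale = float(2 ** (bits - 1))
    return [x / scale for x in samples]

def highpass(x, sample_rate, cutoff_hz=10.0):
    """First-order high-pass (DC blocker); assumed filter, not the author's."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    a = rc / (rc + 1.0 / sample_rate)
    y = []
    for i, v in enumerate(x):
        if i == 0:
            y.append(v)
        else:
            y.append(a * (y[-1] + v - x[i - 1]))
    return y
```

Clipping is counted on the raw integer samples, before filtering, because the filter moves the flat tops away from the full-scale codes. A timestamp for the exported list is simply `index / sample_rate`.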
In the 'merge phase':
- A 'window' (of currently 64 samples) slides sample-by-sample over the main and safety track.
- For each window, the gain factor between main and safety track is determined as accurately as possible and the safety track is amplified by this gain factor to make it match the main track window.
- A 'difference' between main and amplified safety window is calculated. For a non-clipping window, this difference will be small. For a window including clipping (or other artifacts e.g. caused by analog stage overloading), the difference will be bigger.
- When main and safety windows are 'similar enough', the main window is added to the output. If they are 'too different' (i.e. there is clipping or other artifacts), the safety window is added to the output.
- Because of the 'sliding window' principle, a slow (64 sample) cross-fade between main and safety track is applied when switching between the two. So there is no hard transition from sample-to-sample. This should mitigate transition effects caused by (hopefully very low) remaining inaccuracies of the gain factor calculation.
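The steps above can be sketched roughly as follows for one channel of float samples. Several details are assumptions, since the post does not spell them out: the gain is a least-squares fit per window, the 'difference' is a relative RMS error, the 5% threshold is arbitrary, and the cross-fade comes from letting every window that covers a sample vote.

```python
import math

def window_gain(main_w, safety_w):
    """Least-squares gain g minimising sum((m - g*s)^2) over one window."""
    den = sum(s * s for s in safety_w)
    if den == 0.0:
        return None                      # silent window, nothing to fit
    return sum(m * s for m, s in zip(main_w, safety_w)) / den

def relative_difference(main_w, safety_w, gain):
    """RMS of (main - gain*safety), relative to the RMS of main."""
    err = sum((m - gain * s) ** 2 for m, s in zip(main_w, safety_w))
    ref = sum(m * m for m in main_w)
    return math.sqrt(err / ref) if ref > 0.0 else 0.0

def merge(main, safety, window=64, threshold=0.05):
    n = len(main)
    cover = [0] * n        # number of windows covering each sample
    votes = [0] * n        # how many of those windows chose the safety track
    gain_sum = [0.0] * n   # sum of the gains of the covering windows
    for start in range(0, n - window + 1):
        mw = main[start:start + window]
        sw = safety[start:start + window]
        g = window_gain(mw, sw)
        if g is None:
            continue
        use_safety = relative_difference(mw, sw, g) > threshold
        for i in range(start, start + window):
            cover[i] += 1
            gain_sum[i] += g
            votes[i] += use_safety       # True counts as 1
    out = []
    for i in range(n):
        if cover[i] == 0:
            out.append(main[i])
            continue
        w = votes[i] / cover[i]          # ramps 0..1 over one window length
        g = gain_sum[i] / cover[i]
        out.append((1.0 - w) * main[i] + w * g * safety[i])
    return out
```

The output is float, so recovered peaks can exceed full scale. A per-window gain fitted on a window that contains clipping is biased low, so this sketch shares the weakness of any window-by-window gain estimate under heavy clipping.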
All is working, except for the part where the main or safety window are added to the output. Still working on that...
Thanks for your interest :) I'll keep you posted...
-
Clever. Like that you've set it up in a relatively simple, well managed way. Curious about the repercussions of detection window length "tuning".
Cool that you've continued work on this and thanks for the update!
-
Yes! Merging is working! I completely did not optimize the program for speed, but the numbers are not too bad:
- Preparation is very fast, ~10s for 30 minutes of audio
- Merging is a bit slower, roughly 4x faster than 'realtime'. So ~7.5 minutes for 30 minutes of audio.
Both are substantially slower when 'visualize' is enabled. That option only adds nice dancing lines to the user interface, so it is not needed.
Now I just need to clean up the code a bit (lots of leftovers from debugging/testing).
Below are a screenshot of the user interface at the moment and a 32bit floating point merged R-07 recording made from the files so kindly provided by Adrian. Notice the samples going above 0 dB! These are the parts where the main track was clipping. I did not add a normalization function, as your DAW can do that very well. The file now also clearly shows in which parts the safety track was used (samples > 0 dB), so I thought it would be better to keep it this way. The program also creates a log file of the merging, to show for which parts the safety track was used, e.g.:
...
00:00:25.196870: Right channel started fading in safety channel
00:00:25.197777: Left channel started fading in safety channel
00:00:25.197800: Left channel started fading out safety channel
00:00:25.198276: Right channel started fading out safety channel
00:00:25.608480: Left channel started fading in safety channel
00:00:25.609387: Right channel started fading in safety channel
00:00:25.610022: Left channel started fading out safety channel
00:00:25.610952: Right channel started fading out safety channel
00:00:25.896054: Right channel started fading in safety channel
...
-
That is awesome! I wonder if it would work with the Lectrosonics SPDR safety track too
-
That is awesome! I wonder if it would work with the Lectrosonics SPDR safety track too
There is no reason why it shouldn't work for another device. Currently, the only conditions are:
- Input files shall be stereo 16 or 24 bit wav files
- They shall be equal in length (up to the sample) and perfectly in sync
- They shall have identical sample rate
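For anyone wanting to pre-check a pair of files against these conditions, here is a small sketch using Python's standard `wave` module. `check_pair` is a hypothetical helper, not part of the program; it can only verify what the WAV headers state, so 'perfectly in sync' still has to be guaranteed by the recorder.

```python
import wave

def check_pair(main_path, safety_path):
    """Return a list of problems; an empty list means the headers look fine."""
    problems = []
    with wave.open(main_path, "rb") as a, wave.open(safety_path, "rb") as b:
        for name, w in (("main", a), ("safety", b)):
            if w.getnchannels() != 2:
                problems.append(f"{name}: not stereo")
            if w.getsampwidth() not in (2, 3):   # 2 bytes = 16 bit, 3 = 24 bit
                problems.append(f"{name}: not 16 or 24 bit")
        if a.getframerate() != b.getframerate():
            problems.append("sample rates differ")
        if a.getnframes() != b.getnframes():
            problems.append("lengths differ")
    return problems
```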
As you are asking about this particular recorder, I assume you have one of those. It would be awesome if you would be able to record 'anything' with it (e.g. your stereo playing some music for a few minutes) with the record level just a bit too high to make the main track clip. If you could provide the files I could do some testing. Or if I would provide the program, you could do the testing. However, if for some reason some tweaking of the program is needed, I guess it would be more convenient if I would get the files... :)
-
Yeah, I have one. I will record something this afternoon.
-
Yeah, I have one. I will record something this afternoon.
Great! No hurry…
-
I actually recorded another gig last week where the main track clipped if you are interested.
-
I actually recorded another gig last week where the main track clipped if you are interested.
Yeah, that would be great! The more the better…
Are you interested in getting the merged file of your previous recording?
-
https://drive.proton.me/urls/AYQMC4TW4W#BdVrBJKwCWxu
4015gs -> SPDR 48/24 with safety (all 4 channels in 1 file)
I tend to run very conservatively so I haven't needed the safety track.
-
Yeah, that would be great! The more the better…
Are you interested in getting the merged file of your previous recording?
Is the merged track actually an improvement over the safety track?
If you can send the link to your file sharing I’ll send a couple of files.
-
So I'd think it'd work the same way it's supposed to work on dual adc converters. You get the advantage of being able to run loud to maximize SNR, while protecting yourself from overage. The only time you have the increased noise floor is at loud points where the noise will be far below the actual signal level.
-
Is the merged track actually an improvement over the safety track?
If you can send the link to your file sharing I’ll send a couple of files.
To be honest: For this particular recording, there is no audible improvement over the safety file. Even the quietest parts (crowd noise between the songs) are well above the recorder self noise, even on the safety track.
I guess most benefit can be reached for recordings with very high dynamics, so very quiet and very loud parts.
Maybe you could turn the files to flac and zip them, then upload to wetransfer.com and pm or mail the link to me. They allow sharing files up to 2GB in size. I think you may still have my email address from the last time…
-
So I'd think it'd work the same way it's supposed to work on dual adc converters. You get the advantage of being able to run loud to maximize SNR, while protecting yourself from overage. The only time you have the increased noise floor is at loud points where the noise will be far below the actual signal level.
Yes, exactly. So we’ve turned this device into a 32bit floating point multi-adc recorder, with the only difference compared to others that the combining is done in post instead of in realtime… And you need to set the record level to a sensible value, although now we have 20dB more headroom to play with.
-
Just checked my files from last week’s gig and they’re a bit confusing. The main track is not clipping, but it is distorting, whilst the safety track is okay. Maybe the limiter is kicking in but not doing a very good job. I don’t think these files will be any use for experimentation. Maybe I’ll stand in front of a loud hi-fi when I get the chance, with limiters switched off.
-
Just checked my files from last week’s gig and they’re a bit confusing. The main track is not clipping, but it is distorting, whilst the safety track is okay. Maybe the limiter is kicking in but not doing a very good job. I don’t think these files will be any use for experimentation. Maybe I’ll stand in front of a loud hi-fi when I get the chance, with limiters switched off.
When you mention distortion without clipping, my first thought would be 'overloading the analog gain stage'. (Of course I don't know your gear, but could this be a case of relatively sensitive mics combined with loud audio?)
Then maybe the analog path towards the safety track was protected from overloading by the -6/-12/-20dB attenuation.
In fact it may be interesting for experimentation, as the algorithm isn't specifically looking for clipping, but looking for 'louder parts where there is a relevant difference between the main track and the amplified safety track'. I expect that distorted parts of the main track would trigger the algorithm to use the safety track for those parts. However, practically, in this case the safety track is probably loud enough to be used as end result of your recording (unless there are very big dynamic differences in the recording).
So I'm open to and interested in testing this recording anyway, but I'd fully understand if you don't want to bother...
When you make a recording of your hifi, it doesn't necessarily need to be set very loud (so you can keep the neighbors happy ;)). Just put the mics close to the speaker and crank up the record level enough to make the main track clip!
-
https://drive.proton.me/urls/AYQMC4TW4W#BdVrBJKwCWxu
4015gs -> SPDR 48/24 with safety (all 4 channels in 1 file)
I tend to run very conservatively so I haven't needed the safety track.
Thanks grawk! I've manually split the 4-track recording into two stereo files so my program can open them. I could of course make a special provision in the program to handle this kind of 4-track file. Is the safety file always stored like this in this particular device, or can you choose between 2xstereo or 1x4-track?
And thanks for opening the can of soda (or whatever it is ;D) at 17 seconds, as it adds some nice dynamics to the recording!
I will let you know the result, probably in the coming days...
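Splitting such a 4-track file does not have to be done by hand. A minimal sketch with Python's standard `wave` module follows. The channel order (channels 1-2 = main, 3-4 = safety) is an assumption to verify against the actual file, and multichannel WAVs with a WAVE_FORMAT_EXTENSIBLE header may only open on newer Python versions (3.12 or later, as far as I know).

```python
import wave

def split_four_channel(src, main_dst, safety_dst):
    """Split a 4-channel PCM WAV into two stereo WAVs (ch 1-2 and ch 3-4)."""
    with wave.open(src, "rb") as w:
        if w.getnchannels() != 4:
            raise ValueError("expected a 4-channel file")
        width = w.getsampwidth()
        rate = w.getframerate()
        data = w.readframes(w.getnframes())
    frame = 4 * width          # bytes per 4-channel frame
    half = 2 * width           # bytes per stereo pair
    main = bytearray()
    safety = bytearray()
    for off in range(0, len(data), frame):
        main += data[off:off + half]
        safety += data[off + half:off + frame]
    for path, payload in ((main_dst, main), (safety_dst, safety)):
        with wave.open(path, "wb") as out:
            out.setnchannels(2)
            out.setsampwidth(width)
            out.setframerate(rate)
            out.writeframes(bytes(payload))
```

Because both outputs come from the same frames, they are equal in length and sample-aligned by construction.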
-
I couldn't resist a quick check. The merged file still has quite some distortion leaking through from the main file. The main file is so distorted that there are many runs of consecutive clipping samples that are longer than the window of 64 samples. In these situations it is not possible to determine the gain difference, and right now the algorithm uses the main track in that situation, which is plain stupid! I will need to update that. I guess I should use the most recently calculated gain instead, which was calculated not too long ago in the stream...
This proves the value of multiple test inputs! Thanks a bunch!
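The 'use the most recently calculated gain' idea could look like the sketch below. It assumes the per-window gains are collected in a list, with `None` marking windows where no gain could be determined; `hold_last_gain` is a made-up name for illustration.

```python
def hold_last_gain(gains):
    """Replace undeterminable gains (None) with the most recent valid gain."""
    held = []
    last = None
    for g in gains:
        if g is not None:
            last = g
        held.append(last)
    # Gaps before the first valid gain are back-filled with that first gain.
    first = next((g for g in held if g is not None), None)
    return [first if g is None else g for g in held]
```

For example, `hold_last_gain([None, 7.8, None, None, 7.9])` gives `[7.8, 7.8, 7.8, 7.8, 7.9]`.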
-
I was probably driving the preamps too hard. I was just recording music in my car from the radio.
-
Very cool work you're doing!
Just out of curiosity: I'm assuming that you're calculating the gain difference that way out of an abundance of caution? Because I wouldn't expect gain to drift wildly - and where it does, I would think it'd be pretty transparent to most ears? Using the last calculated gain doesn't sound like it would cause any issues (especially compared to eyeballing it and doing it manually).
-
Very cool work you're doing!
Thanks a lot! It's very interesting to work on, although my spare time to do this is limited. As I don't even own a recorder that produces a safety track, it is nice to see there is some interest in this project :)
Just out of curiosity: I'm assuming that you're calculating the gain difference that way out of an abundance of caution? Because I wouldn't expect gain to drift wildly - and where it does, I would think it'd be pretty transparent to most ears? Using the last calculated gain doesn't sound like it would cause any issues (especially compared to eyeballing it and doing it manually).
I'd like to say: "Scientifically it was proven that this is the best way to do it" but that would be complete BS. It is more of a practical reason: Now I can jump to any position in the file, read the samples to fill the window and verify what the algorithm makes of it. It is a 'stateless algorithm', so to say, not depending on what's been happening before. Very handy for testing & debugging (especially as the preparation phase produces a list of locations where the main track is very likely clipping), but not so much for computing efficiency.
Maybe I will change the gain determination in a speed optimization phase, if there ever is going to be one. Or if my current way won't work well enough under certain conditions. We'll see! Work-in-progress...
-
https://drive.proton.me/urls/6HB94T3XG0#zN9fmeuxhO9i
The SPDR seems to have a permanent limiter, so it may not trigger your merge algorithm.
This should be a more realistic test
-
https://drive.proton.me/urls/6HB94T3XG0#zN9fmeuxhO9i
The SPDR seems to have a permanent limiter, so it may not trigger your merge algorithm.
This should be a more realistic test
Thanks, I got the file! More realistic indeed :bigsmile:.
What makes you think there is a permanent limiter? Would that be active on main and/or safety track? If it is, then it's not doing a very good job ;)...
See screenshot of a piece of the file. It shows 'classic' clipping at the top of the wave, and another kind of distortion, likely caused by analog stage overloading at the bottom. Either way, my algorithm should determine that there's a big difference here between main and amplified safety track, and should have faded in the safety track for this part...
-
I do need to re-think the gain difference calculation... :(
The recording I got from grawk contains periods with some very severe clipping, and during these periods the complete main track is rather damaged, even between the clipping samples. This results in very wrong calculated gain difference for some time. (E.g. factor ~12 instead of the expected factor ~7.78)
The algorithm does nicely conclude to use the safety track for these parts, but is using the incorrect gain to amplify the safety track, which will result in undesired behavior of the merged track...
Work in progress!! :hmmm: I think I might be heading to some running average, determined over periods where there is no clipping or clipping-related artifacts. (But how to determine that? I need a gain to amplify the safety track so I can compare main and safety to see if there are such artifacts... I might run into a chicken/egg situation here...)
-
What about an option of specifying the safety margin and whenever you detect clipping just replace it with the safety track + that fixed difference?
-
What about an option of specifying the safety margin and whenever you detect clipping just replace it with the safety track + that fixed difference?
I really don't want the user to have to specify anything, especially since the user will not know the difference (in dB) exactly. When you e.g. select -12dB difference in your recorder, you'll get something like -12.023 or -11.984 or whatever, but certainly not 12.000000dB. And there will be different offsets for left and right channel and different offsets depending on temperature... It's not just for fun that SD patented mechanisms to deal with such variations, and I'm sure all other multi-ADC recorders do similarly smart things to combine the output of the ADCs. If the difference used to amplify the safety tracks deviates too much from 'reality', it might introduce audible artifacts around switching moments, which is what I'd really like to prevent. The gradual cross-fade between main and safety tracks already helps a lot there, but still I'd like to calculate the difference as accurately as possible.
-
I've come up with an algorithm that determines a kind of 'average factor' for left and right in blocks of 1 minute of audio. Here's the result of Grawk's realistic sample of 4 minutes of a SPDR recorder:
=== Block 0 AvgAfterCleanupLeft= 7.78647 ( -17.8268 dB ) AvgAfterCleanupRight= 7.8069 ( -17.8496 dB)
=== Block 1 AvgAfterCleanupLeft= 7.78781 ( -17.8283 dB ) AvgAfterCleanupRight= 7.80856 ( -17.8514 dB)
=== Block 2 AvgAfterCleanupLeft= 7.79205 ( -17.833 dB ) AvgAfterCleanupRight= 7.80805 ( -17.8509 dB)
=== Block 3 AvgAfterCleanupLeft= 7.7921 ( -17.8331 dB ) AvgAfterCleanupRight= 7.80762 ( -17.8504 dB)
The algorithm works like this:
1. Read 1 minute of audio from main and safety track
2. Go through each 'frame' (= set of left/right main sample and left/right safety sample). Process left and right separately in the next steps
3. If both main & safety samples are positive or both are negative, and if both have 'substantial magnitude' (> -80dBFS, to prevent using very noisy samples) and if neither is 'too loud' (< -3dBFS, to prevent using clipping samples and damaged samples from analog overloading):
- Calculate factor main/safety. This is 'the factor' for this single set of samples
- Truncate the factor to 4 digits behind the decimal separator.
- Put the value in a map, counting the number of occurrences per value
4. Then, using the value that occurred most in the minute of audio:
- Calculate a weighted average of all values in the map that are within 1% of the value that occurred most. (Using 'value * number of occurrences' of all map entries within that 1% window) This will be 'the factor' for this channel for this minute!
5. Process next minute of audio
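Steps 2 to 4 can be sketched for one channel and one block as follows, assuming float samples normalised to [-1, 1]. `block_factor` and `factor_to_db` are illustrative names, not the program's own.

```python
import math
from collections import Counter

def factor_to_db(factor):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(factor)

def block_factor(main, safety, low_dbfs=-80.0, high_dbfs=-3.0):
    """Estimate the main/safety gain factor for one block of one channel."""
    lo = 10.0 ** (low_dbfs / 20.0)
    hi = 10.0 ** (high_dbfs / 20.0)
    hist = Counter()
    for m, s in zip(main, safety):
        if m * s <= 0.0:
            continue                      # opposite signs, or a zero sample
        if not (lo < abs(m) < hi and lo < abs(s) < hi):
            continue                      # too noisy or too loud to trust
        factor = m / s
        hist[math.trunc(factor * 10000) / 10000] += 1   # 4 decimals
    if not hist:
        return None                       # no usable sample in this block
    mode = max(hist, key=hist.get)        # factor that occurred most often
    total = 0.0
    weight = 0
    for value, count in hist.items():
        if abs(value - mode) <= 0.01 * mode:
            total += value * count
            weight += count
    return total / weight
```

As a cross-check of the dB figures in the log, `factor_to_db(7.78647)` is about 17.83 dB, matching the magnitude printed for block 0 (the log shows it as the negative level of the safety track relative to main).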
I'm sure there will be better and more efficient ways to achieve this, but I think this should result in a reasonably accurate factor between main and safety track for that minute of audio.
Next, I can calculate 'the factor' for any position within the file by doing linear interpolation of the 'factor per minute'. I'm curious what the results will be of this, but first I need to mow the lawn, hopefully for the last time this year before the winter kicks in ;D
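A sketch of that interpolation, under the assumption that each block's factor belongs to the middle of its minute and that positions before the first or after the last block centre simply reuse the nearest value. `factor_at` is a hypothetical helper.

```python
def factor_at(position_s, block_factors, block_s=60.0):
    """Linearly interpolate the per-block factors at a position in seconds."""
    if not block_factors:
        raise ValueError("no block factors available")
    x = position_s / block_s - 0.5        # 0.0 = centre of the first block
    last = len(block_factors) - 1
    if x <= 0.0:
        return block_factors[0]
    if x >= last:
        return block_factors[last]
    i = int(x)
    frac = x - i
    return block_factors[i] + frac * (block_factors[i + 1] - block_factors[i])
```

With the left-channel values from the log above, `factor_at(60.0, [7.78647, 7.78781, 7.79205, 7.7921])` lands halfway between the first two blocks, at about 7.78714.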
-
Yes! I'm perfectly happy with the new gain determination algorithm :headphones:! The difference between main and amplified safety track is now much smaller in non-clipping parts, making the clipping/distorted parts stand out perfectly. I noticed that the clipping recording from the SPDR device gets quite some 'DC offset' in the period directly after a clipping event, which gradually declines back to 0. These offsets ruined my previous window-by-window gain algorithm, but the 'minute-based algorithm' evidently keeps such parts out of the gain calculation.
I'm not sure where to go from here. The program is rather slow (~33% of realtime, so 30 minutes of audio takes ~10 minutes processing) and there is very much room for speed improvements, but I'm not sure if it's worth the effort... It's also not very monkey-proof. One needs to press the buttons in the right order, otherwise it may crash :really_sucks:
Would anybody be interested in getting this program at all? It's Windows-only at the moment...
-
I’m quite in awe of what TheJez has done here. I’ve just been listening to the merged file of my clipping main track and safety track and it’s seamless.
Although I have just acquired a Zoom H1 XLR 32-bit recorder the Roland R-07 will remain my stealth recorder of choice. The form factor is great, I can even blag it’s an mp3 player, and the 3.1v of PIP voltage is just enough to power my Church Audio and DPA mics. I know many will say it’s not, but I have never had an issue. I tend not to use a battery box these days just for added simplicity.
I’m guessing I’m the perfect person to try and utilise the program, but given the increasing number of 32-bit recorders appearing it might be a market of one!
-
Sorry I'm only joining now. TheJez, thank you for the very interesting posts. I like it a lot. You have my respect, I really like what you came up with and implemented.
-
I'd intended to post last Friday but got pulled away, and it now sounds like progress has been made and you've got the detection algorithm well tuned. Congrats!
Since then..
Work in progress!! :hmmm: I think I might be heading to some running average, determined over periods where there is no clipping or clipping-related artifacts. (But how to determine that? I need a gain to amplify the safety track so I can compare main and safety to see if there are such artifacts... I might run into a chicken/egg situation here...)
Lurking in the back of my mind was the idea of using phase correlation rather than a weighted average level comparison as the "difference factor", but if the detector is now working well there is no need to explore alternate approaches.
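For reference, one simple way to read the correlation idea is a normalised (Pearson) correlation per window. This is only a sketch of that interpretation, not something tested in this thread. Its appeal is that the result does not depend on the gain between the tracks, so it would not need the gain to be known first.

```python
import math

def correlation(a, b):
    """Pearson correlation of two equal-length windows (1.0 = same shape)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0                        # a constant window has no shape
    return cov / math.sqrt(var_a * var_b)
```

A clean window and its attenuated copy correlate at essentially 1.0, while a window whose tops are clipped scores lower. Whether that drop is large enough to serve as a detector for mild clipping would need testing.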
-
[snip..] Would anybody be interested in getting this program at all? It's Windows-only at the moment...
tl;dr- some of this gets OT, so for those who aren't in to it, please ignore.
I'll throw out a couple of potential alternate use cases which will apply to other tapers generally, and also an oddball concept of mine I've thought about for years, which I don't expect this program to evolve into, but your development of it has gotten me thinking about it again and I'd love to discuss it more in depth with anyone interested here or elsewhere (happy to take it to another thread, PM or offsite).
Relatively common taper scenario 1-
I've not recorded using a secondary safety track on the same recorder myself, which is the intent of this routine. But I have recorded a safety backup to a secondary recorder at times, as do other tapers. Most of the time that safety recording isn't needed, in the same way that a lower-level safety track made on the same recorder isn't, and when it is most folks will simply discard the primary recording and use that secondary safety recording. However, this program could potentially auto switch between the two, using the primary recording wherever possible. The potential problem with doing that would seem to be achieving a sufficient degree of sync between the two recordings, especially if the back up recorder didn't share the same clock (probably most of the time). Sync that is otherwise audibly "good enough" for mixing AUD and SBD via the typical post process of aligning and stretching may not be close enough for the detector.
Relatively common taper scenario 2-
Plenty of tapers find themselves needing to deal with dropouts, intermittencies, or other brief problems in one channel of a typical 2-channel stereo recording. Most often the solution is a cross-fade to the other channel and back again. That causes the recording to briefly cross-fade to mono and back, which is unfortunate yet is an improvement over doing nothing or simply fading to silence in the damaged channel. This program might automate that process, but may require significantly looser detector settings that are only triggered by the intended obvious problems and not by desirable stereo difference between the two channels.
Oddball scenario 1-
My oddball case is similar to common taper scenario 2 and involves a couple of decades of stealth recordings made with a four channel mic rig. That rig started as two complete identical stereo rigs, one assigned to Left/Right and the other to Center/Back, with alignment and sync between the two achieved in post. That evolved to using a single recorder for all four channels (Tascam DR2d) making operation and post processing far simpler. It then further progressed from using two identical preamps to a single 4-channel preamp. But regardless of that evolution, inevitably there were times when one recorder hiccuped or failed, or one preamp battery died, or most commonly- one wire or connector went intermittent, crackly, poppy or whatever. I've a significant number of recordings that are compromised in that way. Most of the time the solution is just to not use the bad channel in the mix, or to manually cross fade around the problem from one of the good channels. This is essentially the same situation as common taper scenario 2, except there are two or three remaining good channels instead of just one to cross-fade from, some a bit more different than others.
Oddball pipe dream- a further improvement for scenario 2, made robust by the presence of additional channels-
The content of the two channels of most any stereo recording differ.. to some extent. Yet are also the same.. to some other extent. In a concert taper recording, some of the particulars of how they are the same and differ will be specific to the recording setup used - specifically and in large degree a consequence of the stereo microphone arrangement. There will be signal relationships specific to: a stereo pair of mics of some particular pickup pattern, spaced a certain distance apart, angled a certain angle apart. Some of the relationships between the two channels will be present in all recordings made with that setup.. as long as that setup remains unchanged. Additionally, some additional aspects will be specific to each specific recording situation. Those relationships will remain constant between channels for that particular recording as long as the recording location doesn't change over the course of the recording, but will differ between various recordings even though the same recording setup was used. The point is that there is useful information about the stereo similarities and differences between channels which gets encoded into each recording and remains constant throughout the recording. Information that is specific to the recording arrangement, and additionally to that specific recording arrangement in each specific recording situation. We should be able to use that to our advantage.
How can we extract that information and use it to filter the replacement cross-faded content so that it's no longer just a mono copy of an alternate good channel, but rather is imbued with whatever stereo difference information is typical to that particular recording setup.. and more specifically to that particular recording setup in that particular recording situation?
With four channels (or more) rather than just two, the encoded information about the recording array and the specific situation in which it is used becomes far more robust. As channel count increases arithmetically, the cross-relationships between each channel and channel groupings increase geometrically.
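How fast those cross-relationships grow depends on what is counted. As a rough sketch (the two counting conventions below are assumptions for illustration, not something defined in this thread): the number of channel pairs grows quadratically, while the number of possible groupings of the remaining channels that could stand in for one missing channel grows exponentially.

```python
from math import comb


def channel_pairs(n_channels):
    """Number of distinct channel pairs, n * (n - 1) / 2."""
    return comb(n_channels, 2)


def donor_groupings(n_channels):
    """Non-empty subsets of the other channels that could stand in for one channel."""
    return 2 ** (n_channels - 1) - 1


for n in (2, 4, 8):
    print(n, "channels:", channel_pairs(n), "pairs,",
          donor_groupings(n), "donor groupings")
```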
I dream of a program which analyzes a recording made with a static, unchanging multichannel microphone array, determines useful things about the cross-correlations between all channels, and uses that to synthesize a convincing missing channel from the channels that remain. Some day..
-
These are some interesting thoughts. Your scenario 1 crossed my mind as well. I think it should be possible to auto-sync these recordings. I guess most success can be achieved when both recorders are recording the exact same microphone or SBD outputs. If the two recorders use a different set of microphones, then I think I wouldn't want to be auto-switching between the sound of the two recorders.
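For what it's worth, one common way to auto-sync two recordings of the same source is to look for the offset at which their cross-correlation peaks. The sketch below is a brute-force, pure-Python version meant for short excerpts only; the function name and the `max_lag` search range are assumptions, and a real tool would more likely use an FFT-based correlation.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) at which `other` best matches `reference`.

    A positive result means `other` is delayed relative to `reference`.
    Brute-force time-domain cross-correlation; fine for short excerpts only.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best
```

Running this on an excerpt near the start and another near the end of a set would give two offsets; if they differ, the two recorders' clocks are drifting apart and a single shift is not enough.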
The other scenarios... I'd think are extremely difficult. There are so many aspects that would determine 'the content of the missing channel', and indeed some of them could be derived from analyzing the channels that are available and 'the missing channel when it was still there', but some of them are impossible to determine. E.g. with a two-channel recording, the source location (distance, direction) of some particular sound combined with the frequencies in the sound and the room acoustics have a huge influence on how that sound is picked up by the two microphones (phase, amplitude). Now if you'd take out one channel, there is no way to determine what the distance/direction of a specific sound is, so there is also no way to determine how it would have sounded on the missing channel. And then imagine that the recordings we make are a mix of countless sounds from all kinds of amplitudes, phases and frequency compositions from all directions that affect each other and bounce around before hitting the microphone membranes.
I'd guess the dreams you have cannot be solved by classic signal processing, but maybe with the aid of AI it may be possible to fill short gaps caused by dropouts etc. with more or less realistic fillings. Waaay out of my league! I think it would be better to invest in better cabling to prevent the dropouts in the first place ;)
One other thing that crossed my mind while working on this program:
I think I found a way to reduce a recorder self noise by any desired amount. Might apply for patent for that! Or sell the idea to the highest bidder! Zoom, Tascam, SD, Sony, Roland, you're all invited to contact me! :cheers:
-
Sorry I'm only joining now. TheJez, thank you for the very interesting posts. I like it a lot. You have my respect, I really like what you came up with and implemented.
Thanks Kuba e! I must confess I am a bit proud of what I achieved. Just a pity I didn't do this when the first recorders with a safety track feature came out. Nowadays the 32bit fp multi-adc recorders have become the standard, making the safety track feature superfluous... :(
-
Thanks for the reply. Won't dwell on this excessively, but do want to flesh it out a bit more before I let it go, as I think this is the first time I've actually put it down in words (have verbally discussed the concept with a taper with an acoustics PhD).
These are some interesting thoughts. Your scenario 1 crossed my mind as well. I think it should be possible to auto-sync these recordings. I guess most success can be achieved when both recorders are recording the exact same microphone or SBD outputs. If the two recorders use a different set of microphones, then I think I wouldn't want to be auto-switching between the sound of the two recorders.
The biggest challenge for scenario 1 is likely to be achieving sufficient phase-locked synchronization between the two file sets recorded to two different recorders that don't share a clock, which is a non-issue when the safety track is recorded on the same recorder as the primary track. Good sync in human perceptual terms need only be achieved to within some tens of samples, depending on sample rate, in some cases an order of magnitude looser than that. In scenario 2, auto-switching between different microphones becomes the essence of the thing. That's the more interesting one I want to get into a bit more..
The other scenarios... I'd think are extremely difficult. There are so many aspects that would determine 'the content of the missing channel', and indeed some of them could be derived from analyzing the channels that are available and 'the missing channel when it was still there', but some of them are impossible to determine. E.g. with a two-channel recording, the source location (distance, direction) of some particular sound combined with the frequencies in the sound and the room acoustics have a huge influence on how that sound is picked up by the two microphones (phase, amplitude). Now if you'd take out one channel, there is no way to determine what the distance/direction of a specific sound is, so there is also no way to determine how it would have sounded on the missing channel. And then imagine that the recordings we make are a mix of countless sounds from all kinds of amplitudes, phases and frequency compositions from all directions that affect each other and bounce around before hitting the microphone membranes.
I'd guess the dreams you have cannot be solved by classic signal processing, but maybe with the aid of AI it may be possible to fill short gaps caused by dropouts etc. with more or less realistic fillings.
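To put rough numbers on the sync problem, here is a small arithmetic sketch. The clock mismatch figure used in the example (20 ppm) is purely an assumed value for illustration, not a measured spec of any recorder, and the function names are made up for the example.

```python
def drift_samples(sample_rate, seconds, ppm):
    """Samples by which two free-running clocks diverge over `seconds`.

    `ppm` is the relative clock mismatch in parts per million (assumed).
    """
    return sample_rate * seconds * ppm / 1_000_000


def samples_to_ms(n_samples, sample_rate):
    """Convert a sample count to milliseconds at the given sample rate."""
    return 1000 * n_samples / sample_rate


# One hour at 48 kHz with an assumed 20 ppm mismatch between recorders:
print(drift_samples(48_000, 3600, 20))   # accumulated drift in samples
print(samples_to_ms(48, 48_000))         # 48 samples expressed in ms
```

With those assumed figures the drift over an hour comes to thousands of samples, far beyond a tolerance of tens of samples, which is why a single alignment at the start of the set would not be enough.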
Yes quite challenging to do perfectly. But we don't need perfection in the emulated channel, just some improvement over the use of a straight copy of one of the remaining channels, and the entire endeavor is made easier by having a lot of perceptual leeway available. Some of it would be relatively easy, other aspects considerably more difficult. It could/can be pursued to various degrees.. making it ripe for continuing further improvement via upgraded releases. AI may be quite useful for some more advanced approaches, but isn't as necessary as you might think.
As I see it, there are three different sources from which we can draw useful information:
The first is the physical geometry of the microphone configuration. We needn't reference the actual real-world configuration at all for this. Can be used to easily figure things like the upper limits of the phase, level, and timing differences between channels across specific frequency ranges. Also simple geometric info like how those differences manifest in regard to the direction (or non-direction) of sound arrival - direct sound arrival from straight ahead will produce no phase, level or timing differences; while sound arrival from maximally off the stereo axis will never exceed some maximal timing difference determined by the spacing between mics; and will similarly never exceed some maximum value of phase and level difference, which will vary specifically by frequency. All that is essentially no different than the inputs to long available virtual microphone configuration visualizers such as the Sengpielaudio visualizer (https://sengpielaudio.com/HejiaE.htm) and the Schoeps Image Assistant (http://ima.schoeps.de/). Schoeps Image Assistant even graphs the range of timing, level, and diffuse field correlation by frequency, based entirely on the geometry of the microphone configuration. It even includes an auralization routine that supposedly lets the user listen to an emulation of the microphone configuration while moving the source position around, but I've never gotten the auralizer to work.
The second is the extension of that to measurements of the actual real-world system. Determining more specifically the relationship between the channel to be replaced and the other available channels when the channel to be replaced is included in the set and working correctly. Most easily determined by recording test signals with the properly working microphone arrangement - say, a fully diffuse reverberant response, the response of direct arrival from a few specific directions, stuff like that. But it could also use an existing recording as stimulus in place of or in addition to such isolated test signals. Bob McCarthy figured out how to do that 4 decades ago using classic signal processing in collaboration with John Meyer and Don Pearson when developing Meyer SIM (Source Independent Measurement) which uses live sound itself as the test signal. Many offshoots of that tech since then, all based in classic signal processing.
The third is the relationship between all the remaining good channels when the damaged one is absent from the set, which frequently includes a channel that symmetrically mirrors the missing one.
iZotope devs are you lurking? A stereo/multichannel microphone-array replacement channel tool would be a welcome addition to iZotope RX! I'd buy it.
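The timing part of that geometry is easy to work out by hand. The sketch below assumes a distant source (plane wave), a simple spaced pair, and a speed of sound of about 343 m/s; the function names and the example spacing are assumptions for illustration. Level differences depend on the pickup patterns and are not covered here.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at about 20 degrees C


def arrival_time_difference(spacing_m, angle_deg):
    """Time difference (s) between two spaced mics for a distant source.

    `angle_deg` is measured from straight ahead (0) toward the stereo axis (90).
    """
    return spacing_m * math.sin(math.radians(angle_deg)) / SPEED_OF_SOUND


def max_difference_samples(spacing_m, sample_rate):
    """Upper limit of the inter-channel delay, in samples, for this spacing."""
    return spacing_m / SPEED_OF_SOUND * sample_rate


# Example: an assumed 50 cm spaced pair recorded at 48 kHz.
print(arrival_time_difference(0.5, 0))     # straight ahead: no difference
print(arrival_time_difference(0.5, 90))    # fully off-axis: the maximum
print(max_difference_samples(0.5, 48_000))
```

Whatever the source does, the delay between the two channels of that pair can never exceed the last figure, which is the kind of hard limit a replacement-channel tool could rely on.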
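The crudest possible version of "measuring the relationship while the channel still works" is a single broadband gain, fitted by least squares over a stretch where both channels are good. This is far simpler than a per-frequency transfer function of the kind SIM-style tools measure, and it is only meant to show the idea; the function names are assumptions.

```python
def fit_gain(donor, target):
    """Least-squares gain g that minimises sum((target - g * donor) ** 2).

    Fitted on a stretch where both channels are known to be good.
    """
    energy = sum(d * d for d in donor)
    if energy == 0.0:
        return 0.0
    return sum(d * t for d, t in zip(donor, target)) / energy


def synthesize(donor, gain):
    """Stand-in for the damaged channel: the donor scaled by the fitted gain."""
    return [gain * d for d in donor]
```

A more realistic approach would fit such a relationship per frequency band and include timing, but the principle of learning from the good stretch and applying it to the bad one stays the same.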
-
^ philosophically, it would basically be a vastly improved "mono to pseudo-stereo" tool
-
Thanks for elaborating. Although I understand your general ideas, I'm afraid it is a bit over my head to really properly judge or challenge them. I guess the taper with an acoustics PhD is indeed a much better candidate to discuss these matters, and the iZotope people seem much better candidates to implement them!
-
Thanks for the ear!
-
Hi everybody,
It's been a bit quiet for a while, which doesn't mean I put this project to rest.
I've made some updates to the program, leaving the basic algorithms intact:
- Removed some real-time eye-candy like the moving waveforms and frequency histograms
- Now made it a one-button process, instead of having to click a button for each step
- Restructured the calculations. Previously, if the file had e.g. 1,000,000 samples and the sliding window was 128 samples wide, many calculations were done 128,000,000 times. So identical calculations were done 128 times where once was enough.
- Added option to normalize the end result
- Added support for the SPDR recorder: If you open the 4-track file created by the 'split-gain feature', the program will offer to split this into two stereo files. So you won't have to split the 4-track file yourself.
- Created a 'review tab', where the results of the merging process can be reviewed.
- Added an 'about tab' with a short description of the program and a nice program logo :)
This all resulted in a program that's about 10x faster than the prototype. E.g. 34 minutes of audio now takes ~50 seconds to process, including normalization. (Obviously this depends on the speed of your CPU and disk drive, so YMMV)
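The kind of restructuring described in the list above can be illustrated with a running sliding-window sum. The thread does not say what quantity the program actually computes per window, so the windowed energy used here is only a stand-in, and both function names are assumptions.

```python
def window_energy_naive(samples, width):
    """Recompute the whole window at every position: about N * width operations."""
    return [sum(x * x for x in samples[i:i + width])
            for i in range(len(samples) - width + 1)]


def window_energy_running(samples, width):
    """Update the previous window's total instead: about N operations."""
    if len(samples) < width:
        return []
    total = sum(x * x for x in samples[:width])
    out = [total]
    for i in range(width, len(samples)):
        # Add the sample entering the window, drop the one leaving it.
        total += samples[i] * samples[i] - samples[i - width] * samples[i - width]
        out.append(total)
    return out
```

Both functions return the same values for integer samples; with floating-point samples the running version can accumulate small rounding differences over long files.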
I've attached some screenshots below:
- Main tab: Buttons to open the main and safety file and the 'start button'. Also the checkbox to enable normalizing the merged file.
- Review tab: On the right side there is a list of interesting events per channel: Potential Clipping Left/Right, Fade In Safety Left/Right, Fade Out Safety Left/Right. By clicking one of these events, the waveform of the clicked event is shown. You can move forward/backward in the waveform by using the buttons.
- The about tab containing a short description of the program
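For readers unfamiliar with the normalize option mentioned for the main tab: the thread does not say which kind of normalization the program applies or to what level, so the following is just a minimal sketch of plain peak normalization on floating-point samples, with an assumed `target_peak` parameter.

```python
def normalize_peak(samples, target_peak=1.0):
    """Scale samples so the largest absolute value equals `target_peak`.

    Silent or empty input is returned unchanged (as a new list).
    """
    peak = max((abs(x) for x in samples), default=0.0)
    if peak == 0.0:
        return list(samples)
    gain = target_peak / peak
    return [x * gain for x in samples]
```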
I will try to send the program out to some people for review/testing...
-
cool!
-
TheJez, Congratulation! It looks beautiful.
-
I can't wait to test this. well done.
-
There seem to be quite some Mac users here on the TS forum! Unfortunately I have no experience at all with software development for Mac and the program was made on/for Windows machines. However, I'm trying to get a (virtual) Mac machine so I can attempt to build it for Macs too... No idea how things will work out, I'll keep you posted...
-
I’m going to see a gig (Echobelly) tonight and think I will deliberately set my R07 a little hot just so I can test the program. :)
-
Oh my…. The responsibility… Thanks for putting your trust in the program…
-
Nice work! Happy to see it polished up and tuned so nicely. Upon further testing it would be great to get a permanent link to the program setup over in the Post Processing forum.
-
Thanks for the suggestion, I was already wondering what would be a good place to put this... It seems the attachment size limitations won't allow me to store it on TS, so I'd need to rely on things like Google Drive. Unless someone has better suggestions. The size will likely be around 15MB.
I'm still awaiting feedback from a few testers on Windows and I have also been trying to build it for MacOS, as quite some people here seem to be running that. But as a non-Mac-user I have very limited options to properly test myself. So far I got a crash report from a test volunteer on the Mac... :(
I might be publishing the Windows version first and then continue work on the Mac version...
-
I've published version 1.0.0 (for Windows) here:
https://taperssection.com/index.php?topic=206443.0
-
For those on MacOS: I've finally managed a (hopefully) working MacOS version! See here: https://taperssection.com/index.php?topic=206443.msg2422745#msg2422745