Author Topic: Roland R-07 32-bit floating point! (Read 95115 times)

kuba e · « **Reply #60 on:** November 18, 2024, 02:07:16 PM »

Sorry I'm only joining now. TheJez, thank you for the very interesting posts. I like it a lot. You have my respect, I really like what you came up with and implemented.

Gutbucket · « **Reply #61 on:** November 18, 2024, 05:53:23 PM »

I'd intended to post last Friday but got pulled away, and it now sounds like progress has been made and you've got the detection algorithm well tuned. Congrats!

Since then..

Quote from: TheJez on November 15, 2024, 06:51:10 AM

Work in progress!! I think I might be heading to some running average, determined over periods where there is no clipping or clipping-related artifacts. (But how to determine that? I need a gain to amplify the safety track so I can compare main and safety to see if there are such artifacts... I might run in a chicken/egg situation here...)

Lurking in the back of my mind was the idea of using phase correlation rather than a weighted average level comparison as the "difference factor", but if the detector is now working well, there is no need to explore alternate approaches.
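
For anyone following along, here is one possible way around the chicken/egg problem in the quote, as a minimal sketch only. It is not TheJez's actual algorithm. It assumes both tracks are sample-aligned floats scaled to +/-1.0 and that main samples well below full scale cannot be clipped. The function name and the `clip_guard` threshold are made up for illustration.

```python
import math


def estimate_safety_gain(main, safety, clip_guard=0.5):
    """Least-squares gain g such that g * safety ~= main.

    Only samples where the main track is well below full scale are used,
    on the assumption that those samples cannot be clipped.
    """
    num = 0.0
    den = 0.0
    for m, s in zip(main, safety):
        if abs(m) < clip_guard:
            num += m * s
            den += s * s
    if den == 0.0:
        raise ValueError("no usable samples below the clip guard")
    return num / den


# Toy data: the safety track is 12 dB lower and the main track clips at 1.0.
true_gain = 10 ** (12 / 20)
safety = [0.3 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
main = [max(-1.0, min(1.0, true_gain * s)) for s in safety]
gain = estimate_safety_gain(main, safety)
```

On real recordings the estimate would presumably need to be smoothed over time, which is where the running average mentioned in the quote would come in.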

Gutbucket · « **Reply #62 on:** November 18, 2024, 06:22:03 PM »

Quote from: TheJez on November 17, 2024, 08:21:23 AM

[snip..] Would anybody be interested in getting this program at all? It's Windows-only at the moment...

tl;dr- some of this gets OT, so for those who aren't into it, please ignore.

I'll throw out a couple of potential alternate use cases which will apply to other tapers generally, and also an oddball concept of mine I've thought about for years.. which I don't expect this program to evolve into, but your development of it has gotten me thinking about it again and I'd love to discuss it more in depth with anyone interested here or elsewhere (happy to take it to another thread, PM or offsite).

Relatively common taper scenario 1-
I've not recorded using a secondary safety track on the same recorder myself, which is the intent of this routine. But I have recorded a safety backup to a secondary recorder at times, as do other tapers. Most of the time that safety recording isn't needed, in the same way that a lower-level safety track made on the same recorder isn't, and when it is, most folks will simply discard the primary recording and use that secondary safety recording. However, this program could potentially auto-switch between the two, using the primary recording wherever possible. The potential problem with doing that would seem to be achieving a sufficient degree of sync between the two recordings, especially if the backup recorder didn't share the same clock (probably most of the time). Sync that is otherwise audibly "good enough" for mixing AUD and SBD via the typical post process of aligning and stretching may not be close enough for the detector.
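
To make the sync problem concrete, here is a minimal sketch of finding a constant offset between two recordings by brute-force cross-correlation. It is an illustration under stated assumptions, not part of the program discussed here. It only finds a fixed lag and does nothing about clock drift between recorders, which is the harder part. The function name and `max_lag` parameter are hypothetical.

```python
import random


def best_lag(reference, other, max_lag):
    """Return the lag in samples by which `other` trails `reference`.

    The lag is chosen to maximise the raw cross-correlation within
    +/- max_lag samples. Cost is O(len(reference) * max_lag).
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(other):
                score += r * other[j]
        if score > best_score:
            best, best_score = lag, score
    return best


# Toy data: noise as a stand-in for audio, second recorder started 7 samples late.
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(500)]
other = [0.0] * 7 + reference[:-7]
lag = best_lag(reference, other, max_lag=20)
```

With unlinked clocks the best lag would slowly change over the length of a show, so a real tool would likely have to repeat this per block and resample to follow the drift.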

Relatively common taper scenario 2-
Plenty of tapers find themselves needing to deal with dropouts, intermittencies, or other brief problems in one channel of a typical 2-channel stereo recording. Most often the solution is a cross-fade to the other channel and back again. That causes the recording to briefly cross-fade to mono and back, which is unfortunate yet is an improvement over doing nothing or simply fading to silence in the damaged channel. This program might automate that process, but may require significantly looser detector settings that are only triggered by the intended obvious problems and not by desirable stereo difference between the two channels.
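
The manual fix described above can be sketched roughly like this, assuming the damaged region has already been located by hand or by a detector. The function name and parameters are made up for illustration, and the fades are plain linear ramps.

```python
def patch_channel(damaged, good, start, end, fade):
    """Return a copy of `damaged` with samples in [start, end) taken from `good`.

    Linear cross-fades of `fade` samples are applied just before `start`
    and just after `end`, so the switch to the other channel is gradual.
    """
    if fade <= 0:
        raise ValueError("fade must be at least 1 sample")
    out = list(damaged)
    for i in range(max(0, start - fade), min(len(out), end + fade)):
        if i < start:
            w = (i - (start - fade)) / fade   # ramps 0 -> 1 before the region
        elif i >= end:
            w = 1.0 - (i - end + 1) / fade    # ramps 1 -> 0 after the region
        else:
            w = 1.0                           # fully replaced inside the region
        out[i] = (1.0 - w) * damaged[i] + w * good[i]
    return out


# Toy data: left channel drops to silence for samples 8..11.
left = [1.0] * 8 + [0.0] * 4 + [1.0] * 8
right = [0.5] * 20
repaired = patch_channel(left, right, start=8, end=12, fade=4)
```

Inside the patched region both channels carry the same signal, which is the brief collapse to mono mentioned above.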

Oddball scenario 1-
My oddball case is similar to common taper scenario 2 and involves a couple of decades of stealth recordings made with a four channel mic rig. That rig started as two complete identical stereo rigs, one assigned to Left/Right and the other to Center/Back, with alignment and sync between the two achieved in post. That evolved to using a single recorder for all four channels (Tascam DR2d), making operation and post processing far simpler. It then further progressed from using two identical preamps to a single 4-channel preamp. But regardless of that evolution, inevitably there were times when one recorder hiccuped or failed, or one preamp battery died, or most commonly, one wire or connector went intermittent, crackly, poppy or whatever. I've a significant number of recordings that are compromised in that way. Most of the time the solution is just to not use the bad channel in the mix, or to manually cross-fade around the problem from one of the good channels. This is essentially the same situation as common taper scenario 2, except there are two or three remaining good channels instead of just one to cross-fade from, some a bit more different than others.

Oddball pipe dream- a further improvement for scenario 2, made robust by the presence of additional channels-
The content of the two channels of most any stereo recording differ.. to some extent. Yet are also the same.. to some other extent. In a concert taper recording, some of the particulars of how they are the same and differ will be specific to the recording setup used - specifically and in large degree a consequence of the stereo microphone arrangement. There will be signal relationships specific to: a stereo pair of mics of some particular pickup pattern, spaced a certain distance apart, angled a certain angle apart. Some of the relationships between the two channels will be present in all recordings made with that setup.. as long as that setup remains unchanged. Additionally, some additional aspects will be specific to each specific recording situation. Those relationships will remain constant between channels for that particular recording as long as the recording location doesn't change over the course of the recording, but will differ between various recordings even though the same recording setup was used. The point is that there is useful information about the stereo similarities and differences between channels which gets encoded into each recording and remains constant throughout the recording. Information that is specific to the recording arrangement, and additionally to that specific recording arrangement in each specific recording situation. We should be able to use that to our advantage.

How can we extract that information and use it to filter the replacement cross-faded content so that it's no longer just a mono copy of an alternate good channel, but rather is imbued with whatever stereo difference information is typical of that particular recording setup.. and more specifically of that particular recording setup in that particular recording situation?

With four channels (or more) rather than just two, the encoded information about the recording array and the specific situation in which it is used becomes far more robust. As channel count increases arithmetically, the cross-relationships between each channel and channel groupings increase geometrically.
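
To put rough numbers on that, here is a small counting sketch. How one counts "relationships" is an assumption on my part: the first function counts plain channel pairs (which grows quadratically), the second counts each channel compared against every non-empty group of the other channels (which grows exponentially). Both function names are made up.

```python
from math import comb


def pairwise_relationships(channels):
    """Number of distinct channel pairs: n * (n - 1) / 2."""
    return comb(channels, 2)


def channel_to_group_relationships(channels):
    """Each channel compared against every non-empty group of the others."""
    return channels * (2 ** (channels - 1) - 1)


for n in (2, 4, 6):
    print(n, pairwise_relationships(n), channel_to_group_relationships(n))
```

Going from two channels to four takes the pair count from 1 to 6, and the channel-versus-group count from 2 to 28.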

I dream of a program which analyzes a recording made with a static, unchanging multichannel microphone array, determines useful things about the cross-correlations between all channels, and uses that to synthesize a convincing missing channel from the channels that remain. Some day..

TheJez · « **Reply #63 on:** November 19, 2024, 02:14:44 AM »

Quote from: Gutbucket on November 18, 2024, 06:22:03 PM

I dream of a program which analyzes a recording made with a static, unchanging multichannel microphone array, determines useful things about the cross-correlations between all channels, and uses that to synthesize a convincing missing channel from the channels that remain. Some day..

These are some interesting thoughts. Your scenario 1 crossed my mind as well. I think it should be possible to auto-sync these recordings. I guess most success can be achieved when both recorders are recording the exact same microphone or SBD outputs. If the two recorders use a different set of microphones, then I think I wouldn't want to be auto-switching between the sound of the two recorders.

The other scenarios... I'd think are extremely difficult. There are so many aspects that would determine 'the content of the missing channel', and indeed some of them could be derived from analyzing the channels that are available and 'the missing channel when it was still there', but some of them are impossible to determine. E.g. with a two-channel recording, the source location (distance, direction) of some particular sound combined with the frequencies in the sound and the room acoustics have a huge influence on how that sound is picked up by the two microphones (phase, amplitude). Now if you'd take out one channel, there is no way to determine what the distance/direction of a specific sound is, so there is also no way to determine how it would have sounded on the missing channel. And then imagine that the recordings we make are a mix of countless sounds from all kinds of amplitudes, phases and frequency compositions from all directions that affect each other and bounce around before hitting the microphone membranes.
I'd guess the dreams you have cannot be solved by classic signal processing, but maybe with the aid of AI it may be possible to fill short gaps caused by dropouts etc. with more or less realistic fillings. Waaay out of my league! I think it would be better to invest in better cabling to prevent the dropouts in the first place.

One other thing that crossed my mind while working on this program:
I think I found a way to reduce a recorder's self-noise by any desired amount. Might apply for a patent for that! Or sell the idea to the highest bidder! Zoom, Tascam, SD, Sony, Roland, you're all invited to contact me!

TheJez · « **Reply #64 on:** November 19, 2024, 02:57:36 AM »

Quote from: kuba e on November 18, 2024, 02:07:16 PM

Sorry I'm only joining now. TheJez, thank you for the very interesting posts. I like it a lot. You have my respect, I really like what you came up with and implemented.

Thanks Kuba e! I must confess I am a bit proud of what I achieved. Just a pity I didn't do this when the first recorders with a safety track feature came out. Nowadays the 32bit fp multi-adc recorders have become the standard, making the safety track feature superfluous...

Gutbucket · « **Reply #65 on:** November 19, 2024, 12:50:59 PM »

Thanks for the reply. Won't dwell on this excessively, but do want to flesh it out a bit more before I let it go, as I think this is the first time I've actually put it down in words (have verbally discussed the concept with a taper with an acoustics PhD).

Quote from: TheJez on November 19, 2024, 02:14:44 AM

These are some interesting thoughts. Your scenario 1 crossed my mind as well. I think it should be possible to auto-sync these recordings. I guess most success can be achieved when both recorders are recording the exact same microphone or SBD outputs. If the two recorders use a different set of microphones, then I think I wouldn't want to be auto-switching between the sound of the two recorders.

The biggest challenge for scenario 1 is likely to be achieving sufficient phase-locked synchronization between the two file sets recorded to two different non-clock-linked recorders, which is a non-issue when the safety track is recorded on the same recorder as the primary track. Good sync in human perceptual terms need only be achieved to within some tens of samples, depending on sample rate, in some cases an order of magnitude looser than that. In scenario 2, auto-switching between different microphones becomes the essence of the thing. That's the more interesting one I want to get into a bit more..

Quote

The other scenarios... I'd think are extremely difficult. There are so many aspects that would determine 'the content of the missing channel', and indeed some of them could be derived from analyzing the channels that are available and 'the missing channel when it was still there', but some of them are impossible to determine. E.g. with a two-channel recording, the source location (distance, direction) of some particular sound combined with the frequencies in the sound and the room acoustics have a huge influence on how that sound is picked up by the two microphones (phase, amplitude). Now if you'd take out one channel, there is no way to determine what the distance/direction of a specific sound is, so there is also no way to determine how it would have sounded on the missing channel. And then imagine that the recordings we make are a mix of countless sounds from all kinds of amplitudes, phases and frequency compositions from all directions that affect each other and bounce around before hitting the microphone membranes.
I'd guess the dreams you have cannot be solved by classic signal processing, but maybe with the aid of AI it may be possible to fill short gaps caused by dropouts etc. with more or less realistic fillings.

Yes quite challenging to do perfectly. But we don't need perfection in the emulated channel, just some improvement over the use of a straight copy of one of the remaining channels, and the entire endeavor is made easier by having a lot of perceptual leeway available. Some of it would be relatively easy, other aspects considerably more difficult. It could/can be pursued to various degrees.. making it ripe for continuing further improvement via upgraded releases. AI may be quite useful for some more advanced approaches, but isn't as necessary as you might think.

As I see it, there are three different sources from which we can draw useful information:

The first is the physical geometry of the microphone configuration. We needn't reference the actual real-world configuration at all for this. Can be used to easily figure things like the upper limits of the phase, level, and timing differences between channels across specific frequency ranges. Also simple geometric info like how those differences manifest in regard to the direction (or non-direction) of sound arrival - direct sound arrival from straight ahead will produce no phase, level or timing differences; while sound arrival from maximally off the stereo axis will never exceed some maximal timing difference determined by the spacing between mics; and will similarly never exceed some maximum value of phase and level difference, which will vary specifically by frequency. All that is essentially no different than the inputs to long-available virtual microphone configuration visualizers such as the Sengpielaudio https://sengpielaudio.com/HejiaE.htm and Schoeps Image Assistant visualizers http://ima.schoeps.de/. Schoeps Image Assistant even graphs the range of timing, level, and diffuse field correlation by frequency, based entirely on the geometry of the microphone configuration. It even includes an auralization routine that supposedly lets the user listen to an emulation of the microphone configuration while moving the source position around, but I've never gotten the auralizer to work.
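
As a worked example of the timing limit, here is a minimal sketch for a simple spaced pair. It assumes a far-field source, a speed of sound of about 343 m/s (dry air around 20 degrees C), and an angle measured from straight ahead. The function names and the 0.5 m spacing are illustrative only.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: dry air at roughly 20 degrees C


def max_delay_samples(spacing_m, sample_rate_hz):
    """Largest possible arrival-time difference between two spaced mics, in samples."""
    return spacing_m / SPEED_OF_SOUND_M_S * sample_rate_hz


def arrival_delay_samples(spacing_m, angle_deg, sample_rate_hz):
    """Far-field delay for a source at `angle_deg` off straight ahead."""
    return max_delay_samples(spacing_m, sample_rate_hz) * math.sin(math.radians(angle_deg))


# A 0.5 m spaced pair recorded at 48 kHz.
limit = max_delay_samples(0.5, 48000)
front = arrival_delay_samples(0.5, 0.0, 48000)
side = arrival_delay_samples(0.5, 90.0, 48000)
```

For that spacing the limit works out to about 70 samples (roughly 1.46 ms): zero for sound from straight ahead, and the full amount for sound arriving along the axis through the two mics.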

The second is the extension of that to measurements of the actual real-world system. Determining more specifically the relationship between the channel to be replaced and the other available channels when the channel to be replaced is included in the set and working correctly. Most easily determined by recording test signals with the properly working microphone arrangement - say, a fully diffuse reverberant response, the response of direct arrival from a few specific directions, stuff like that. But it could also use an existing recording as stimulus in place of or in addition to such isolated test signals. Bob McCarthy figured out how to do that 4 decades ago using classic signal processing in collaboration with John Meyer and Don Pearson when developing Meyer SIM (Source Independent Measurement), which uses live sound itself as the test signal. Many offshoots of that tech since then, based in classic signal processing.

The third is the relationship between all the remaining good channels when the damaged one is absent from the set, which frequently includes a channel that symmetrically mirrors the missing one.

iZotope devs, are you lurking? A stereo/multichannel microphone-array replacement channel tool would be a welcome addition to iZotope RX! I'd buy it.

Gutbucket · « **Reply #66 on:** November 19, 2024, 12:56:31 PM »

^ philosophically, it would basically be a vastly improved "mono to pseudo-stereo" tool

TheJez · « **Reply #67 on:** November 20, 2024, 06:10:24 AM »

Quote from: Gutbucket on November 19, 2024, 12:50:59 PM

Thanks for the reply. Won't dwell on this excessively, but do want to flesh it out a bit more before I let it go, as I think this is the first time I've actually put it down in words (have verbally discussed the concept with a taper with an acoustics PhD).

Thanks for elaborating. Although I understand your general ideas, I'm afraid it is a bit over my head to really properly judge or challenge them. I guess the taper with an acoustics PhD is indeed a much better candidate to discuss these matters, and the iZotope people seem much better candidates to implement them!

Gutbucket · « **Reply #68 on:** November 20, 2024, 09:12:34 AM »

Thanks for the ear!

TheJez · « **Reply #69 on:** December 03, 2024, 04:55:53 AM »

Hi everybody,
It's been a bit quiet for a while, which doesn't mean I put this project to rest.
I've made some updates to the program, leaving the basic algorithms intact:
- Removed some real-time eye-candy like the moving waveforms and frequency histograms
- Now made it a one-button process, instead of having to click a button for each step
- Re-structured the calculations. Previously, if the file had e.g. 1,000,000 samples and the sliding window was 128 samples wide, then many calculations were done 128,000,000 times. So many identical calculations were done 128 times while only one was really needed.
- Added option to normalize the end result
- Added support for the SPDR recorder: If you open the 4-track file created by the 'split-gain feature', the program will offer to split this into two stereo files. So you won't have to split the 4-track file yourself.
- Created a 'review tab', where the results of the merging process can be reviewed.
- Added an 'about tab' with a short description of the program and a nice program logo
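
For the curious, the restructuring of the sliding-window calculations can be illustrated with a running sum. This is only a sketch of the general technique. The actual per-window calculations in the program aren't described here, so windowed energy (sum of squares) is assumed as a stand-in, and both function names are made up.

```python
def windowed_energy_naive(samples, window):
    """Recompute every window from scratch: about len(samples) * window operations."""
    return [sum(x * x for x in samples[i:i + window])
            for i in range(len(samples) - window + 1)]


def windowed_energy_running(samples, window):
    """Same result with one add and one subtract per step."""
    if window <= 0 or window > len(samples):
        return []
    acc = sum(x * x for x in samples[:window])
    out = [acc]
    for i in range(window, len(samples)):
        # Slide by one sample: add the newcomer, drop the sample that left.
        acc += samples[i] * samples[i] - samples[i - window] * samples[i - window]
        out.append(acc)
    return out


# Integer toy data so both versions can be compared exactly.
samples = [(7 * n) % 11 - 5 for n in range(200)]
fast = windowed_energy_running(samples, 16)
slow = windowed_energy_naive(samples, 16)
```

With floating-point audio the running sum can slowly accumulate rounding error, so recomputing the sum from scratch every so often is a common precaution.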

This all resulted in a program that's about 10x faster than the prototype. E.g. 34 minutes of audio now takes ~50 seconds to process, including normalization. (Obviously this depends on the speed of your CPU and disk drive, so YMMV)

I've attached some screenshots below:
- Main tab: Buttons to open the main and safety file and the 'start button'. Also the checkbox to enable normalizing the merged file.
- Review tab: On the right side there is a list of interesting events per channel: Potential Clipping Left/Right, Fade In Safety Left/Right, Fade Out Safety Left/Right. By clicking one of these events, the waveform of the clicked event is shown. You can move forward/backward in the waveform by using the buttons.
- The about tab containing a short description of the program

I will try to send the program out to some people for review/testing...

Niels · « **Reply #70 on:** December 03, 2024, 05:13:23 AM »

cool!

kuba e · « **Reply #71 on:** December 03, 2024, 06:52:57 AM »

TheJez, congratulations! It looks beautiful.

hedfro · « **Reply #72 on:** December 04, 2024, 01:56:11 AM »

I can't wait to test this. well done.

TheJez · « **Reply #73 on:** December 04, 2024, 09:29:36 AM »

There seem to be quite a few Mac users here on the TS forum! Unfortunately I have no experience at all with software development for Mac, and the program was made on/for Windows machines. However, I'm trying to get a (virtual) Mac machine so I can attempt to build it for Macs too... No idea how things will work out, I'll keep you posted...

adrianb · « **Reply #74 on:** December 04, 2024, 10:17:21 AM »

I’m going to see a gig (Echobelly) tonight and think I will deliberately set my R07 a little hot just so I can test the program.
