Just a quick update regarding the initial intention of this topic and the developments so far:
My initial thought was to use a weighted average of the samples of the main track and the safety track, where a louder sample would get a higher percentage of the safety track sample and a softer sample would get a higher percentage of the main track sample.
I quickly abandoned this idea, as it would mean that each output sample would get 'polluted' with data from the noisier safety track, even when that is not necessary at all.
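To illustrate why, here is a minimal Python sketch of that abandoned weighted-average idea. It is not code from the actual program: the function name `weighted_blend`, the `full_scale` parameter and the linear weighting are assumptions made for the example, and the safety track is assumed to be gain-matched to the main track already.

```python
def weighted_blend(main, safety_matched, full_scale=1.0):
    """Blend per sample: the louder the main sample, the larger the safety share."""
    out = []
    for m, s in zip(main, safety_matched):
        # Safety weight: 0.0 for a silent sample, 1.0 at (or above) full scale.
        w = min(abs(m) / full_scale, 1.0)
        out.append((1.0 - w) * m + w * s)
    return out
```

With this weighting, a perfectly healthy sample at half of full scale already takes half of its value from the safety track, which is exactly the 'pollution' problem.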
Inspired by the SD patent, I couldn't put this fully to rest and continued to work on a 'safety track merger'. My method differs a bit from the method described in the patent but addresses the same issues. I've made quite some progress so far. The program currently works in two steps: the 'preparation phase' and the 'merge phase'.
In the 'preparation phase':
- The program reads the main and safety tracks, converts them to 32-bit floating point, and throws away all info below 10 Hz to get rid of any DC component (including the 'varying DC' also addressed in the SD patent).
- It counts the number of clipping samples in the main track (for the left and right channels). If there are no clipping samples, there is no need to merge. A list of clipping samples (sample number and timestamp) can be exported to the clipboard for evaluation purposes.
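As a rough illustration of these two preparation steps, here is a Python sketch for a single channel. It is not the actual program: the one-pole DC blocker is just one simple way to remove content below about 10 Hz, and the names `dc_block` and `find_clipped`, the 48 kHz sample rate and the 0.999 clipping threshold are all assumptions. Samples are assumed to be floats in the range -1.0 to 1.0 already.

```python
import math

def dc_block(samples, sample_rate=48000, cutoff_hz=10.0):
    """One-pole high-pass (DC blocker): y[n] = x[n] - x[n-1] + r * y[n-1]."""
    # Pole just below 1; this approximation holds for cutoff_hz << sample_rate.
    r = 1.0 - 2.0 * math.pi * cutoff_hz / sample_rate
    out = []
    prev_x = prev_y = 0.0
    for x in samples:
        y = x - prev_x + r * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

def find_clipped(samples, sample_rate=48000, threshold=0.999):
    """Return (sample number, time in seconds) for each sample at or beyond threshold."""
    return [(i, i / sample_rate)
            for i, x in enumerate(samples) if abs(x) >= threshold]
```

An empty list from `find_clipped` for both channels would mean there is nothing to merge. In this sketch the clipping check is meant to run on the unfiltered samples, since filtering changes the peak values.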
In the 'merge phase':
- A 'window' (of currently 64 samples) slides sample-by-sample over the main and safety track.
- For each window, the gain factor between main and safety track is determined as accurately as possible and the safety track is amplified by this gain factor to make it match the main track window.
- A 'difference' between main and amplified safety window is calculated. For a non-clipping window, this difference will be small. For a window including clipping (or other artifacts e.g. caused by analog stage overloading), the difference will be bigger.
- When main and safety windows are 'similar enough', the main window is added to the output. If they are 'too different' (i.e. there is clipping or other artifacts), the safety window is added to the output.
- Because of the 'sliding window' principle, a slow (64-sample) cross-fade between main and safety track is applied when switching between the two, so there is no hard transition from one sample to the next. This should mitigate transition effects caused by the (hopefully very small) remaining inaccuracies of the gain factor calculation.
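The merge steps above could be sketched in Python roughly as follows. This is one possible interpretation, not the actual program: the least-squares gain estimate, the relative RMS difference measure, the `max_difference` threshold of 0.01, the 'held' gain used for replaced windows and the averaging of overlapping windows into the output are all assumptions, as are the function names.

```python
import math

WINDOW = 64  # window length mentioned above

def window_gain(main_w, safety_w):
    """Least-squares gain g that minimises sum((m - g*s)^2) over one window."""
    den = sum(s * s for s in safety_w)
    if den <= 0.0:
        return 1.0
    return sum(m * s for m, s in zip(main_w, safety_w)) / den

def window_difference(main_w, safety_w, gain):
    """RMS of (main - gain*safety), relative to the RMS of the main window."""
    err = sum((m - gain * s) ** 2 for m, s in zip(main_w, safety_w))
    ref = sum(m * m for m in main_w)
    return math.sqrt(err / ref) if ref > 0.0 else 0.0

def merge(main, safety, window=WINDOW, max_difference=0.01):
    n = len(main)
    acc = [0.0] * n    # sum of the contributions of all windows covering a sample
    cover = [0] * n    # number of windows covering a sample
    held_gain = 1.0    # gain of the most recent 'similar enough' window
    for start in range(n - window + 1):
        m_w = main[start:start + window]
        s_w = safety[start:start + window]
        g = window_gain(m_w, s_w)
        if window_difference(m_w, s_w, g) <= max_difference:
            held_gain = g
            chosen = m_w                              # similar enough: keep main
        else:
            chosen = [held_gain * s for s in s_w]     # too different: use safety
        for k, v in enumerate(chosen):
            acc[start + k] += v
            cover[start + k] += 1
    # Averaging the overlapping windows gives a linear cross-fade of one
    # window length whenever the choice switches between main and safety.
    return [a / c if c else m for a, c, m in zip(acc, cover, main)]
```

The held gain is there because a gain estimated from a window that contains clipping is biased by the clipped samples, so the sketch reuses the gain from the last window that matched. If the very first window already contains heavy clipping, the held gain is still at its default of 1.0, which a real implementation would have to handle.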
All is working, except for the part where the main or safety window is added to the output. Still working on that...
Thanks for your interest

I'll keep you posted...