Hey man, my degree is in mathematics. The math to calculate the "drift factor" is incredibly simple and easy to derive once you know what you need to do. You need to think a little bit past first-grade math to see what the real problem here is.
Here's a practical situation.
Source 1 - 1 hr exactly @ 48kHz (a total of 172,800,000 samples)
Source 2 - also recorded at 48kHz, but let's say the first sample of each matches perfectly, and the last sample of source 1 matches up to the 172,801,000th sample of source 2. So they drift by exactly 1,000 samples in the hour (source 1's concept of an hour).
To resample source 2 to be exactly 172,800,000 samples as well, you'll need to use a factor of 172800000/172801000.
Here's the first lot of digits of that fraction:
0.99999421299645256682542346398458
From here it is a precision problem.
If you correct with a factor of 0.9999942, source 2 will end up with 172,799,998 samples total. After correction there is a 2-sample drift, which is definitely acceptable.
Using a factor of 0.999994, source 2 will end up with 172,799,963 samples total. Overshot the mark by 37 samples, but that's still pretty good.
Using a factor of 0.99999, source 2 will end up with 172,799,272 samples. Now you've overshot it by 728 samples, which is nearly as bad as before you started, not to mention you've thrown out all the original data in source 2 and interpolated completely new samples.
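If you want to check those numbers yourself, here is a minimal Python sketch. The function name `residual_drift` and the constants are just names I made up for illustration, and it assumes the resampler rounds the output length to the nearest whole sample, which a real tool may not do exactly.

```python
from fractions import Fraction

TARGET = 172_800_000   # samples in source 1 (1 hour at 48 kHz)
SOURCE2 = 172_801_000  # samples in source 2 covering the same span

def residual_drift(factor_text: str) -> int:
    """Samples of drift left after resampling source 2 by the given factor.

    The factor is passed as text so it is held exactly, with no float rounding.
    A negative result means source 2 ended up shorter than the target.
    """
    resampled = round(Fraction(factor_text) * SOURCE2)
    return resampled - TARGET

for text in ("0.9999942", "0.999994", "0.99999"):
    print(text, residual_drift(text))
```

Each dropped digit costs you roughly a factor of ten in accuracy, so the number of decimal places the editor accepts matters a lot over an hour-long recording.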
I don't remember how many digits CEP gives you to work with, but in my experience it is not enough for the level of precision needed if you're trying to fully sync long recordings.
This is why I recommend Sony Vegas:
a. It has "stretch event" functionality (Ctrl + click/drag the beginning or end of a track -- no need to even determine the number of samples by which the two sources drift).
b. It has non-destructive editing the whole way through; you only need to resample and render to a new file AFTER you've perfectly matched the two sources.
dmonkey, just cut off the ends while aligning your sources (anything you don't need to see), and then restore them after you're done stretching.
1. match up an event early in the recordings, and make a split here on both sources.
2. match up an event late in the recordings (yes, move one of the files in the timeline), and make a split here on both sources.
3. stretch one file to match the other
4. at the start and end of each recording, pull them back out to restore the segments that had been cut while aligning
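If you'd rather work out the stretch amount by hand from the same two matched events, the arithmetic looks roughly like this. This is only a sketch: `stretch_factor` and its parameter names are made up for illustration, and the positions are assumed to be sample offsets you read off the timeline yourself.

```python
from fractions import Fraction

def stretch_factor(early_ref: int, late_ref: int,
                   early_other: int, late_other: int) -> Fraction:
    """Factor by which to scale the other source so it matches the reference.

    early_* and late_* are the sample positions of the same two events
    in the reference source and in the source being stretched.
    """
    return Fraction(late_ref - early_ref, late_other - early_other)

# Numbers from the example above: both sources line up at sample 0,
# and the late event sits 1,000 samples further along in source 2.
factor = stretch_factor(0, 172_800_000, 0, 172_801_000)
print(factor, float(factor))
```

Keeping the factor as an exact fraction avoids losing digits before you ever type it into an editor.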
I do think simply dropping 1000 samples evenly spaced throughout the long source may also be a viable method, as long as the pitch difference between the two sources is small enough to be imperceptible. Has anyone tried this?
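For anyone who wants to try it, here is a rough Python sketch of the evenly spaced drop. The name `drop_evenly` is made up, it assumes the audio is already loaded as a plain sequence of samples, and it does no interpolation or crossfading around the removed samples, so treat it as an experiment rather than a proven method.

```python
def drop_evenly(samples, n_drop):
    """Return samples with n_drop of them removed at evenly spaced positions.

    Each removed sample sits at the middle of one of n_drop equal-sized
    stretches of the recording. Requires 0 < n_drop <= len(samples).
    """
    total = len(samples)
    # Integer arithmetic keeps the positions exact even for very long files.
    doomed = {((2 * k + 1) * total) // (2 * n_drop) for k in range(n_drop)}
    return [s for i, s in enumerate(samples) if i not in doomed]

# Tiny demonstration: drop 4 of 20 samples.
shortened = drop_evenly(list(range(20)), 4)
print(len(shortened), shortened)
```

For the example above you would call it with the 172,801,000 samples of source 2 and `n_drop=1000`, which removes one sample roughly every 172,801 samples.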