The underlying question is: what level of sync is good enough?
In some modes, timecode can sync clocks to frame-rate level (somewhere around 30 times per second), but not to sample-rate level (which at 48,000 times per second is roughly three orders of magnitude more precise). Agreed that a wordclock / sample-level degree of sync is ideally what we'd like to achieve, and is arguably what's necessary across the various microphones of a phase-correlated multichannel stereo array. But how close do we really need to be, and can we achieve that with timecode?
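To make that precision gap concrete, here's a quick back-of-the-envelope calculation in Python. The 30 fps and 48 kHz figures come straight from the numbers above; nothing else is assumed:

```python
# How coarse is frame-level sync compared to sample-level sync?
FRAME_RATE_HZ = 30        # timecode frames per second
SAMPLE_RATE_HZ = 48_000   # audio samples per second

frame_period_ms = 1000 / FRAME_RATE_HZ      # duration of one frame
sample_period_ms = 1000 / SAMPLE_RATE_HZ    # duration of one sample
samples_per_frame = SAMPLE_RATE_HZ / FRAME_RATE_HZ

print(f"One frame  = {frame_period_ms:.3f} ms")    # ~33.333 ms
print(f"One sample = {sample_period_ms:.4f} ms")   # ~0.0208 ms
print(f"Frame-level sync is ~{samples_per_frame:.0f}x coarser")  # ~1600x
```

So one timecode frame spans about 1,600 samples, which is why frame-level sync can't by itself deliver sample-accurate alignment.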
First off, we can accept a looser sync tolerance between sources that are not highly phase-correlated, such as AUD mics + SBD feed, or a pair of room mics placed separately from the main pair. Secondly, clock chips these days tend to be much more accurate than they used to be, running closer together and drifting less. In the timecode mode where sync is confirmed every 30th of a second or so, that's probably more than sufficient. And in the timecode mode where the clocks are only jam sync'd at the start, after which each runs freely, identical modern clock chips in identical modern recorders are more likely than they used to be to stay close enough over the course of one live performance set.
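Here's a rough sketch of the free-running case: worst-case drift between two recorders after a single jam sync. The ppm (parts-per-million) figures are illustrative assumptions, not specs for any particular recorder; check your unit's actual clock accuracy:

```python
# Worst-case relative drift between two free-running clocks after a jam sync.
def worst_case_drift_ms(clock_ppm: float, set_minutes: float) -> float:
    seconds = set_minutes * 60
    # Each clock may run fast or slow by up to clock_ppm parts per million;
    # worst case is the two drifting in opposite directions, hence the 2x.
    return 2 * clock_ppm * 1e-6 * seconds * 1000

# Assumed example specs: TCXO-grade (0.2 ppm) vs. a looser crystal (2 ppm).
for ppm in (0.2, 2.0):
    drift_ms = worst_case_drift_ms(ppm, set_minutes=90)
    drift_samples = drift_ms / 1000 * 48_000
    print(f"{ppm} ppm over a 90-minute set: "
          f"{drift_ms:.1f} ms (~{drift_samples:.0f} samples at 48 kHz)")
```

Under these assumptions, a 0.2 ppm pair stays within ~2 ms over a 90-minute set (well under one frame), while a 2 ppm pair could drift ~22 ms, which is audible as smear between phase-correlated sources but likely tolerable between, say, AUD and SBD.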
For those reasons I suspect frame-level timecode sync will probably be sufficient for most taper use, and an initial clock jam may be enough. But we need to try it to really see.
