A hub for two mikes wouldn't solve the problem of their having unsynchronized word clocks. You'd get one data stream from one mike and another, slightly longer or shorter data stream (in terms of the number of samples in any given span of time) from the other. So then you'd have to convert the sample rate of one or both streams so that they match.
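To put rough numbers on that, here is a minimal sketch in plain Python. The 48 kHz rate and the 50 ppm clock offset are just assumed figures for illustration, and the function names are made up for this example. The linear interpolation is only there to show the idea of stretching one stream to the other's length; a real sample rate converter would use proper band-limited filtering and would also have to cope with drift that changes over time.

```python
def extra_samples(nominal_rate_hz, seconds, drift_ppm):
    """Samples gained (or lost, if negative) by a clock that runs
    drift_ppm parts per million fast relative to the other one."""
    return round(nominal_rate_hz * seconds * drift_ppm / 1_000_000)


def resample_linear(stream, target_len):
    """Stretch or squeeze `stream` to exactly `target_len` samples.

    Crude linear interpolation, for illustration only; it keeps the
    first and last samples aligned and interpolates in between.
    """
    if target_len < 2 or len(stream) < 2:
        raise ValueError("need at least two samples on each side")
    step = (len(stream) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step                       # fractional position in the source
        k = min(int(pos), len(stream) - 2)   # left neighbour index
        frac = pos - k
        out.append(stream[k] * (1 - frac) + stream[k + 1] * frac)
    return out


if __name__ == "__main__":
    # Assumed figures: 48 kHz nominal, one minute, clocks 50 ppm apart.
    print(extra_samples(48_000, 60, 50))
    # Squeeze a short ramp from 5 samples down to 3.
    print(resample_linear([0.0, 1.0, 2.0, 3.0, 4.0], 3))
```

Even with that small an offset, the two streams end up more than a hundred samples apart after a minute, which is why you can't just line up the start points and leave it at that.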
That said, yeah, it makes a big difference whether you record mono or stereo, even for largely diffuse sound. The sound field isn't the same at any two points in space, and our brains are very talented at making something out of even very tiny differences between left and right "inputs" (ears). Those differences may not aid in the localization of sound sources, but they still make a (potentially) big difference to our subjective sense of the space that we're in, which is a huge part of the experience of hearing any live event.
Plus, the loudspeakers that pour the sound into the air in a venue are often quite sharply directional at the frequencies that matter (midrange, upper midrange). If they weren't, the result would be utter cacophony. So the effect of distance may be considerable, but it isn't the same as with direct acoustical sources that radiate more evenly in all directions.
--best regards