So long as the levels are set correctly, there is no difference to the amplifier whether its incoming analog signal is from a mic or a previously recorded source.
Which I think means that there would indeed be no major objection to my simple suggestion of using a mic-level pre-recorded source (e.g. a CD player into a Mackie mixer, with the mixer's mic-level output feeding the recorders under test) as a test bed. Of course you'd have the CD player's imperfections and the Mackie's imperfections being heard, but they'd be heard on both recordings. Must get around to that simple R-44 vs H2 test I promised several pages back!
The only time I did any kind of on-site comparison of kit was when I recorded one half of a classical chamber concert using a Sennheiser MS pair into a MOTU Traveler (whose preamps seem to be rated as quite good) into a laptop via FireWire, and in the second half I substituted a Phonic FireWire mixer whose preamps have no street cred at all - but personally I can't hear any great problem with them.
Listening afterwards, I couldn't pin down any significant difference between the two concert halves - in fact now I think I've forgotten which was which and certainly can't say that one is better than the other (and is therefore presumably the Traveler). That included listening to the quality of background noise between movements as well as the music and the applause. Of course they performed different works in each half, and so you can't compare like with like, but if there was a significant difference, you'd still hear it on such a test.
Of course it could be that there were differences and that I'm too deaf to hear them. But that particular test left me quite happy with the humble Phonic mixer and with no desire to purchase the Traveler instead.
Maybe that would be a reasonable and practical way of testing under real world conditions. Simply record one half with one set of kit, and the second with the comparison set, but use the same mics and don't move them. One could then listen to bits of each and see if there was anything that would enable one to say which bits came from which kit set, A or B. Then say whether you preferred A or B. But saying that A or B was the best (= the most accurate) would be another whole layer of complication, unless there was a gross inadequacy in one of them. How would you judge? One might sound very sexy but might not actually be true to the original experience through your ears.
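For anyone who wants to put a number on the "can you tell which bits came from which kit" part, here is a minimal sketch in Python. It assumes someone else (or a script) picks clips from set A or B at random and hides the labels, and that you write down a guess for each clip. The function names `make_trials` and `score` are made up for illustration. The p-value is just the chance of getting at least that many right by coin-flipping, so it says nothing about which set is more accurate.

```python
import random
from math import comb

def make_trials(n_trials, seed=None):
    """Pick a hidden label ('A' or 'B') at random for each clip to be played."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def score(truth, guesses):
    """Return (number correct, probability of doing at least that well by guessing)."""
    if len(truth) != len(guesses):
        raise ValueError("need exactly one guess per trial")
    n = len(truth)
    correct = sum(t == g for t, g in zip(truth, guesses))
    # One-sided binomial tail with a 50/50 chance per trial.
    p_value = sum(comb(n, k) for k in range(correct, n + 1)) / 2 ** n
    return correct, p_value

if __name__ == "__main__":
    truth = make_trials(10, seed=1)
    guesses = list("ABABABABAB")  # hypothetical listener answers
    correct, p = score(truth, guesses)
    print(f"{correct}/10 correct, p = {p:.3f}")
```

With only ten clips you would need nine or ten right before guessing becomes an unlikely explanation, so more trials are better if you have the patience.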
Which reminds me of another bit of testing I did - this involved some tests of a Sennheiser MKH series mic, a Naiant mic, an LSD2, and a Gefell studio mic. They were each put up in front of a pair of Genelec monitors in a pro recording studio control room, recording the same prerecorded piece onto a multitrack via a Grace 8-channel preamp, so you could play back switching rapidly between the different versions by hitting each solo button in turn. What I now recall was that the LSD2 sounded preferable to the others until one compared it with the original recording - the LSD2 added something nice, but not something accurate. The Sennheiser sounded rather dark even though it's what I use for all my recordings in a main pair - but actually quite accurate, as you would expect, when compared to the source. There was remarkably little to choose between the Naiant and the Gefell - I mean, there was a difference, but not much when you compared the price.
However, I'd still not base a purchasing decision on such a test unless I had no choice. For whatever reason, there's no substitute for a real sound in a real acoustic for evaluating mics. But for preamps and recorders, using a pre-recorded source fed into the items under test at mic level seems to me to be a reasonable method.
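Since the whole method hangs on the levels being set the same for each item under test, a rough sketch of how one might check that afterwards may be useful. It assumes the two recordings of the same source have already been loaded as lists of float samples with full scale at 1.0 (loading the files is left out), and the function names are hypothetical. A small level offset can be mistaken for a quality difference, so matching RMS level before listening seems a sensible precaution.

```python
import math

def rms_dbfs(samples):
    """RMS level of float samples (full scale = 1.0), in dB relative to full scale."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)

def gain_to_match(reference, other):
    """Gain in dB to apply to `other` so that its RMS level equals that of `reference`."""
    return rms_dbfs(reference) - rms_dbfs(other)

def apply_gain(samples, gain_db):
    """Scale samples by a gain given in dB."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]

if __name__ == "__main__":
    # Made-up stand-ins for two recordings of the same source.
    rec_a = [0.5, -0.5] * 100
    rec_b = [0.25, -0.25] * 100
    gain = gain_to_match(rec_a, rec_b)
    print(f"apply {gain:+.2f} dB to recording B")
    rec_b_matched = apply_gain(rec_b, gain)
```

This only matches overall level. It does not align the two recordings in time or say anything about noise or frequency response, which still have to be judged by ear.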