^^ I think we'll just have to agree to disagree, as we have a fundamental philosophical divide with respect to the tautness of the hypothesis and its relationship to the meaningfulness and interpretability of the results...
Also, the null hypothesis in this case yields exactly what you said: 'nobody would ever score higher (or lower) than 50% in a sufficiently large number of trials.' To clarify: suppose the correct answer is randomly A on half the trials and B on the other half, but there is no way anyone can distinguish between them. Then it doesn't matter how someone answers; their score still converges on 50% correct.
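Just to make that concrete, here's a toy simulation (the strategies and names are mine, purely for illustration): when the correct label is random and indistinguishable, every answering strategy settles at 50%.

    import random

    # If the correct label is assigned at random (A or B with equal
    # probability) and the listener cannot tell them apart, any
    # response strategy converges on 50% correct.
    def run_trials(n, strategy):
        correct = 0
        for i in range(n):
            truth = random.choice("AB")  # correct answer is random each trial
            if strategy(i) == truth:
                correct += 1
        return correct / n

    # Three arbitrary strategies: always answer A, alternate, guess randomly.
    for name, strat in [("always A", lambda i: "A"),
                        ("alternate", lambda i: "AB"[i % 2]),
                        ("random", lambda i: random.choice("AB"))]:
        print(name, run_trials(100_000, strat))  # each prints ~0.5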
I didn't state it very well for the case where the null is true, but even when the null holds, the intra-individual results are not truly independent (i.e., they are not coin flips). Even in simple tests like this, there is a wide range of subtle individual biases, plus others introduced by the experimental design. That non-independence is a factor in analyses like these and should be accounted for. This is generally difficult to do at the meta-analysis level (although if you had the individual trial results for all of the studies, you could handle it with a linear mixed effects model; see the sketch below), and it complicates the interpretation of the results.
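For what it's worth, here is a minimal sketch of what that mixed model might look like if you did have the per-trial data. The file name and column names (study, subject, correct) are made up for illustration, and statsmodels' mixedlm fits a linear (not logistic) mixed model, which is a rough but common approximation for proportions near 0.5:

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical per-trial data: one row per trial with columns
    # 'study', 'subject', and 'correct' (1 = right answer, 0 = wrong).
    df = pd.read_csv("abx_trials.csv")  # made-up file name

    # Random intercept per study, plus a variance component for
    # subjects nested within studies; these soak up the inter-study
    # and intra-individual correlation.
    model = smf.mixedlm(
        "correct ~ 1", df,
        groups=df["study"],
        re_formula="1",
        vc_formula={"subject": "0 + C(subject)"},
    )
    fit = model.fit()
    print(fit.summary())  # compare the fixed intercept against 0.5 (chance)

The point of the nesting is that each subject's trials share that subject's quirks, so they get their own variance term instead of being treated as independent coin flips.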
The editorial staff of the journal are listed at http://www.aes.org/journal/masthead.cfm . They have a much larger pool of reviewers that they pick from, and also use outside experts. I think they aim for a minimum of three reviews per paper. That said, it's always a struggle (as is the case for many journals) to maintain a talented and diverse pool of reviewers, and it's hard to find just the right outside experts. I'm sure they would welcome more potential reviewers.
Interesting. Thanks. At least in theory, it works a little differently in my field, in that anyone with relevant experience is a potential reviewer. In reality, though, editors usually have some "go-to" people.