Guesswork here, but I would expect the mid mic to go in the first channel, the side mic into the second, and the pans to be set to the centre. Presumably that will result in the mid and side mics being recorded "as is" for later processing in your DAW with something like the Voxengo MSED VST plugin, while in monitoring you would hear the 'decoded' sound.
I agree, the mid mic is likely channel 1 and the side mic is channel 2. This way, when it does the decode, the left and right will match what you're pointing at. (positive impulse = left)
As for the panning, I don't have a great answer. I'd be tempted to hard-pan them to keep them separate. If you pan both to the center before the decode, the sum (mid + side) should give you the left channel only; if you do it after the decode, left and right sum together and you end up with just the mid.
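To make that arithmetic concrete, here is a rough sketch of the usual sum-and-difference decode (left = mid + side, right = mid - side). I'm assuming no gain scaling and that the positive side lobe faces left; your recorder or plugin may scale the result differently. The function name and the sample values are made up for illustration.

```python
def ms_decode(mid, side):
    """Sum-and-difference decode: left = M + S, right = M - S (no gain scaling assumed)."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


# Hypothetical sample values, chosen only to illustrate the arithmetic
mid = [0.5, 0.25, -0.5]
side = [0.25, -0.25, 0.125]

left, right = ms_decode(mid, side)

# Both raw channels panned to the center BEFORE the decoder: plain sum of M and S
pre_decode_mono = [m + s for m, s in zip(mid, side)]

# Decoded left and right panned to the center AFTER the decoder
post_decode_mono = [l + r for l, r in zip(left, right)]

print(pre_decode_mono == left)                     # same as the left channel
print(post_decode_mono == [2 * m for m in mid])    # side cancels, only mid is left
```

So center-panning before the decode throws away the right channel, and center-panning after it throws away the side signal, which is why keeping the two channels hard-panned (or on discrete tracks) looks like the safer bet.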
When in doubt, you could always record discrete tracks and decode in post-production, like Peter mentions. Best of luck.