One fundamental thing is that what you describe will act more like a two-channel stereo microphone arrangement than a multichannel one.
In terms of microphone array geometry, positioning each cardioid so that it is as coincident as possible with the omni on its side (and recording all four channels separately) essentially creates a two-channel stereo microphone configuration, but one that lets you derive any pattern between omni and cardioid afterward, based on how much cardioid versus how much omni is used in the mix. 100% omni with no cardioid = omni, 100% cardioid with no omni = cardioid, and an equal measure of both = subcardioid.
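To put rough numbers on that morphing, here is a minimal sketch that assumes textbook-ideal capsules: a perfectly flat omni (sensitivity 1 in every direction) and a perfect cardioid (0.5 + 0.5·cos θ), perfectly coincident and in phase. The function names `mixed_pattern` and `rear_rejection_db` are just made up for illustration, and real microphones will deviate from this, especially at the frequency extremes.

```python
import math

def mixed_pattern(theta_rad, omni_gain, cardioid_gain):
    """Sensitivity of an idealized omni + cardioid mix at angle theta.

    Assumes perfect coincidence and textbook patterns (an assumption,
    not a measurement). The result is normalized so on-axis = 1.
    """
    total = omni_gain + cardioid_gain
    if total <= 0:
        raise ValueError("at least one gain must be positive")
    omni_fraction = omni_gain / total
    omni = 1.0                                    # same in every direction
    cardioid = 0.5 * (1.0 + math.cos(theta_rad))  # 1 in front, 0 behind
    return omni_fraction * omni + (1.0 - omni_fraction) * cardioid

def rear_rejection_db(omni_gain, cardioid_gain):
    """How far down (in dB) the rear pickup is relative to the front."""
    front = mixed_pattern(0.0, omni_gain, cardioid_gain)
    rear = mixed_pattern(math.pi, omni_gain, cardioid_gain)
    if abs(rear) < 1e-12:
        return float("inf")  # ideal cardioid: complete null at the rear
    return 20.0 * math.log10(front / rear)

if __name__ == "__main__":
    for omni_gain, cardioid_gain in [(1, 0), (1, 1), (0, 1)]:
        print(omni_gain, cardioid_gain,
              round(rear_rejection_db(omni_gain, cardioid_gain), 2))
```

With an equal mix the idealized pattern works out to 0.75 + 0.25·cos θ, which is about 6 dB down at the rear, i.e. in subcardioid territory. All omni gives 0 dB of rear rejection and all cardioid gives a full rear null.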
How well that actually works will depend on how closely coincident the diaphragms of the two microphones on each side are, and on how closely their phase and frequency responses match. In a real-world situation, a less than perfectly coincident alignment (inevitable to some degree) will produce some phase interaction at the very short wavelengths that correspond to the (small) offset distance between the diaphragms. That is likely to be heard as either a somewhat dull or a more shimmery top end when both pairs are used together at a near-equal mix ratio. On top of that, inherent phase and response differences between the cardioid and the omni will alter the shape of the resulting pattern somewhat in the range where they differ, and will also shift the tonal response somewhat as the ratio between the two pairs is adjusted.
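To get a feel for where that interaction lands, here is a rough sketch. It assumes the worst case: sound arriving along the line joining the two diaphragms, both microphones mixed at equal level, otherwise identical responses, and a speed of sound of 343 m/s. The function names are hypothetical and the 2 cm offset is only an example figure, not something taken from your setup.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for roughly room temperature

def first_null_hz(offset_m):
    """Lowest frequency at which the path difference is half a wavelength."""
    return SPEED_OF_SOUND_M_S / (2.0 * offset_m)

def summed_level_db(freq_hz, offset_m):
    """Level of the equal-gain sum relative to a perfectly coincident sum.

    For two identical signals with a relative delay tau, the sum has
    magnitude |1 + exp(-j*2*pi*f*tau)| / 2 = |cos(pi * f * tau)|.
    """
    delay_s = offset_m / SPEED_OF_SOUND_M_S
    magnitude = abs(math.cos(math.pi * freq_hz * delay_s))
    if magnitude < 1e-12:
        return float("-inf")  # complete cancellation
    return 20.0 * math.log10(magnitude)

if __name__ == "__main__":
    offset = 0.02  # 2 cm between diaphragms, an example figure only
    print("first null near", round(first_null_hz(offset)), "Hz")
    for freq in (1000, 4000, 8000):
        print(freq, "Hz:", round(summed_level_db(freq, offset), 2), "dB")
```

Under those assumptions a 2 cm offset puts the first cancellation around 8.6 kHz, with very little loss at 1 kHz. Sound arriving from other directions sees a smaller effective offset, so the dip moves higher, and unequal mix ratios make it shallower. That is part of why it tends to come across as a general dulling or shimmer rather than one obvious notch.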
That first part is all about how close together the two microphones on each side can be positioned. As for the second part, just be aware that there will be some change in frequency response, as well as some minor frequency-dependent pattern alteration, as the ratio between the two is adjusted in the mix. To reduce that influence you could EQ the omni and cardioid pairs so that they have a very similar response before combining them.
Because the two microphones being mixed together on each side are pointed the same way [edit- and mostly even if not, since one is an omni], the angle of the virtual pickup pattern produced by the mix will not differ from the direction in which the cardioid was pointed. You just morph between omni and cardioid patterns while retaining the original angle of the cardioid. [more edit- if you find the mix of the two sounds a bit more "dull" in the high frequency region than either alone, that may be due to some destructive cancellation at high frequencies from the microphone capsules being close yet not perfectly coincident. In that case it may help either to cut the highs in one and boost them in the other to compensate, so that the two interfere much less up there, or to point the two microphones differently from each other while keeping them as close to each other as possible]
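A quick way to convince yourself that the pattern's aim does not move is to scan the idealized mix over all directions and see where it peaks. This sketch assumes a textbook cardioid, a perfectly non-directional omni, and perfect coincidence. `mix_response` and `peak_direction_deg` are names invented for this example, and the 45 degree aim is arbitrary.

```python
import math

def mix_response(source_deg, aim_deg, omni_fraction):
    """Idealized omni + cardioid mix for a source at source_deg.

    aim_deg is where the cardioid points; omni_fraction runs from
    0.0 (all cardioid) to 1.0 (all omni).
    """
    off_axis = math.radians(source_deg - aim_deg)
    cardioid = 0.5 * (1.0 + math.cos(off_axis))
    return omni_fraction + (1.0 - omni_fraction) * cardioid

def peak_direction_deg(aim_deg, omni_fraction, step_deg=1):
    """Direction (scanned in whole degrees) where the mix is most sensitive."""
    angles = range(-180, 180, step_deg)
    return max(angles, key=lambda a: mix_response(a, aim_deg, omni_fraction))

if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75):
        print(fraction, peak_direction_deg(45, fraction))
```

For every ratio short of 100% omni, the peak stays at the cardioid's own aim. At 100% omni the idealized response is the same in every direction, so there is no single peak to speak of.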
As always, the proof of the pudding is in the eating. You really need to try it and listen to decide if it's achieving what you want. This approach attempts to minimize potential problems by placing the additional microphones as close as possible to the first pair, while the opposite approach of spacing them differently attempts to minimize potential problems by getting the additional microphones sufficiently far apart from the first pair (but for other reasons not overly far). The decision of which is better will come down to whether the sound of the two differently spaced pairs interacting with each other appeals to your sense of what the organ should sound like in the recording. Forget about time of arrival and all the other theoretical stuff when listening back and deciding which approach sounds most right.