^^^^ this. "X/Y" refers only to the coincident placement of two directional microphones; it doesn't specify which directional pattern or what angle is between their main axes. Of course in practice, some patterns and angles make much more sense than others. Coincident wide cardioids angled 10º apart won't get you much of a stereo image no matter where they're placed!
For the center of a stereo image to be in proportion to its sides, one basic principle is to angle the microphones apart so that their pickup patterns overlap right where each one has 1/2 its on-axis sensitivity (i.e. where each is 6 dB down). Then you'd aim the pair of mikes toward the center of the sound source(s) symmetrically, i.e. with the point of overlap in their patterns--their mutual axis as a stereo pair--facing exactly forward. If you work that out geometrically, however, using the cardioid formula ½(1 + cos θ), it turns out that cardioid is the least directive pattern that can achieve that at all. And to get a pair of cardioid patterns to overlap at their half-sensitivity points, you'd need to set the two mikes up back-to-back (180º), which is problematic in other ways (especially if the microphones have narrower patterns at high frequencies).
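If you'd like to check that geometry yourself, here is a minimal Python sketch. It assumes idealized first-order patterns of the form a + (1 - a)·cos θ (a = 1 for omni, 0.5 for cardioid, 0 for figure-8). The function name, the `level` parameter and the 0.63 value used for "wide cardioid" are my own choices for illustration, not any standard.

```python
import math

def included_angle_deg(a, level=0.5):
    """Angle between the two mikes' main axes (degrees) at which their
    patterns cross at `level` times on-axis sensitivity.

    Assumes an ideal first-order pattern s(theta) = a + (1 - a)*cos(theta):
    a = 1 omni, a = 0.5 cardioid, a = 0 figure-8.
    Returns None if no angle of 180 degrees or less can do it.
    """
    if a >= 1.0:
        return None  # an omni never drops below full sensitivity
    # Solve a + (1 - a)*cos(theta) = level for the off-axis angle theta.
    cos_theta = (level - a) / (1.0 - a)
    if cos_theta < 0.0 or cos_theta > 1.0:
        return None  # no crossing point within 90 degrees of each mike's axis
    theta = math.degrees(math.acos(cos_theta))
    return 2.0 * theta  # each mike is turned theta away from straight ahead

for name, a in [("wide cardioid (a = 0.63, illustrative)", 0.63),
                ("cardioid", 0.5),
                ("hypercardioid", 0.25),
                ("figure-8", 0.0)]:
    print(name, included_angle_deg(a))
```

With the default `level=0.5` this gives 180º for cardioid, about 141º for hypercardioid, about 120º for figure-8, and no workable angle at all for anything wider than cardioid. Passing `level=math.sqrt(0.5)` instead applies the same calculation to a 3 dB-down crossing point.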
90º between a pair of cardioids is very weak sauce, however, unless you're recording an event where your mikes are surrounded by what you're trying to record. And in fact I think that's where the old 90º "cookbook recipe" for cardioids comes from: The large majority of "one-point stereo microphones" are sold for recording business meetings, not musical or stage performances, by someone who's at the meeting and who places the mike in the middle of the group or at their own desk or whatever. When you want to record a musical or other stage performance, the microphone is usually at some distance from all the sound sources, and the goal is to discriminate left vs. right among a relatively narrow angle of sound sources as "seen" from that distance. To some extent you're recording the entire space that the events occur in--but only a limited range of angles is "in front". That range of angles needs special attention in recording because of the way two-channel stereo generally gets played back in people's homes.
From a 1970s perspective, when consumer recording equipment started to be mass-produced in Japan and sold in the U.S. and western Europe, if you went to your local Radio Shack or your local Sony or Panasonic dealer, most of what you'd see among mass-produced merchandise was designed for the "business meeting"-type application, including stereo microphones with two cardioids angled at 90º. (Recorders with built-in microphones are still made and sold with this configuration.) But that application has different miking requirements from what we here mostly do. That goes for frequency response, too: flat low-frequency response is undesirable for speech pickup, while a peaked or otherwise elevated response in the upper midrange and higher can be quite useful for speech intelligibility but is tiresome for music recording. The icing on the cake, back then when most people had never used microphones or recorders before, was that the brighter something sounded, the more "high fidelity" most people thought it was--so the Japanese manufacturers especially competed on that basis for a long while.
The "first-order" directional patterns can be arranged in a spectrum from omni to bidirectional (figure-8), based on their physical principle of operation. On that spectrum cardioid is right in the middle--it's what you'd get if you mixed the output from a perfect omni with a perfect figure-8 50/50 if both mikes had exactly identical sensitivity. Many switchable-pattern microphones have exactly that selection of three patterns (e.g. Neumann U 67 and U 87), and in some types historically there has been an omni capsule and a figure-8 capsule, with the cardioid pattern synthesized by combining the two capsules' outputs (e.g. Schoeps M 201).
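The 50/50 mix is easy to verify numerically. A small sketch, assuming ideal capsules with identical on-axis sensitivity (the function names are just for illustration):

```python
import math

def omni(theta_deg):
    return 1.0  # equal sensitivity in every direction

def figure8(theta_deg):
    # Negative values mean the rear lobe has inverted polarity.
    return math.cos(math.radians(theta_deg))

def mix_50_50(theta_deg):
    return 0.5 * omni(theta_deg) + 0.5 * figure8(theta_deg)

for angle in (0, 90, 180):
    print(angle, round(mix_50_50(angle), 3))
```

That prints full sensitivity (1.0) on axis, half (0.5) at 90º and a null (0.0) at 180º, which is the cardioid pattern.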
You can chop that pattern spectrum up further, too: "wide cardioid" would fall somewhere between omni and cardioid (there's no specific technical formula); supercardioid would fall between cardioid and figure-8 but in that case there is a specific trigonometric formula (see chart below), and hypercardioid would fall between supercardioid and figure-8 (again, there's a specific trigonometric definition for it). Accordingly, some multi-pattern microphones (e.g. Neumann TLM 170 and U 89) offer five selectable patterns: omni, "wide cardioid", cardioid, super- or hypercardioid (most often something between the two), and figure-8. Others go further and offer nine gradations, or even a fully continuous range of patterns. It's all just different mixtures, although the classic Neumann approach (developed by Braunmühl and Weber ca. 1935) uses two back-to-back cardioids rather than an omni and a figure-8; with a little math it can be shown that the outcome is the same.
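The "little math" can be sketched in a few lines of Python. This assumes ideal capsules. The mixing factor `k` (how much of the rear-facing cardioid is added to or subtracted from the front one) is my own notation, and the supercardioid and hypercardioid coefficients are the usual textbook values, not anything specific to a particular microphone.

```python
import math

def first_order(theta_deg, a):
    """Omni/figure-8 mixture: a parts omni, (1 - a) parts figure-8."""
    return a + (1.0 - a) * math.cos(math.radians(theta_deg))

def dual_cardioid(theta_deg, k):
    """Front-facing cardioid plus k times a rear-facing one (-1 <= k <= 1)."""
    c = math.cos(math.radians(theta_deg))
    front = 0.5 * (1.0 + c)
    back = 0.5 * (1.0 - c)
    return front + k * back

def equivalent_a(k):
    """Omni share of the omni/figure-8 mixture that matches a given k."""
    return (1.0 + k) / 2.0

# k values for the classic patterns (textbook coefficients, idealized).
PATTERNS = {
    "omni": 1.0,                          # a = 1
    "cardioid": 0.0,                      # a = 0.5
    "supercardioid": math.sqrt(3) - 2.0,  # a = (sqrt(3) - 1) / 2, about 0.37
    "hypercardioid": -0.5,                # a = 0.25
    "figure-8": -1.0,                     # a = 0
}

def max_difference(k):
    """Largest gap between the two approaches over a full circle."""
    a = equivalent_a(k)
    return max(abs(dual_cardioid(t, k) - first_order(t, a))
               for t in range(0, 360, 5))

for name, k in PATTERNS.items():
    print(name, round(equivalent_a(k), 3), max_difference(k) < 1e-12)
```

Adding the two cardioids equally gives an omni, subtracting them gives a figure-8, and everything in between lands on the same curve as the corresponding omni/figure-8 mixture.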
There's a professor named Michael Williams, a very dear fellow who has done all the math and tested it all in practice, who has published charts and graphs showing what you get with all sorts of patterns and angles and (violating the X/Y norm, but quite useful for sonic reasons) distances between microphones. Helmut Wittek, who these days is the co-CEO of Schoeps microphones, also has a Web site with a nifty calculator for such things, on
https://hauptmikrofon.de/stereo-surround/image-assistant .
--best regards
P.S. added later: Despite all the math, these formulas aren't the sole determinants of all outcomes. For one thing, microphones don't generally have identical directional patterns across the entire audio frequency spectrum (a phenomenon that's not random; it's a whole other discussion, though). The listening/playback setup is an enormous factor as well. Headphones vs. loudspeakers are obviously different realms. The angles between a listener and a pair of loudspeakers make a huge difference, too.