I hope this doesn't come as a blow to anyone's ego, but the data format used in 32-bit audio recording was defined by the IEEE back in the 1980s (as IEEE 754, finalized in 1985) for other purposes, before it was common for digitized audio to be processed on computers. There were no readily available hardware audio interfaces for that purpose, certainly not at the consumer level, and personal computers were scarcely powerful enough to be worth using for audio processing anyway. Who wants to wait minutes just to hear the effect of a simple editing decision? Plus hard disk space back then was limited and very expensive; in the mid-80s a hard drive with enough space to hold the contents of one audio CD would have cost something like $2,000 in today's dollars.
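For the curious, the format in question is IEEE 754 single precision (binary32): one sign bit, an 8-bit biased exponent, and a 23-bit fraction. A minimal sketch in Python of pulling those fields apart (the function name `decode_binary32` is my own, not from any standard library):

```python
import struct

def decode_binary32(x: float) -> tuple[int, int, int]:
    """Split an IEEE 754 binary32 value into its sign, exponent, and fraction fields."""
    # Round-trip through the 4-byte big-endian float encoding to get the raw bits.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31                # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    fraction = bits & 0x7FFFFF       # 23 bits, with an implicit leading 1 for normal numbers
    return sign, exponent, fraction

# A full-scale audio sample of 1.0 is 1.0 x 2^0: sign 0, biased exponent 127, fraction 0.
print(decode_binary32(1.0))   # (0, 127, 0)
print(decode_binary32(-0.5))  # (1, 126, 0)
```

The practical upshot for audio is that the exponent gives the format enormous headroom: unlike 16- or 24-bit integer samples, intermediate mix values can exceed "full scale" without clipping and be brought back down later.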
But over time the IEEE floating-point formats (plural: the one used in audio is one of a set) became an ISO standard as well, and Intel and other CPU manufacturers supported them in the form of native instructions of a peculiar sort: as you ran a program, any such instructions it contained would be farmed out for execution to a physically separate "numeric coprocessor" that your main CPU could talk to, provided you had paid the extra bucks for one. The Intel 80286 CPU in your IBM PC AT would talk to an (optional) 80287 numeric coprocessor; an i387 coprocessor was available for the i386. The story with the 486 is murkier and beside the (fixed or floating) point. If, however, like most customers other than science labs, you didn't splurge on the coprocessor and wanted to run software that relied on having one, those instructions could be emulated in library routines. Those were of course much, much slower than dedicated hardware. On the other hand, coordinating the CPU with the coprocessor also carried a certain amount of overhead. So ever since the Pentium hit the fan, floating-point capability has been integrated into every CPU you would find in a PC of any kind.
If a DAW operator is mixing many stems, trying out different EQ, reverb, level, and panning settings on each (some of which may be changing dynamically under software control), they don't want to have to wait minutes or hours for each proposed adjustment to "render" before they can hear what it will sound like. But they also want to leave the original stems unchanged, so that they can revert any changes they've made or revisit a decision later on. The more operations that can be done in real time, the more flexible the software becomes, so on a dedicated workstation this can demand very large amounts of processing with multiple CPUs running in parallel. It makes anyone who remembers the original IBM PC with its 4.77 MHz 8088 (not even an 8086!) want to weep.
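The non-destructive part of that workflow boils down to a simple idea: the stems are read-only inputs, and only the output buffer is recomputed when a setting changes. A toy sketch of that, assuming mono float stems summed to a stereo bus with a constant-power pan law (the `mix` function and its parameters are hypothetical, not from any real DAW's API):

```python
import math

def mix(stems, gains, pans, n_frames):
    """Sum mono float stems into a stereo bus; stems themselves are never modified."""
    left = [0.0] * n_frames
    right = [0.0] * n_frames
    for samples, gain, pan in zip(stems, gains, pans):
        # pan in [-1, 1]; constant-power pan law keeps perceived loudness steady.
        theta = (pan + 1.0) * math.pi / 4.0
        gl, gr = gain * math.cos(theta), gain * math.sin(theta)
        for i in range(n_frames):
            left[i] += gl * samples[i]
            right[i] += gr * samples[i]
    return left, right

# Tweaking a gain or pan just reruns the mix; the original stems are untouched.
stems = [[1.0], [1.0]]
left, right = mix(stems, gains=[0.5, 0.5], pans=[-1.0, 1.0], n_frames=1)
```

A real DAW does this per block of a few hundred samples, in 32-bit float, across dozens of plugin chains, which is exactly why the per-core floating-point throughput matters so much.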