I'm confused by this discussion. My understanding of word length (bit depth) is that it's a measure of resolution, in other words how many bits of data are used to represent the amplitude of each sample. More is always better in the case of bit depth.
It's generally accepted that a live symphony performance has a useful dynamic range of 50 dB. Each bit of PCM is worth about 6 dB, which gives a theoretical dynamic range of 96 dB at 16-bit resolution and 144 dB at 24-bit. On the face of it, even 16-bit would seem unnecessary, let alone 24-bit, to capture the entire range of what is being presented.
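For anyone who wants to check the 6 dB per bit arithmetic, here is a minimal Python sketch. The function name is my own, and it assumes ideal linear PCM with no dither or noise shaping.

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of ideal linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)

# Each extra bit doubles the number of levels, which adds about 6.02 dB.
for bits in (16, 24):
    print(f"{bits}-bit: {pcm_dynamic_range_db(bits):.1f} dB")
```

The exact figures come out slightly above the rounded 96 dB and 144 dB because one bit is worth about 6.02 dB rather than exactly 6.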
What is often missed in these discussions is that when music with 50 dB of dynamic range is recorded using 16-bit PCM with peaks at 0 dB, any notes at -50 dB will be represented by only about 8 bits, whereas the same music recorded using 24-bit PCM with peaks at 0 dB will use about 16 bits to represent those same notes at -50 dB. Analog tape, regardless of type, has musical information that decays into what is called "tape hiss". Dynamic range is defined as the range from the point where the musical information is replaced by noise up to the point where harmonic distortion sets in. That range is more rigid for PCM, since there is no buffer at the top from tape saturation, and when the noise floor is reached, the noise totally replaces the musical information. When recording what is on an analog tape, word length determines how much detail can be detected at the bottom of the tape's dynamic range during its much longer transition into noise, which is why that transition is going to be more faithfully captured with 24-bit PCM.
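The "8 bits versus 16 bits" figures at -50 dB can be worked out the same way. This is only a rough sketch: the helper name is hypothetical, and it simply treats every 6.02 dB below full scale as one bit no longer being exercised.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def bits_at_level(word_length: int, level_db: float) -> float:
    """Approximate bits still exercised by a signal peaking at level_db (0 or below)."""
    return word_length + level_db / DB_PER_BIT

# A note 50 dB below full scale, captured at each word length.
for word_length in (16, 24):
    print(f"{word_length}-bit PCM at -50 dB: about {bits_at_level(word_length, -50):.1f} bits")
```

The results land a little under 8 and 16 bits, which is where the rounded numbers above come from.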
The bottom line is that if we accept that a cassette tape has only 50 dB of dynamic range, a digital recording of that tape peaking at 0 dB using 16-bit PCM will result in a resolution with an average word length of 12 bits. A digital recording of that same tape peaking at 0 dB using 24-bit PCM will result in a resolution with an average word length of 20 bits. That is a significant difference in the number of bits per sample, and with a high quality playback system, there are plenty of adults who will be able to hear that difference in terms of detail. If you want to hear a difference in resolution between bit depths in PCM for yourself, it's easy. Make a 24-bit recording with peaks at 0 dB, then record the same material again with the gain dropped so that the maximum peaks are only -24 dB, and add back 24 dB of gain in post. Since 24 dB is about 4 bits, you will have an example of the difference in resolution between 20-bit PCM and 24-bit PCM and can hear it for yourself.
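If you'd rather see numbers before trying the listening test, the same experiment can be roughed out in Python. This is a sketch under my own assumptions: a 997 Hz test tone at 48 kHz, plain rounding with no dither, and helper names that are made up for the example. It measures quantization noise only and says nothing about what is audible.

```python
import math

def quantize(x: float, bits: int) -> float:
    """Round x (nominally -1..1) to the nearest step of a signed PCM word."""
    steps = 2 ** (bits - 1)
    code = max(-steps, min(steps - 1, round(x * steps)))  # clamp to the legal range
    return code / steps

def snr_db(clean, noisy) -> float:
    """Signal-to-noise ratio in dB, treating (clean - noisy) as the noise."""
    signal = sum(s * s for s in clean)
    noise = sum((s - n) ** 2 for s, n in zip(clean, noisy))
    return 10 * math.log10(signal / noise)

gain = 10 ** (-24 / 20)  # -24 dB as a linear factor
tone = [math.sin(2 * math.pi * 997 * n / 48000) for n in range(48000)]

# Recorded at full level, versus recorded 24 dB down and boosted back in post.
full_level = [quantize(s, 24) for s in tone]
low_level = [quantize(s * gain, 24) / gain for s in tone]

loss_db = snr_db(tone, full_level) - snr_db(tone, low_level)
print(f"SNR lost: {loss_db:.1f} dB, about {loss_db / 6.02:.1f} bits")
```

The loss should come out near 24 dB, or roughly 4 bits, which is the 20-bit versus 24-bit comparison described above.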
What is difficult to hear, and I'd say almost impossible for an adult as they age, is the difference between sampling at 96 kHz and sampling at 48 kHz. Between the two, bit depth and sampling frequency, bit depth is more important, IMO, but that is a whole other can of worms I don't intend to open here.
Hope this helps.