Author Topic: 24Bit dither to 16bit vs. recording just 16bit (Read 9779 times)

bluntforcetrauma · « **on:** April 22, 2006, 09:59:57 PM »

Hello,

If I record in 24 bit and then dither down to 16 bit, will the sound be better than just recording in 16 bit?

Do the dithered 24 bit files have richer sound quality?

thanks

tfs8271 · « **Reply #1 on:** April 22, 2006, 10:49:08 PM »

I think I'm right on this.

No.
No (if you're talking about after the dither down).

But you have a 24 bit recording which sounds sweet on the computer and in DVD-A format.
Your 16 bit file dithered from 24 bit will sound the same to the ear as a straight 16 bit recording.

Correct me if I'm wrong.

nihilistic0 · « **Reply #2 on:** April 23, 2006, 08:25:18 PM »

Something that is recorded in 24 bit and dithered down to 16 bit will sound better, in theory, than a recording done simply in 16 bit.

If you just wanna record 16 bit, it's good to crank up the levels as high as you can without clipping to get the most dynamic range from it, which is not really necessary in a 24 bit recording.

SparkE! · « **Reply #3 on:** April 24, 2006, 03:10:22 PM »

Quote from: nihilistic0 on April 23, 2006, 08:25:18 PM

Something that is recorded in 24 bit and dithered down to 16 bit will sound better, in theory, than a recording done simply in 16 bit.

True, but only if your 24 bit recording was done with levels that are hotter than -48 dB with respect to full scale AND you have a perfect 24 bit recorder. The fact is most 24 bit recorders only get you to 110 dB S/N, which is only about 18 dB better than your typical 16 bit recorder that does 92 dB S/N. So, in the real world, your levels on the original 24 bit recording need to be within 18 dB of full scale for your 24 bit recording to dither down to a 16 bit recording with a full 92 dB S/N. (96 dB S/N is the theoretical max for a 16 bit recorder, but few 16 bit recorders do better than 92 dB.)
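The numbers in this reply come from the rule of thumb of roughly 6 dB per bit. A minimal Python sketch of that arithmetic follows; the 110 dB and 92 dB recorder figures are the ones quoted in the post, and the function name is only illustrative.

```python
import math

def ideal_range_db(bits):
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)

range_16 = ideal_range_db(16)   # about 96.3 dB
range_24 = ideal_range_db(24)   # about 144.5 dB

# Headroom an ideal 24-bit recorder has over an ideal 16-bit one: 8 extra bits.
ideal_headroom = range_24 - range_16      # about 48.2 dB

# Headroom using the real-world S/N figures quoted in the post.
real_headroom = 110 - 92                  # 18 dB

print(round(range_16, 1), round(range_24, 1), round(ideal_headroom, 1), real_headroom)
```

The gap between the ideal 48 dB and the practical 18 dB is why the recording level still matters on a real 24 bit box.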

Chanher · « **Reply #4 on:** April 24, 2006, 04:02:16 PM »

aren't most (modern) a/d's 24-bit processors? whether the unit outputs a 16-bit or 24-bit signal, I thought that most of them use 24-bit a/d chips (or whatever they're called), so if that's true, then there is word-length reduction being performed no matter what. We know that the v3 and mini-me apply internal dither schemes that are known to sound very nice, but we don't know whether other boxes dither or truncate. can anyone correct me?

SparkE! · « **Reply #5 on:** April 24, 2006, 05:47:30 PM »

Quote from: Chanher on April 24, 2006, 04:02:16 PM

aren't most (modern) a/d's 24-bit processors? whether the unit outputs a 16-bit or 24-bit signal, I thought that most of them use 24-bit a/d chips (or whatever they're called), so if that's true, then there is word-length reduction being performed no matter what. We know that the v3 and mini-me apply internal dither schemes that are known to sound very nice, but we don't know whether other boxes dither or truncate. can anyone correct me?

When a 24 bit box outputs 16 bits per sample, it's acting like a 16 bit source and you can't get any better than 96 dB S/N, assuming that you have set the input level to cover the whole dynamic range of your A/D. If your input level is 10 dB too low, you can't do any better than 86 dB S/N. However, if you record a 24 bit recording at 10 dB below full scale, your 110 dB S/N (typical of most 24 bit boxes) is only reduced to 100 dB. So, if you normalize the 24 bit recording, then dither to 16 bits, you can get a 16 bit recording whose S/N is nearly 96 dB, which is better than you can normally do with a 16 bit recorder.
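The level arithmetic can be checked with a short Python sketch. It assumes the two noise floors (the recorder's own noise and the new 16 bit quantization noise) are uncorrelated, so their powers add; that assumption and the helper name are mine, not the poster's.

```python
import math

def combine_noise_db(*floors_db):
    """Power-sum uncorrelated noise floors given in dB relative to the signal."""
    total = sum(10 ** (f / 10) for f in floors_db)
    return 10 * math.log10(total)

level_loss = 10                      # recording level is 10 dB below full scale

snr_16_direct = 96 - level_loss      # ideal 16-bit output: 86 dB
snr_24_raw = 110 - level_loss        # typical 24-bit box: 100 dB

# After normalizing the 24-bit file, reducing it to 16 bits adds a new noise
# floor near -96 dB. Treat the two floors as uncorrelated and add their powers.
snr_24_to_16 = -combine_noise_db(-snr_24_raw, -96)   # about 94.5 dB

print(snr_16_direct, snr_24_raw, round(snr_24_to_16, 1))
```

Under these assumptions the normalized and reduced file lands a little under 96 dB, still well above the 86 dB of the direct 16 bit recording at the same level.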

I'm not sure if the 16 bit output of a 24 bit recorder is normally dithered, whether it's truncated or whether each 24 bit sample is rounded to the nearest 16 bit quantity. I suspect, though, that most recorders simply truncate 24 bit samples to 16 bits unless the box also gives you control over the method used for dithering. My personal preference would be for them to round a 24 bit sample to the nearest 16 bit quantity because that would result in the least degradation to S/N, but so many people believe in dithering that it seems to be the most commonly accepted way of doing things. To me it doesn't make sense to intentionally add high frequency noise to a perfectly good recording, but that seems to be what most people want to do. (That's essentially what dithering does.)

mhibbs · « **Reply #6 on:** April 26, 2006, 05:46:42 PM »

Quote from: SparkE! on April 24, 2006, 05:47:30 PM

My personal preference would be for them to round a 24 bit sample to the nearest 16 bit quantity because that would result in the least degradation to S/N, but so many people believe in dithering that it seems to be the most commonly accepted way of doing things. To me it doesn't make sense to intentionally add high frequency noise to a perfectly good recording, but that seems to be what most people want to do. (That's essentially what dithering does.)

Well, the idea is that the noise is added in a range less sensitive to human hearing (or less audible in simple terms). So in theory, you gain resolution in the more audible ranges by dithering a 24bit signal down to 16bit. Which is why the old Apogee 16bit units that everyone loved used 18bit (ad500e) and 20bit (ad1000) chips and dithered to 16bit using uv22.

SparkE! · « **Reply #7 on:** April 27, 2006, 09:58:00 AM »

Quote from: mhibbs on April 26, 2006, 05:46:42 PM

Quote from: SparkE! on April 24, 2006, 05:47:30 PM
My personal preference would be for them to round a 24 bit sample to the nearest 16 bit quantity because that would result in the least degradation to S/N, but so many people believe in dithering that it seems to be the most commonly accepted way of doing things. To me it doesn't make sense to intentionally add high frequency noise to a perfectly good recording, but that seems to be what most people want to do. (That's essentially what dithering does.)

Well, the idea is that the noise is added in a range less sensitive to human hearing (or less audible in simple terms). So in theory, you gain resolution in the more audible ranges by dithering a 24bit signal down to 16bit. Which is why the old Apogee 16bit units that everyone loved used 18bit (ad500e) and 20bit (ad1000) chips and dithered to 16bit using uv22.

Wait a minute... in both cases I'm ending up with a 16 bit recording and neither is subject to missing codes. I have the whole dynamic range to work with in both cases. All dithering does is add noise that I supposedly can't hear. How does that give you an increase in resolution?

I can believe that this was the marketing approach for selling the concept of dithering, but I still fail to see why adding supposedly inaudible noise to a recording is supposed to help it. That's like trying to suggest that I should go back and mess up the least significant bit of all of my old 16 bit recordings by altering its value according to some probability density function. And that makes my recordings better? I doubt it. We'll spend hours and hours on our recordings to try to make sure that we copy them in a bit perfect fashion, yet we're supposed to believe that if we add high frequency noise to our recordings during the conversion to a lower bit rate format that it makes things better? If someone can tell me what's wrong with my logic, I'd love to hear it. Something in the back of my mind tells me that this many people can't be wrong... but then again I know how gullible the American public can be.

And please guys, I'm not saying that you ARE wrong. I'm just saying that I don't get it and I'd really like for someone to explain to me in a technical manner why dithering is desirable. At best, it seems to me that it could be argued that dithering is inconsequential (which seems unlikely since some of the golden ear set claim to be able to hear 20 kHz sounds).

MattD · « **Reply #8 on:** April 27, 2006, 01:12:30 PM »

Quote from: SparkE! on April 27, 2006, 09:58:00 AM

And please guys, I'm not saying that you ARE wrong. I'm just saying that I don't get it and I'd really like for someone to explain to me in a technical manner why dithering is desirable. At best, it seems to me that it could be argued that dithering is inconsequential (which seems unlikely since some of the golden ear set claim to be able to hear 20 kHz sounds).

Dithering is desirable because it's random (or pseudo-random) in nature. When you go from one signal to a lower-depth one, you create errors in the waveform because it is no longer equal to the original digital signal. This is inherent to the process, and there's nothing "bad" about these errors themselves. However, when you just truncate the signal, for example, you are creating an error that is predictable. Humans are very good at pattern recognition, and many can distinguish this pattern in audio.

The errors that occur are at the very bottom of the signal (LSB). Dithering blends this bit (this is where my knowledge gets fuzzy, pardon the pun, whether dithering exclusively targets the LSB) with random noise so that there is no longer a discernible pattern in the quantization errors that are inherent to bit depth reduction.

BayTaynt3d · « **Reply #9 on:** April 27, 2006, 02:31:27 PM »

I think there is a little mixing of dithering and noise shaping concepts in this conversation, but not sure...

MattD · « **Reply #10 on:** April 27, 2006, 02:50:55 PM »

Noise shaping = changing the frequency range of the noise so it's not truly random like white noise. This supposedly pushes the energy into a range that's theoretically beyond our hearing ability. There have been some not-so-successful algorithms for this that didn't push it high enough and sounded lousy in the treble.
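To make the idea concrete, here is a minimal Python sketch of first-order error feedback, one simple textbook form of noise shaping. It is an illustration only: it is not the algorithm used by any product named in this thread, it leaves out dither, and the function name is made up. Values are in units of one output LSB.

```python
import math

def quantize_shaped(samples):
    """Round each sample after subtracting the previous sample's rounding error."""
    out = []
    prev_err = 0.0
    for x in samples:
        v = x - prev_err              # feed the last error back into the input
        y = math.floor(v + 0.5)       # round to the nearest whole LSB
        prev_err = y - v              # error made on this sample
        out.append(y)
    return out

samples = [0.3] * 10                  # a steady level of 0.3 LSB
plain = [math.floor(x + 0.5) for x in samples]
shaped = quantize_shaped(samples)

print(sum(plain), sum(shaped), sum(samples))
```

Plain rounding loses the 0.3 LSB level entirely, while the error-feedback version keeps the running total within half an LSB of the input. The error is still there, but it shows up as rapid sample-to-sample toggling, which is high frequency content.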

SparkE! · « **Reply #11 on:** April 27, 2006, 03:05:10 PM »

Quote from: MattD on April 27, 2006, 01:12:30 PM

Quote from: SparkE! on April 27, 2006, 09:58:00 AM
And please guys, I'm not saying that you ARE wrong. I'm just saying that I don't get it and I'd really like for someone to explain to me in a technical manner why dithering is desirable. At best, it seems to me that it could be argued that dithering is inconsequential (which seems unlikely since some of the golden ear set claim to be able to hear 20 kHz sounds).

Dithering is desirable because it's random (or pseudo-random) in nature. When you go from one signal to a lower-depth one, you create errors in the waveform because it is no longer equal to the original digital signal. This is inherent to the process, and there's nothing "bad" about these errors themselves. However, when you just truncate the signal, for example, you are creating an error that is predictable. Humans are very good at pattern recognition, and many can distinguish this pattern in audio.

The errors that occur are at the very bottom of the signal (LSB). Dithering blends this bit (this is where my knowledge gets fuzzy, pardon the pun, whether dithering exclusively targets the LSB) with random noise so that there is no longer a discernible pattern in the quantization errors that are inherent to bit depth reduction.

Wait a minute... Dithering is desirable because it's random? So is noise. The way I understand it, dithering schemes essentially add bandlimited noise to the signal in such a fashion as to alter the LSB (Least Significant Bit) and in some schemes, even the next to the least significant bit (NTTLSB maybe?). We're told that it is not audible, but it's still noise. Adding noise is not something that seems prudent to me unless you can give me a reason to do it. And saying that it's not audible is not sufficient reason to do so, in my opinion.

I also disagree with your assertion that truncation results in an error that is predictable. Please elaborate. I believe that truncation error is, by its very nature, unpredictable and uncorrelated from sample to sample as long as the recorder's anti-aliasing filter is properly designed.

Now, when you merely truncate a signal, I can see that this is not desirable because some of the resultant errors will be greater than 1/2 LSB in size. Here's an example:

(Keep in mind that there is an implied binary point to the left of all of these numbers. 0000000000000000 is the lowest 16 bit number available and 1111111111111111 is the largest 16 bit number available. The average number is 0111111111111111 or 1000000000000000 and most schemes will encode silence to a string of samples whose values are all 0111111111111111.)

Original 24 bit sample
011010110111011011110011

Truncates to this
0110101101110110

Which is an error of
000000000000000011110011 (more than 00000000000000001, which is 1/2 LSB at 16 bits, otherwise known as the 17th bit. This error is approximately -97 dB with respect to full scale.)

If you round the 24 bit sample to a 16 bit sample instead you get this:
0110101101110111

Which is a much smaller error of
000000000000000000001101 (This error is approximately -122 dB with respect to full scale.)

If you always round instead of truncating, your error will ALWAYS be equal to or less than 1/2 LSB. My claim is that this is the way things should be done. Presumably the error in each sample in the 24 bit waveform is uncorrelated to the error in any other sample and this will be true as long as the anti-aliasing filter on the front end of the recorder is designed correctly. If you always truncate, then your error will always be between 0 and 1 LSB. Rounding will yield an expected error per sample that is -108 dB with respect to full scale, whereas truncation will yield an expected error per sample that is -102 dB with respect to full scale. When you add or subtract 0000000000000001 to or from your 16 bit sample during the dithering process, your resulting error will be between 0 and 2 LSB and the expected error per sample will be -96 dB with respect to full scale. That's like losing 6 dB of S/N with respect to what you get with simple truncation and it's like losing 12 dB of S/N with respect to what you'd get by rounding. Again, I'll admit that the noise is added in the portion of the spectrum that is least audible, but unless it provides a real benefit (and I'm not convinced that it does) I don't see the point of dithering.
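The worked example above can be reproduced with a few lines of Python. This is only a sketch of that one sample: the helper names are illustrative, and the rounding helper does not handle overflow at the top of the range.

```python
import math

SAMPLE_24 = 0b011010110111011011110011   # the 24-bit sample from the example
FULL_SCALE_24 = 1 << 24                  # implied binary point left of the MSB

def truncate_to_16(sample):
    """Drop the low 8 bits."""
    return sample >> 8

def round_to_16(sample):
    """Add half a 16-bit LSB, then drop the low 8 bits (no overflow handling)."""
    return (sample + 0x80) >> 8

def error_dbfs(sample, q16):
    """Size of the conversion error in dB relative to full scale."""
    err = abs(sample - (q16 << 8))
    return 20 * math.log10(err / FULL_SCALE_24)

truncated = truncate_to_16(SAMPLE_24)
rounded = round_to_16(SAMPLE_24)

print(format(truncated, "016b"), round(error_dbfs(SAMPLE_24, truncated), 1))
print(format(rounded, "016b"), round(error_dbfs(SAMPLE_24, rounded), 1))
```

For this sample the truncation error is 243 counts at 24 bits (about -97 dB) and the rounding error is 13 counts (about -122 dB), matching the figures in the post.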

BayTaynt3d · « **Reply #12 on:** April 27, 2006, 04:16:18 PM »

The problem -- as I understand it -- is that both truncation AND rounding are very predictable and mechanical inasmuch as the decimal .6 is ALWAYS rounded to 1, and .3 is ALWAYS rounded to 0. This consistent rounding or truncation when it happens in the EXACT SAME spots and the EXACT SAME way over and over again causes an audible form of harmonic distortion. When you dither with random noise, you never know exactly which way .6 is going to round, sometimes up and sometimes down, with the main point being that that SPECIFIC SPOT, or dB level, does NOT behave the same way every time it is hit.
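A small Python sketch of the ".6 sometimes rounds up, sometimes down" idea. It assumes triangular (TPDF) dither of plus or minus one LSB peak, which is one common choice and not necessarily what any given recorder or editor uses. Values are in units of one 16 bit LSB, and the function names are illustrative.

```python
import math
import random

def round_half_up(x):
    """Plain rounding to the nearest whole LSB."""
    return math.floor(x + 0.5)

def dithered_round(x, rng):
    """Add triangular (TPDF) dither of +/-1 LSB peak, then round."""
    dither = rng.random() - rng.random()
    return round_half_up(x + dither)

rng = random.Random(0)   # fixed seed so the run is repeatable
value = 0.6              # a level that sits between two 16-bit codes
trials = 100_000

plain = round_half_up(value)
mean_dithered = sum(dithered_round(value, rng) for _ in range(trials)) / trials

print(plain, mean_dithered)
```

Plain rounding always returns 1, so the 0.6 is lost. With dither each individual result is noisier, but the average over many samples comes out close to 0.6, which is the sense in which the level survives the conversion.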

Anyway, here's a pretty good article about dithering:
http://www.hifi-writer.com/he/dvdaudio/dither.htm

SparkE! · « **Reply #13 on:** April 27, 2006, 06:26:19 PM »

Quote from: Tainted on April 27, 2006, 04:16:18 PM

The problem -- as I understand it -- is that both truncation AND rounding are very predictable and mechanical inasmuch as the decimal .6 is ALWAYS rounded to 1, and .3 is ALWAYS rounded to 0. This consistent rounding or truncation when it happens in the EXACT SAME spots and the EXACT SAME way over and over again causes an audible form of harmonic distortion. When you dither with random noise, you never know exactly which way .6 is going to round, sometimes up and sometimes down, with the main point being that that SPECIFIC SPOT, or dB level, does NOT behave the same way every time it is hit.

Anyway, here's a pretty good article about dithering:
http://www.hifi-writer.com/he/dvdaudio/dither.htm

Yeah, I've seen that article before and on the surface it looks like a pretty compelling case for dithering. But just try to follow through and duplicate the guy's results. I certainly can't. First of all, his Figure 2 shows a spectrum that's too clean to be believable. The main lobe is too narrow unless he's using a longer FFT than I have available to me (16384) and he's showing a signal that's too strong. A full scale 24 bit signal has a maximum S/N of about 144.5 dB. If you attenuate it by 60 dB, then your max S/N is about 84.5 dB, yet he shows a clean spectrum to -108 dB. Then when I get to Figure 3, he totally loses me. His words: "Oh horrible, horrible digital!". Yeah, it would be horrible if what he was showing was true. It looks to me like he used a compressor or soft clipping to get his signal down to -60 dB. When I generate a 980 Hz tone at 24 bits resolution, attenuate it by 60 dB and convert to 16 bit resolution, I get a spectrum that only looks minimally worse than the spectrum I get immediately before conversion to 16 bit resolution. I see absolutely no artifacts that could be confused with harmonic distortion where the products are at integer multiples of the fundamental frequency, 980 Hz.

The work I did was on Audacity, but I challenge you to do it on your own favorite editor. Just set your sample rate to 44100, set your resolution to 24 bits, generate a 980 Hz signal, attenuate it by 60 dB and look at the spectrum. You'll need to be careful to select an integer multiple of 45 samples since there are 45 samples per cycle at 44100 sample rate. And start your selection at the start of the tone so that the beginning of the waveform is from 0 degrees phase on the 980 Hz sinusoid. I used 737280 samples myself (which is 45 times 16384, the maximum length of FFT I have available in Audacity). Notice how pristine the resulting spectrum is. (It should be.) Then convert the waveform to 16 bits and look at the spectrum. You will see minor differences between the spectrum of the 24 bit waveform and the spectrum of the 16 bit waveform.
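For anyone who wants to try this test without an editor, here is a rough Python sketch using only the standard library. It assumes plain rounding (no dither) for both conversions, which may not be what Audacity or any other editor actually does, and it analyzes a single 45 sample cycle instead of a long FFT. The function names are illustrative.

```python
import cmath
import math

RATE = 44100
FREQ = 980
PERIOD = RATE // FREQ        # 45 samples per cycle
AMP = 10 ** (-60 / 20)       # -60 dB relative to full scale

def quantize(x, bits):
    """Round x (full scale = 1.0) to the nearest step of a signed `bits`-bit format."""
    scale = 2 ** (bits - 1)
    return math.floor(x * scale + 0.5) / scale

# One cycle of the tone, starting at 0 degrees phase.
tone = [AMP * math.sin(2 * math.pi * n / PERIOD) for n in range(PERIOD)]
tone24 = [quantize(x, 24) for x in tone]
tone16 = [quantize(x, 16) for x in tone24]

def harmonic_db(signal, k):
    """Level of the k-th multiple of 980 Hz in dB re full scale (one-period DFT)."""
    acc = sum(s * cmath.exp(-2j * math.pi * k * n / PERIOD)
              for n, s in enumerate(signal))
    mag = 2 * abs(acc) / PERIOD
    return 20 * math.log10(mag) if mag > 0 else float("-inf")

for k in (1, 2, 3, 5, 7):
    print(k, round(harmonic_db(tone24, k), 1), round(harmonic_db(tone16, k), 1))
```

Because the tone repeats exactly every 45 samples, an undithered rounding error repeats every 45 samples too, so whatever error energy exists can only land on multiples of 980 Hz. Comparing the 24 bit and 16 bit columns shows how much of it there is under these assumptions.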

This is what I absolutely hate about tech articles in audiophile magazines. They may show you pictures that strongly support their case (whatever they're selling at the time), but all too often the pictures are contrived or they are pictures of something other than what they claim to be.

Brian Skalinder · « **Reply #14 on:** April 27, 2006, 06:46:38 PM »

Quote from: SparkE! on April 27, 2006, 06:26:19 PM

do it on your own favorite editor

Before you replied, I started doing this with Adobe Audition. A few differences, here's what I did:

1 kHz tone
-6 dB amplitude
32-bit (AA can't generate a 24-bit wave, only 16 or 32, but this shouldn't matter - the principle's the same)
48 kHz sample rate
10 sec tone
freq analysis with 65,536-point FFT

Original 32/48 WAV

32/48 WAV truncated to 16/48 - you can clearly see the harmonics

32/48 WAV dithered to 16/48 with no noise shaping - much cleaner WAV, though you can see a bit of noise relative to the original

32/48 WAV dithered to 16/48 with medium noise shaping - the waveform remains less noisy higher up into the spectrum than with no noise shaping, and it's easy to see the shifting of noise to higher frequencies

32/48 WAV dithered to 16/48 with ultra noise shaping - again, the waveform remains less noisy higher up into the spectrum than with no noise shaping (if you look closely, more so than dither w/ medium noise shaping), and higher up than with medium noise shaping, and it's even easier to see the shifting of noise to higher frequencies
