Originally Posted by airisom2
Alright, I think I'm getting this now. So basically, it outputs all 24 bits, but because of the noise floor, some of the extra bits from the file get absorbed by it. I was thinking that the maximum SNR was set in stone, and you couldn't go above it. So, let's take the Xonar STX for instance. It has a 124 dB SNR on the line-out port, and a 24-bit file needs a ~144 dB SNR in order to be correctly reproduced. From what I've read, for every bit you have, the SNR increases by 6 dB. Do the math, and ~3.33 bits are being absorbed by the DAC's inadequate noise floor, so you're really only listening to a ~21-bit file. I guess my question now is how the bit depth correlates to what you're hearing, and which bits in a file correspond to what. Is it like network packets, where you have some identifying bits and then the data bits, or is it something totally different?
Raw audio data is almost always Linear Pulse-Code Modulation (LPCM). 24 bits of 1s and 0s indicate some level. There's one bit for the sign (positive or negative), and the rest are in decreasing order of importance. Music files have some identifying headers and so on that aren't part of the raw data, but if you look in, say, a .wav file, you'll see that kind of information. In compressed audio, it's stored in a more efficient way.
01011101 11011011 is close to
01011101 11011000 but quite different from
11011101 11011011.
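To make that concrete, here's a quick sketch (my own illustration, not part of the original post; plain Python) of how much each bit position matters when those patterns are read as 16-bit two's-complement samples:

```python
# My illustration: how bit position affects an LPCM sample's value.
# Plain Python ints stand in for 16-bit two's-complement samples.

a = 0b0101110111011011  # the original sample from above
b = 0b0101110111011000  # low-order bits changed: nearly the same level
c = 0b1101110111011011  # top bit changed: a very different level

def as_int16(x):
    """Interpret a 16-bit pattern as a signed two's-complement value."""
    return x - 0x10000 if x & 0x8000 else x

for name, v in (("a", a), ("b", b), ("c", c)):
    print(name, as_int16(v))
# a  24027
# b  24024   (differs by a tiny, inaudible step)
# c  -8741   (nowhere near the original)
```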
Except in extreme scenarios, even 16 bits is pretty much overkill. Most recordings have noise from the mics / environment that's above the 16th bit anyway. There's a demo video with reduced bit depth, starting from 16 bits and going down; check the audio files linked from it (the YT version is lossy, of course).
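You can also see the "~6 dB per bit" rule from the question above in a rough numpy sketch (my own example; assumes numpy, simple rounding, no dither):

```python
# Quantize a full-scale sine to various bit depths and measure the SNR.
# Theory for a full-scale sine: SNR ~ 6.02 * bits + 1.76 dB.
import numpy as np

fs = 44100
t = np.arange(fs) / fs                      # 1 second of samples
x = np.sin(2 * np.pi * 997 * t)             # full-scale 997 Hz test tone

for bits in (8, 12, 16, 24):
    steps = 2 ** (bits - 1)                 # quantization levels per polarity
    xq = np.round(x * steps) / steps        # quantize, then scale back
    noise = xq - x
    snr = 10 * np.log10(np.mean(x**2) / np.mean(noise**2))
    print(f"{bits:2d} bits: SNR ~ {snr:5.1f} dB (theory ~ {6.02*bits + 1.76:.1f} dB)")
```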
And here's another thing I don't understand (I think). Following the Nyquist theorem, the peak frequency that is created is roughly half the sample rate. So, if you have a 44.1k file, the frequency range will peak at ~22k. Apply that to a 96k file, and it jumps up to 48k. The average human ear can only hear up to around 20k, so what's the point of sampling higher than the threshold of human hearing if we can't hear the differences? It kinda reminds me of some of the headphone reviews I read. Some are rated from like 15 Hz to 41 kHz, yet we can't necessarily hear those frequencies. People usually say that because the frequency range is more extended at either end, it gives the perceptible sound range greater clarity. Is that what's basically going on with a 96+ sample rate? That sounds more subjective than anything.
It's not the peak frequency that is created, but the highest frequency that can be mathematically represented. The frequency content is a result of whatever is making the sound. You can sample a bass playing at 4 MHz; you just won't get any ultrasonic frequency content out of it (supposing we have imaginary mics and an ADC that can record that high).
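Here's a small numpy sketch (my own example, not from the post) showing the flip side: a tone above half the sample rate can't be represented and folds back down (aliases) instead, which is why ADCs low-pass the input first:

```python
# A 30 kHz tone sampled at 44.1 kHz aliases down to 44100 - 30000 = 14100 Hz.
import numpy as np

fs = 44100
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 30000 * t)           # 30 kHz tone, above fs/2 = 22050 Hz

spectrum = np.abs(np.fft.rfft(x))
freqs = np.fft.rfftfreq(len(x), d=1/fs)
print(f"peak appears at {freqs[np.argmax(spectrum)]:.0f} Hz")  # ~14100 Hz, the alias
```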
Average young person's ear might detect 20 kHz, and that's with a relatively loud sound and probably with nothing else playing to mask it. Average person falls well short of that.
Because of the limits of hardware ADC performance (sharper low-pass filters are harder to do well), to avoid a sample rate conversion later on, and for some small practical concerns, it can be better to sample higher than 44.1 kHz, which already allows for a bit over 20 kHz of audio. There's not much benefit beyond, say, 50-60 kHz (and even then, for all practical purposes these days, 44.1 kHz is fine), but the next commonly-seen rates above that are 88.2 kHz and 96 kHz. Harmless overkill, mostly, except that it means larger file sizes.
Or maybe you're recording stuff for analysis or non-human consumption.
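On the sample rate conversion point, here's a hedged sketch, assuming scipy is available, of going from 96 kHz down to 44.1 kHz (44100/96000 reduces to 147/320, so a polyphase resampler can do it exactly):

```python
# Downsample a 96 kHz signal to 44.1 kHz with polyphase filtering.
import numpy as np
from scipy.signal import resample_poly

fs_in = 96000
t = np.arange(fs_in) / fs_in
x = np.sin(2 * np.pi * 1000 * t)        # stand-in for a 96 kHz recording

y = resample_poly(x, up=147, down=320)  # filter + rate change in one step
print(len(x), "->", len(y))             # 96000 -> 44100 samples per second
```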
Another question that I have is how the bit depth relates to the frequencies being played. Here's my stance on it: having a 16-bit file is like looking at a 20-band frequency spectrum (the visualization that looks like bar graphs), while a 24-bit one is like looking at an 80-band frequency spectrum. Both are playing the same thing, but the 24-bit file has greater levels of resolution that allow for a greater sense of separation between the frequency levels. With that said, do 16-bit files, although being able to reproduce the 20 Hz to 20 kHz frequency spectrum, have some grey areas in them? For instance, a 16-bit file may be able to reproduce a sound at 12,945 Hz, but will it be able to reproduce a sound at 12,945.3463456 Hz? Is that where the benefits of a 24-bit file come into play? Then again, no DAC is able to reproduce all 24 bits because of circuitry limitations, so is that sense of separation diluted even further (the separation/sonic difference between 16-bit and 24-bit, and the separation between precise frequencies)?
The ability to separate frequencies is a result of the sampling rate, not the bit depth. Any frequency under 1/2 the sampling rate can be represented: 12,945 Hz or 12,945.3463456 Hz or whatever, no matter the bit depth. (Too close to exactly 1/2, and practical real-world hardware may have issues / cut it off.) And yes, the phase is captured correctly as well.
Okay, you might ask: where is all that information coming from? How can you tell those frequencies apart? It's from the number of samples. Suppose you had an infinite number (or just a whole lot) of samples of each. Eventually you'd see that one of them is bobbing up and down a little more frequently than the other. (That's the non-calculus, non-systems-theory answer.)
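Here's a numpy sketch (again my own illustration) of exactly that: the tone's frequency survives even coarse quantization, because it lives in how the samples evolve over time, not in the bit depth of any one sample:

```python
# Crush a 12,945.3463456 Hz sine down to roughly 8-bit resolution and
# show its frequency is still recoverable from enough samples.
import numpy as np

fs = 44100
f0 = 12945.3463456
t = np.arange(10 * fs) / fs                   # 10 s -> 0.1 Hz FFT resolution
x = np.sin(2 * np.pi * f0 * t)

x8 = np.round(x * 127) / 127                  # ~8-bit quantization

freqs = np.fft.rfftfreq(len(x8), d=1/fs)
peak = freqs[np.argmax(np.abs(np.fft.rfft(x8)))]
print(f"estimated frequency: {peak:.1f} Hz")  # ~12945.3 Hz even at 8 bits
```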
Here's a video on digital audio fundamentals; it may be worth a look for you:
http://xiph.org/video/vid2.shtml