Yesterday’s post about a couple of published charts prompted a few comments and a couple of private emails of concern about my recasting of the “resolution bars” from the HTR chart. Many of you have seen the spectrograms that I’ve posted on this site and understand a few basic things about formats and fidelity expectations. I thought I would provide a brief refresher for those seeking to understand the differences between the actual specifications, the potential fidelity of a particular specification and a simple bar chart like those so commonly put on the web as “infographics”.
Bar charts try to use the numbers associated with a format to impress. The bigger the numbers the better, right? I read yesterday about the new Nagra DAC that was introduced at this week’s Munich Audio Show. They and several other DAC makers are pushing 384 kHz as a reasonable sample rate for converting audio to PCM and back (Exasound, I think, was the first company to move into the upper stratosphere of 384 kHz…or was it Light Harmonic?). There is absolutely no reason to use 384 kHz as a sampling rate…I’m content with 96 kHz.
“Potential” fidelity is a concept that is actually quite complex. We’ve talked about the fidelity of the formats, the fidelity of the original mixes and the fidelity of the final mastered music. And don’t forget about the fidelity of the hardware, cables, speakers, etc. But if we strip things down to the simplest, primary components of audio fidelity, it comes down to dynamic range and frequency response. I think of this potential fidelity as a two-dimensional box with frequency response along the x-axis and dynamic range along the y-axis.
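For readers who like to see the arithmetic, here is a rough sketch of how the two sides of that box can be estimated for a PCM format. It assumes an idealized converter: the width is the Nyquist limit (half the sample rate) and the height uses the textbook rule of thumb of about 6.02 dB per bit plus 1.76 dB. The function name `potential_fidelity_box` is made up for this illustration, and real hardware and real recordings fall short of these numbers.

```python
def potential_fidelity_box(sample_rate_hz, bit_depth):
    """Return (max_frequency_hz, dynamic_range_db) for an idealized PCM format.

    Assumptions (not measurements): the frequency limit is the Nyquist
    frequency, and the dynamic range is the theoretical quantization
    signal-to-noise ratio for a full-scale sine wave.
    """
    max_frequency_hz = sample_rate_hz / 2        # x-axis: Nyquist limit
    dynamic_range_db = 6.02 * bit_depth + 1.76   # y-axis: ideal quantization SNR
    return max_frequency_hz, dynamic_range_db


# A few common PCM specifications, chosen for illustration.
formats = [
    ("CD 44.1 kHz/16-bit", 44_100, 16),
    ("96 kHz/24-bit", 96_000, 24),
    ("192 kHz/24-bit", 192_000, 24),
]

for name, rate, bits in formats:
    width_hz, height_db = potential_fidelity_box(rate, bits)
    print(f"{name}: up to {width_hz / 1000:.2f} kHz, about {height_db:.0f} dB")
```

These are ceilings on what the format could hold, not claims about what any given recording actually contains.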
Take a look at the figure below. This is the basic graphic used by Adobe’s Audition program.
Figure 1 – The graphics used by Adobe Audition to view frequency vs. dynamic range.
So each of the bars in yesterday’s HTR charts consisted of an encoding type AND some typical parameters associated with that format. There were MP3 files, LPCM (usually just called PCM) and DSD, AND they ranged from 64 kbps, to 1411 kbps (for CDs), to around 4600 kbps for 96/24 and so on. The person behind the chart then slipped into associating relative “qualitative” measures with these numbers. This is where the provenance of the tracks matters. If an analog tape from 1958 is transferred to a DSD 64 file…it can’t possibly measure up to a new recording at DSD 64 or, better yet, a 192 kHz/24-bit PCM track.
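The kbps figures on charts like this are just the product of the format parameters. Here is a minimal sketch of that multiplication for uncompressed PCM, assuming two-channel stereo (the helper name `pcm_bitrate_kbps` is mine, not something from the HTR chart):

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Raw data rate of uncompressed PCM in kilobits per second.

    Assumes stereo by default and ignores any container or file overhead.
    """
    return sample_rate_hz * bit_depth * channels / 1000


print(pcm_bitrate_kbps(44_100, 16))   # CD: 1411.2
print(pcm_bitrate_kbps(96_000, 24))   # 96/24: 4608.0
print(pcm_bitrate_kbps(192_000, 24))  # 192/24: 9216.0
```

Notice that nothing in this calculation knows anything about the recording itself, which is why the provenance of a track matters more than the number on the bar.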
So I took some time this morning to recast the “relative resolution” bars of the HTR chart to “potential fidelity boxes” on the Audition grid. Take a look at Figure 2 below:
Figure 2 – The “potential fidelity” of various audio formats and specification levels.
I believe this is a more accurate representation of the information about formats. It certainly works a lot better than simply multiplying the numbers and ranking things based on magnitude. The article on the new Nagra fell into that trap by saying the DSD 128 capability of the new unit is “256 times better than a CD”, which casual audiophiles might believe unless they know how much real-world frequency response and dynamic range is associated with DSD 128 (I’m not even going to start talking about the fact that virtually no one is making native recordings at that rate).
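As a quick sanity check on that kind of multiplier, here is a small sketch comparing raw data rates only. It assumes DSD 128 means 1-bit samples at 128 times the 44.1 kHz CD rate, in stereo; the function and constant names are my own. It says nothing about audible quality, which is exactly the problem with ranking formats this way.

```python
CD_SAMPLE_RATE_HZ = 44_100
CD_BIT_DEPTH = 16


def cd_bits_per_second(channels=2):
    """Raw data rate of CD audio in bits per second."""
    return CD_SAMPLE_RATE_HZ * CD_BIT_DEPTH * channels


def dsd_bits_per_second(multiple, channels=2):
    """Raw data rate of DSD in bits per second.

    Assumes 1-bit samples at `multiple` times the CD sample rate
    (64 for DSD 64, 128 for DSD 128).
    """
    return CD_SAMPLE_RATE_HZ * multiple * 1 * channels


ratio = dsd_bits_per_second(128) / cd_bits_per_second()
print(f"DSD 128 carries {ratio:.0f}x the raw bits of a CD")
```

Even on the crude "bigger number wins" scale, the raw bit count does not come out anywhere near 256 times a CD.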
And please don’t forget to notice the “white” stroked area that represents what human hearing can handle. I put a gradient past 20 kHz to around 40 kHz…and I get plenty of pushback on that. So why are we talking about formats that give us 192 kHz?
Think of each of the boxes in the diagram as the area in which sound can exist within each of the formats. Of course, finding recordings that fill up the potential space is near impossible…and perhaps it doesn’t matter.