High-Resolution Audio: FAQs Part 1
In preparation for my consumer guide handout on “High-Resolution Audio,” I’m going to include a frequently asked questions section. Here’s the first of two parts:
1. What is High-Resolution Audio?
High-Resolution Audio describes recordings that have been made and released using technologies that meet or exceed the capabilities of human hearing with regard to frequency response, dynamic range, temporal accuracy, and spatial distribution.
2. What isn’t High-Resolution Audio?
High-Resolution Audio is not old, standard-resolution analog or digital recordings – including analog tape, vinyl LPs, or CDs – that have been upconverted or transferred to high-specification digital formats (a minimum of 88.2 or 96 kHz/24-bit PCM, or 5.6 MHz DSD). The fidelity of old masters remains at the fidelity of the originals…standard-resolution.
3. What are the differences between existing music formats and High-Resolution Audio?
Lossy compressed file formats like MP3, AAC, and OGG discard some elements of the music, which results in smaller files with lower fidelity. CDs don’t lose any information and can sound terrific but are unable to meet the capabilities of human hearing. High-Resolution Audio raises the bar to a level that offers artists, engineers, producers, and labels the potential to provide “real world” fidelity to consumers for the first time. There is increased dynamic range, extended frequency range, more accurate timing, and better spatial distribution when compared to lossy MP3s and even compact discs.
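To put rough numbers on the dynamic-range and frequency claims above, here is a minimal Python sketch (the function names are mine, and these are theoretical ceilings that real-world converters fall somewhat short of):

```python
def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of dithered linear PCM: ~6.02 dB per bit."""
    return 6.02 * bits + 1.76

def nyquist_khz(sample_rate_hz: int) -> float:
    """Highest frequency (in kHz) a given sample rate can represent."""
    return sample_rate_hz / 2 / 1000

print(f"CD  (16/44.1): {dynamic_range_db(16):.1f} dB, up to {nyquist_khz(44100):.2f} kHz")
print(f"HRA (24/96):   {dynamic_range_db(24):.1f} dB, up to {nyquist_khz(96000):.2f} kHz")
```

Whether the extra headroom and bandwidth are audible is a separate question, debated at length in the comments.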
4. How do the major labels and industry organizations define High-Resolution Audio?
Their definition states that High-Resolution Audio is “lossless audio that is capable of reproducing the full range of sound from recordings that have been mastered from better than CD quality music sources.”
They also discuss “master quality descriptors” that describe the types of formats that can be used as sources for high-res audio. One of them is MQ-A. It allows any analog source from any era to be used as a source for High-Resolution Audio. Another is MQ-C. Despite the prohibition stated above, compact discs can also be used as a source for HRA. How can something that is supposed to be “better than CD quality” be supplied by a CD? This definition is inaccurate, confusing, self-contradictory, and has been widely criticized.
The labels and organizations would prefer to have hardware companies, digital music stores, and streaming services adopt a definition for High-Resolution Audio that includes virtually ALL recordings ever made regardless of the fidelity of the files as long as they are transferred to large digital files. Consumers and media outlets have failed to understand the critically important sequential link between source fidelity and the specification of the delivery file. Higher specifications don’t elevate the fidelity of the original.
5. Will I be able to hear any differences between what I’ve been listening to and the new High-Resolution Audio format?
The sonic differences between what you’re used to listening to and bona fide High-Resolution Audio files are subtle and difficult to hear unless you have a high-end playback system and have been trained to hear the additional fidelity.
However, many of the so-called High-Resolution Audio files offered for sale are derived from standard-resolution analog audio sources and don’t benefit from the higher specifications. They represent the highest fidelity currently available – Master Source Quality – but fail to meet the higher requirements of real High-Resolution Audio.
The high-resolution digital download sites are supplied with new transfers from analog tape to high-spec digital files. Sometimes they are remastered, and other times they are simply transferred.
38 thoughts on “High-Resolution Audio: FAQs Part 1”
Can you please elaborate on “accurate timing, and spatial distribution”.
There is increasing evidence that the timing of audio signals must be kept to within 5 or 10 micro seconds. This means a sample rate of 96 kHz. It also contributes to accurate spatial distribution of directional information when listening to stereo or surround sound.
Can you please post a link for reference. Thanks.
I’m sorry, a link to what item?
I remember reading that the theoretical time resolution of PCM is 1 / ( pi × sample-rate × number-of-levels), which, for Red-book, comes out at less than 1 nano-second!
There was a simple experiment performed here: http://forums.stevehoffman.tv/threads/time-resolution-of-red-book-45ns.85436/
that suggests that it is indeed the order of nano-seconds rather than micro-seconds.
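The quoted rule of thumb is easy to evaluate; here is a small sketch (my own naming) that plugs in the Red Book numbers:

```python
import math

def pcm_time_resolution(sample_rate: float, bits: int) -> float:
    """Theoretical timing resolution of PCM (in seconds), per the
    1 / (pi * sample-rate * number-of-levels) rule of thumb quoted above."""
    levels = 2 ** bits
    return 1.0 / (math.pi * sample_rate * levels)

t = pcm_time_resolution(44100, 16)
print(f"Red Book: {t * 1e9:.3f} ns")  # well under 1 ns
```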
The research that Robert Stuart published at the most recent AES convention gives 5-10 microseconds.
We may need to define here what we are looking for:
A burst of noise going from silence back to silence is probably perceptible at a much shorter duration than a lack of perfect synchronization (or in-phase signals) in a musical track. I would also appreciate learning more.
I’m looking into the whole timing issue…I’ll let you know.
Yes, but Robert Stuart’s statement (that human hearing can be sensitive to timing differences down to 5 µs) is not what’s being questioned.
The question is what is the minimum timing difference that can be resolved by CD? It’s not simply a function of the sample-rate; the bit-depth plays a factor too, and mathematics says the answer is < 1 nano-second.
There’s a useful video that supports the theory here: https://xiph.org/video/vid2.shtml
At 20:52, it shows smoothly varying the timing of a signal at CD rate in between the nominal sampling-points.
I’m not sure we’re on the same page or addressing the same issue. I haven’t watched the xiph video in a while but have some issues with some of the information I’ve seen in those videos (in general they’re quite good).
How does your question about the minimum timing difference that can be resolved by a CD relate to spatial accuracy in the capture of sound?
Sorry for the confusion Mark, I was commenting only on the timing aspects. Above, you wrote “There is increasing evidence that the timing of audio signals must be kept to within 5 or 10 micro seconds. This means a sample rate of 96 kHz”. The various things I mentioned in reply were pointing out that “This means a sample rate of 96 kHz” may not be the case in practice.
I.e. it seems that CD rate can accurately reproduce timing well below 23 µs (= 1 / 44100Hz), down to nano-seconds in fact, easily meeting Robert Stuart’s figure of 5 or 10 micro-seconds.
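That sub-sample-period claim can be demonstrated numerically. The following is a sketch of my own (a clean 1 kHz tone, no noise or quantization), which recovers a 1 µs delay from 44.1 kHz samples by comparing the phase of the tone in the two sampled signals:

```python
import cmath
import math

FS = 44100      # CD sample rate (Hz)
F = 1000.0      # test tone frequency (Hz)
DELAY = 1e-6    # 1 microsecond, far below the 22.7 us sample period
N = FS          # one second of samples (an integer number of tone cycles)

def sample_tone(delay: float) -> list[float]:
    """Sample a 1 kHz sine, shifted in time by `delay` seconds."""
    return [math.sin(2 * math.pi * F * (n / FS - delay)) for n in range(N)]

def phase(x: list[float]) -> float:
    """Phase of the F-Hz component via correlation with a complex exponential."""
    z = sum(v * cmath.exp(-2j * math.pi * F * n / FS) for n, v in enumerate(x))
    return cmath.phase(z)

# Phase difference between undelayed and delayed signals -> time delay.
est = (phase(sample_tone(0.0)) - phase(sample_tone(DELAY))) / (2 * math.pi * F)
print(f"recovered delay: {est * 1e6:.4f} us")
```

With dither, noise, and real program material the achievable precision is lower, but the point stands: timing resolution is not limited to the sample period.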
I’ll have to do some research on this.
Robert Stuart’s AES article introducing MQA gives a long list of references relative to perception of timing differences. He provided a partial list in a response to a comment on a Stereophile post – http://www.stereophile.com/comment/549930#comment-549930. Mark has a copy of the paper, and could post the full list, but many of them require a subscription to the journal in which they were published.
Here are a few related papers that are freely available, and are written to be easily understood by non-technical readers: http://boson.physics.sc.edu/~kunchur/Acoustics-papers.htm
I wouldn’t want to violate the copyright requirements of the AES. If you’re interested the paper is available at their site and is not expensive.
Posting Mr. Stuart’s paper would certainly be a violation – and I wasn’t suggesting that – but posting a list of the relevant references would not be. In any case, his Stereophile comment gives a good place to start.
When I get back to Los Angeles, I’ll see what I can share.
Mark, How do microphones fit into the provenance equation? Some of the older ones sounded quite different to me but perhaps there is a new generation that is highly accurate.
Many thanks for all your contributions.
Microphones are like lenses on cameras. They offer a creative individual a choice between color and accuracy…sometimes there’s a call for a specific color, and other times you want an analytical mic.
I suppose my real question has to do with the ability of microphones to capture the performance. Perhaps this is not the right way to address the issue but can you relate microphone accuracy to bit depth/sampling rate or some other measure of accuracy? Do today’s microphones compromise the ability to produce a hi-res recording?
Microphones are the source transducers in the production of an audio recording. And they come in all shapes, sizes, sensitivities, qualities, colors (sonic timbres), and price ranges. There are prized vintage microphones, recreations of older designs, calibration microphones, digital microphones, stereo mics, etc. They are all able to capture musical sounds and convert them to electrical signals. To qualify for high-resolution status they should be able to handle dynamics up to 130+ dB SPL and provide a frequency range up to 40 kHz. Not all of them meet these specs…but many do. It also depends on how accurately they accomplish their conversion.
I am fortunate to own a number of vintage Neumann, AKG, and B&K microphones. They are not calibration microphones, but that doesn’t mean they suddenly stop working at 20 kHz. When I look at the spectra produced by these condenser mics, there is lots of ultrasonic information. And I get dynamics well beyond 90 dB.
So yes, today’s microphones are producing sounds that eclipse the capabilities of CDs and analog tape.
difficult to hear unless you have a high-end playback system and have been trained to hear the additional fidelity.
Therein lies the rub… moving up the audio-equipment scale in $10,000 steps, the noticeable increase in sound quality is negligible, or not worth the money spent. And that is not including the high-end shysters who market pumped-up prices with outrageous claims of fidelity and falsified data.
HDA playback can be achieved quite reasonably with any number of today’s HD portable players, or your computer plus an under-$500 USB DAC and an under-$500 set of headphones.
A stereo with speakers can get a little pricey. LOL
Now if you can actually HEAR the difference is another bag of worms altogether. 🙂
Mark, you repeated “a definition a definition” in the second paragraph of FAQ 4.
On the same subject, namely the ultimate in sound, do you know the wattage required to “naturally” or should I say “accurately” portray the faint sound of a pin dropping on a floor? Minus any distortion, compression etc. It is in the thousands of watts. As we all know wattage isn’t about loudness, but accurate pure megapower is about rendering tonal accuracy, natural fidelity, in short, reality, or as close as possible to live. And ironically most concerts are electric, unlike a symphony, which is a whole different story. I read this somewhere years and years ago, but that was back in analog record days. I do not know if it still applies to digital. Anyone remember this topic? And analog records just won’t go away… http://www.oregonlive.com/music/index.ssf/2014/11/does_vinyl_really_sound_better.html
One more thing about vinyl… It’s that noise and rumble when the needle hits the turntable that forms the background wall of warmth these vinylphiles are addicted to. When it mixes in with the music, it masks the harsh highs and any harmonic distortion or imperfections, much like a filter. No science, just facts and common sense. IMHO
Because I support the idea of a more stringent definition of high resolution audio, I feel compelled to point out some issues in this FAQ.
3. “CDs don’t lose any information…” relative to what? The CD standard of 16/44.1 loses the high frequencies present in most music and captured by decent microphones, and may require compression or truncation of dynamics from the original performance. This is one of the better arguments for high-resolution audio, and shouldn’t be lost through sloppy phrasing.
4. With the different master quality designations — including from CD sources — you can call the definition confusing and contradictory, but you can’t call it inaccurate simply because it disagrees with your own, personal, definition. To the extent that there is an accepted definition of high-resolution audio in the North American market, this is it. If you don’t like it, you can try to get it changed, but what authority can you cite for your claim that it is inaccurate?
More broadly, if your definition requires frequency response above 20 kHz, that’s easy. The trouble is dynamic range, for which, I think, you have previously stated a figure of 130 dB. That’s nearly impossible to achieve in any musical context. For example, the DPA 4006 has an equivalent noise level of 15 dB (A), which would require sound pressure peaks of 145 dB. Is a recording with less than 130 dB of dynamic range not high-resolution by your definition?
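The arithmetic in that objection is simply noise floor plus dynamic range; a one-line sketch (the constants are the figures quoted above):

```python
# A mic's self-noise sets the floor, so the required peak SPL is
# floor + desired dynamic range.
MIC_SELF_NOISE_DBA = 15.0        # e.g. the DPA 4006 figure quoted above
TARGET_DYNAMIC_RANGE_DB = 130.0  # the proposed high-resolution requirement

required_peak_spl = MIC_SELF_NOISE_DBA + TARGET_DYNAMIC_RANGE_DB
print(f"peak SPL needed: {required_peak_spl:.0f} dB")  # 145 dB
```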
You say: High-Resolution Audio raises the bar to a level that offers artists, engineers, producers, and labels the potential to provide “real world” fidelity… Would it better convey the message if you were to say: to provide “the same fidelity that the artist hears in the studio”?
I like the real world…but the sounds that come through the studio speakers might already be compromised.
In your FAQs you need to explain provenance, and that any recording made prior to a certain date cannot be considered HR. There also needs to be a discussion of human hearing limits and how they translate to electronic measurements and digital limits (e.g., 1% THD is the just-noticeable difference for the human ear). Guidance about what to look for in playback equipment would also be helpful.
Good suggestions, thanks.
“High-Resolution Audio … that meet or exceed the capabilities of human hearing …”
“CDs … are unable to meet the capabilities of human hearing.”
Here’s the part I don’t understand: I personally can’t hear above 13 kHz (I’m 53), and most adults can’t hear above, oh, say, 18 kHz. So why isn’t 44,100 samples per second enough (22 kHz)?
Similarly, 16-bit samples provide 96 dB of signal to noise. I don’t think there are any recordings that use this much range, nor is anyone capable of hearing so little noise.
So if your definition is “meeting the capabilities of human hearing”, it seems that CD qualifies.
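For reference, the 96 dB figure comes from the ratio of full scale to one quantization step; a quick sketch (function name mine):

```python
import math

def snr_floor_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(2 ** bits)

print(f"16-bit: {snr_floor_db(16):.1f} dB")
print(f"24-bit: {snr_floor_db(24):.1f} dB")
```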
As Ran suggested, I think you need to back up these claims about the benefits of higher-than-CD quality with scientific data.
I’m personally not convinced that as a delivery medium 96/24 has any demonstrable benefits.
Certainly the equipment to render better than 96 dB and 22 kHz would cost more than most of us can afford, and would have little, if any, additional benefit over equipment only capable of 96 dB and 22 kHz.
Wouldn’t it be better to focus your time and energy on provenance and the recording process, and just forget about promoting 96/24 as a delivery format?
Also focus on multi-channel.
These are things that make a real, noticeable difference–quality recording engineering and multi-channel. 96/24 doesn’t make a difference. The entire “hi-res audio” thing should just go away. It’s snake oil.
Just my opinion.
CDs can be wonderful and probably come close enough to meeting the capabilities of human hearing that it doesn’t matter. But I would rather have specs that definitely make it…even though, as you rightly point out, plenty of us won’t be able to tell the difference. Having a sample rate of 96 kHz does improve a number of things (easier filtering, lower noise, and the extra octave of frequency response…there is sound up there).
And it’s not expensive to acquire a system capable of meeting real resolution specifications. All things being equal, audio production does benefit from having 96 kHz/24-bits available during production. I’m happier knowing that everything that came into the microphones was delivered through the speakers.
My focus is on better recordings and multichannel…but it’s too easy to adopt 96/24 as the new normal.
I continue to be confused about why a sample rate of 96 kHz is sufficient if the timing of audio signals must be kept to within 5-10 microseconds. Does this not mean a minimum of 192 kHz to reach the 5 µs figure?
I think I have read that mathematics can be used to achieve the desired result at 96 kHz, but perhaps 192 kHz is better in order to avoid the mathematics?
The theoretical need would point to 192 or even higher. I don’t really think that the audio suffers at 96 kHz.
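For what it’s worth, the raw sample periods behind the “192 kHz to reach 5 µs” intuition are easy to compute (a sketch of my own; note that, per the discussion above, timing resolution is not the same thing as the sample period):

```python
def sample_period_us(fs: int) -> float:
    """Time between samples, in microseconds."""
    return 1e6 / fs

for fs in (44100, 96000, 192000):
    print(f"{fs / 1000:>5.1f} kHz -> sample period {sample_period_us(fs):6.2f} us")
```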
Hi Mark, seeing as the mainstream press still often use the terms ‘high fidelity’ or ‘high definition’, would it be worth a word or two clarifying the difference?
High fidelity is less specific. It just refers to the world of better-quality audio than the mainstream. “High-Definition” or, better, “High-Resolution” takes on specific requirements.