Dr. AIX's POSTS — 24 December 2014


Continuation from yesterday…

He continues with a discussion about the midrange improvements that high-resolution audio was supposed to offer. He states, “it was not in any way dependent on extended dynamic range.” After more than 7 years, he’s telling us that the study was really designed to test the audibility of “musical material in the upper 40 dB of the signal space, when played at normal levels.” Why didn’t they put that in the abstract of their paper? Why didn’t they limit their research and subsequent paper to the audibility of midrange attributes in hi-res vs. standard res? They should have limited their research to things like true timbres, consistent spatial imaging, and reverb integrity rather than conclude that there is no audible difference between high-resolution and standard resolution audio.

I guess I’m one of the “scornful people” that is arguing that high-resolution audio presents differences because of the increase in dynamic range and the potential for ultrasonic frequencies. I think CDs do a great job with timbre, imaging, and reverberation…and the midrange sounds just fine.

He’s right that real high-resolution recordings are rare. “In other words, there are very few discs on which one would expect any audible difference at all; none where the difference is easy to hear; and none where it is audible at normal playback levels. You have accepted the science behind all this! Welcome to reality. It appears our paper has had more influence than we could have dared to hope.”

They should have included some of these rare, real high-resolution discs but were satisfied with the likes of “Goodbye Yellow Brick Road” as a real high-resolution album. They tested standard resolution audio discs that didn’t contain the attributes that they were looking to find. The research was flawed and should be ignored.

Mr. Moran is satisfied that humans are unaffected by sounds above 20 kHz. “Everyone who argues that these are important has taken the research and its conclusions about human perception that were well established forty years ago, and thrown them out the window.” I don’t believe a proper study has been done that unambiguously establishes this one way or the other (the new Stuart/Craven study looks promising). But given that 96 kHz/24-bit has engineering benefits, doesn’t cost any more to implement, and provides a more accurate reproduction of the original music event, I’m going to err on the side of including ultrasonics. Why wouldn’t everyone?

The most interesting part of the Moran statement is the final paragraph. After having dismissed the importance of increased dynamic range in a previous paragraph, he points out that “there is one disc whose dynamic range exceeds the CD’s limit: The Hartke recording on Hilliard. When I discovered the properties of this disc, I turned the system way up and conducted a test using only the initial fade-in of the room sound. The difference was easily (and of course provably) audible. This is all in the paper, so, as has so often happened, those arguing that we didn’t use such a recording have not read what we wrote.”

Does this mean that they should have used dynamic range as a metric for hi-res audio? Perhaps so.


I’m still looking to raise the $3700 needed to fund a booth at the 2015 International CES. I’ve received some very generous contributions but still need to raise additional funds (I’ve received about $3400 so far). Please consider contributing any amount. I write these posts every day in the hopes that readers will benefit from my network, knowledge and experience. I hope you consider them worth a few dollars. You can get additional information at my post of December 2, 2014. Thanks.



About Author


Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site in 2007 called iTrax.com. A frequent speaker at audio events and author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

(24) Readers Comments

  1. Mark wrote “Does this mean that they should have used dynamic range as a metric for hi-res audio? Perhaps so.”

    But then 90% or more of audiophiles won’t be able to tell the difference between a CD using the full 16 bits and any other format using the full 24 bits. Those whose equipment has the capability and has been used at a level to realise the full range will most likely now be partially or fully deaf.

    • I find the dynamic range to be a very perceivable attribute of high-resolution audio…or maybe it’s just the lack of compression and limiting. If you have a system capable of great dynamic range, it’s not going to smash anyone’s ears unless they are listening to heavily mastered audio day after day.

  2. I really don’t care what particular aspect of improvement listeners hear or don’t hear.
    What I like to see is a definitive test using first class HD recordings and the best playback, everything we talk about here that is needed to hear the superior sound of HD.
    Then we’ll either have an outcome of 85 or more out of 100 correct responses proving the audibility of HD, or we’ll get something closer to 50/50, indicating no better than a guess.
    Time we had some double blind tests using equipment and processes that can’t be contested by any reasonable person.

    • I’m hoping to pull this off this year.

  3. If we assume a noise floor in the listening room of 30 dB SPL, then the peaks would have to be above 120 dB SPL in order for the room’s noise floor not to mask what information might exist in the additional 8 bits, right? That’s extraordinarily loud for a domestic hi-fi system. The arguments for recording and mixing with at least 24 bits, on the other hand, are much more obvious.

    • It is loud and would require a really first class system…but not out of reach. And transients at 120 dB SPL wouldn’t be a problem either.

      • I don’t believe there are many hi-fi systems that can deliver real-world dynamics; and for those that can how many recordings exploit the full dynamic range of 16 or 24 bits; and of those recordings how many capture the full range of, say, a drum kit. If I had a system capable of reproducing real-world dynamics, and reproduced said well recorded drum kit at accurate levels, I would be in danger of damaging my hearing.

        • You’re right, there aren’t very many, and even fewer discs that deliver the full range of a drum kit. Thankfully, we don’t listen to drums at maximum level…but the rim shots and loud passages can be dealt with using current state-of-the-art systems.

        • Dave, I agree with you. By the time a system can pop out 120 dB peaks in a domestic living room, your ears will be cooked. As stated, CD dynamics already exceed the vast majority of playback systems. Hi-Res (should) be about full harmonic retrieval, low level linearity, ambience, natural sonic textures, and time domain performance, the areas where CD falls down and vinyl excels. Most homes have surprisingly high ambient noise floors, and 105 dB max is plenty even for the Rolling Stones, in a domestic environment. It’s about quality/purity of reproduction, not a car dB shootout. Believe it or not, many studio guys have suffered threshold shift due to high monitoring levels, and many studio monitors that play big suck in other respects. The whole idea of “translating” is such backwards b.s. If you make a recording to sound excellent on an honest, high-end stereo, it will sound good on anything; you can have my house if that’s not true. Happy New Year!

          • The vast majority of sound coming from a live player…a drummer or percussionist or amplified guitarist…is far below the transient peaks. The recorded peaks should be reproduced through the playback system. You can set your level anywhere you want…but getting real world dynamics is a big plus for those of us that want it. I wouldn’t want to listen to the Rolling Stones at 105 dB because that would be the RMS value, not the peaks…since there are no peaks in a heavily mastered recording.

    • My Klipsch La Scalas would do 105 dB SPL at 1 watt, and 121 dB SPL max. And they can do it at unheard-of low distortion levels. SOTA horns do rule.

      • Yes, but you need the original recording to have the full dynamic range to exploit that. See above posts.

        • I agree…you need the original recordings AND they must have the session dynamics intact.

          • What’s the maximum difference you found in your, or anyone else’s, recordings between the peak and average level of the track? I haven’t come across any tracks where setting the maximum level to 120 dB SPL wouldn’t mean the average level was horrendously loud. I’m not saying it’s impossible to generate and capture such sounds, just that I don’t find them in real recordings, and I primarily listen to classical music, which tends to have a wide dynamic range. If you do have some examples, I think that would be worth a post.

          • I was faulted in TAS magazine for my Bolero recording having too much dynamic range. By avoiding any dynamic compression, I provided customers with a real world contour of the piece. Classical works (the Great Gates of Kiev, the Pines of Rome and Rite of Spring) all contain passages that exceed 90 dB. The LA Philharmonic places baffles around the heads of some woodwind players to lower the level of the brass and percussion. Of course, from the audience the inverse square law reigns, but on stage those levels do happen.

      • Yes, but only if you like listening to your speakers more than listening to the music. When La Scalas were designed, the word ‘transparent’ was virtually never applied to audio playback. It’s a tough choice, I agree, but audio is all about trade-offs and the balancing act. For example, little KEF LS-50s would fly away trying to play dynamically, but at normal levels they would embarrass the big horns for natural voice quality, tonal balance, detail, and dimensionality. One of my earliest lessons in hi-fi is that a great stereo is not a glorified P.A. system.
        One is a sonic billboard, the other is a digital photo. Thanks for hearing me out.

        • So, for example, what is the difference between the peak level at the end and the average level in the opening of your recording of Bolero?

          An interesting point, quite apart from the measurements, is that once the music starts getting very loud, your ears will introduce their own compression – see http://en.wikipedia.org/wiki/Acoustic_reflex. I don’t think anyone ever studied the effects of the acoustic reflex on the high-end audio listening experience. Could be interesting…

          • The dynamic range indicated in Audition is 99 dB. You’re right the ears do their own compression for very loud sounds.

  4. “But given that 96 kHz/24-bit has engineering benefits, doesn’t cost any more to implement, and provides a more accurate reproduction of the original music event, I’m going to err on the side of including ultrasonics. Why wouldn’t everyone?”

    Amen Mark, I’m with you on that, and I will hang my head and publicly admit that I reported hearing no differences between any of the files. But my system is not really HD quality and my 65-year-old ears have been subjected to large amounts of unprotected armor, mortar, and rifle fire as an infantryman in Vietnam, and also as an avid shooting sportsman for the past 40 years.

    But I do believe under the best circumstances that HD recordings will be audible. And as you say with the cost of HD producing falling to a level about the same as analog or 16/24, it would be silly not to use the best tools available to us. My hope is that soon HD recordings “96/24” will become the norm and retail prices will fall to the level now being charged for the inferior products today.

    • What does that dynamic range indicator in Audition really mean? If it’s the highest peak versus the lowest valley, it’s not all that helpful. For example, you could be recording a rock band in a quiet room. So long as the recording included some time they weren’t playing, you’d end up with a huge reported dynamic range. At the other extreme, comparing the peak value to the average across the entire track – as is done for the tracks in the DR database – doesn’t make sense for tracks with different averages in different sections, as we find in pretty much all classical music. Measuring the average level in the quieter passages – though still when there is stuff going on –, and comparing it to the track’s peak could help us better judge the usable dynamic range. For example, if the first statement of the theme is 50 dB below the absolute peak in the track, and we want to play that back such that the opening theme is around 60 dB SPL in the listening chair, then the noise floor of a 16-bit system will still be below the probable noise floor of the listening room.

      I’m happy to use the additional storage space for 24 bits, even if they aren’t absolutely necessary, but I would like to know if there are any recordings that would really necessitate the extra bits.

      • Andrea, you’re right about including the fade in and fade out to silence when doing the dynamic testing in Audition. That’s why I always scan the selection from the start of signal to the end of signal…this gives an accurate impression of the dynamic range. If the DR database is doing what you’ve described, that’s a false reading.

        • Measuring from start of signal to end of signal does give a useful measure of dynamic range. However, going back to my rock band example, if they rest for one beat in the middle of an otherwise huge wall of sound, you would still get a very high reading. I think the method I described in my previous post would give a better indication of how the dynamics of the recording would relate to the experience of playing it back. The bottom of your 99 dB figure is still just the background sound of the hall. You wouldn’t, necessarily, set that as equivalent to the 30 dB or 40 dB SPL noise floor of the listening room, so the peak wouldn’t be 99 dB higher. Given your recording methods, something like the Bolero is likely to be at the upper end of what we would see, if we could examine all of the high-resolution recordings out there. That’s why I am interested.

          Thanks for your willingness to engage in this dialogue.

          • I have to chuckle when you mention the rock band and dynamic range…honestly, how many rock tunes can you name that have meaningful breaks in them? Whenever I take a look at a waveform and analyze it, it’s necessary to use some judgement about where to scan the files.

            As for my Bolero recording, it was criticized by a reviewer for The Absolute Sound because it had “too much dynamic range”. I couldn’t believe it! I captured the natural dynamics of the performance in Bucharest, left the natural dynamics intact, and delivered that sound on the DVD-Audio that I released…and that was somehow a mistake in their mind. I believe that leaving the natural dynamic contours of a selection of music is appropriate for certain types of music. Using high-resolution recording techniques and a purist post production approach makes it possible to record and deliver natural dynamics for the first time. Whatever your own playback circumstances are (at home or on the road) isn’t my concern. It is much easier to compress the dynamics to suit your situation than to leave the compression to me.

  5. I entirely support your position that recordings of classical music should be done with zero compression and minimal postprocessing.

    For those readers who might be interested, I did the analysis I described on the Reference Recordings recording of Pines of Rome which I had downloaded in 24/88.2 from HDtracks, since RR is also well-known for wide dynamic range. The average level of the clarinet solo is about 50 dB below the track’s peak. (The peak to valley difference that I believe Audition would report is much higher.)

    To put that in context, if you were to play back that clarinet solo at 55 dB SPL – more or less how you would hear it in the concert hall –, the peak would be 105 dB SPL. The noise floor of a 16-bit system would be below the noise floor of pretty much any listening room.

    As I stated previously, I’m okay with the storage penalty for 24-bit, but I’d be interested to find a recording that makes a solid case for better than 16-bit in a delivery format.
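
For readers who want to check the arithmetic in comments 3 and 5, here is a rough sketch in Python. It assumes the common textbook approximation of 6.02 × N + 1.76 dB for the dynamic range of an ideal, undithered N-bit quantizer with a full-scale sine wave; real converters, dither and noise shaping will shift these numbers. The function names are made up for illustration and do not come from the post or from any audio tool.

```python
def quantizer_dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of an ideal, undithered N-bit quantizer
    driven by a full-scale sine wave (assumed textbook formula)."""
    return 6.02 * bits + 1.76


def peak_spl_db(passage_spl_db: float, passage_below_peak_db: float) -> float:
    """Peak playback level when a quiet passage is set to a chosen SPL."""
    return passage_spl_db + passage_below_peak_db


def format_noise_floor_spl_db(peak_db: float, bits: int) -> float:
    """Acoustic level of the format's noise floor for a given peak level."""
    return peak_db - quantizer_dynamic_range_db(bits)


# Comment 5: clarinet solo about 50 dB below the track peak, played at 55 dB SPL.
peak = peak_spl_db(55.0, 50.0)
for bits in (16, 24):
    floor = format_noise_floor_spl_db(peak, bits)
    print(f"{bits}-bit: peak {peak:.0f} dB SPL, format noise floor {floor:.1f} dB SPL")

# Comment 3: peak level at which the 16-bit noise floor would rise
# to the level of an assumed 30 dB SPL room noise floor.
room_noise_spl_db = 30.0
needed_peak = room_noise_spl_db + quantizer_dynamic_range_db(16)
print(f"16-bit floor reaches room noise at a peak of {needed_peak:.1f} dB SPL")
```

Under these assumptions the 16-bit noise floor in the Pines of Rome example lands at roughly 7 dB SPL, well under a 30 dB SPL room, and the peak would have to approach roughly 128 dB SPL before the 16-bit floor reached the room noise, which is in the same territory as the "above 120 dB SPL" figure in comment 3. This treats broadband noise levels only and ignores how the ear can pick out tones below a noise floor.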
