Dr. AIX

Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, a BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site in 2007 called iTrax.com. A frequent speaker at audio events and author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called, "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

29 thoughts on “Audiophile Midterm Answers”

  • stewart kiritz

    I’m wondering about this assertion regarding SACDs. To me there is an enhancement. Are you saying this is just a result of my expectations and that SACDs do not have any greater fidelity than ordinary CDs?

    Reply
    • Yep…SACDs are standard fidelity…not much better than CDs in terms of specs.

      Reply
    • Also, when an album is re-released as an SACD, it is often remastered in the process. So on the few SACDs I own, I attribute their superior sound almost exclusively to the improved ADC and mastering choices of the engineer. And on the flip side, I have listened to SACDs that sound worse than the CD release, again a matter of personal preference in mastering.

      Reply
  • First, I repeat that this test is my favorite article you’ve posted yet, among many good articles. The questions revealed deep insight and thoughtfulness. Outstanding!

    That stated, I only got #9 incorrect according to the answer key and as is my rep from my school years many years ago, I’m going to state my case for my answer 🙂

    I selected “b”, and the answer key says “e”. I think what drove me to what I selected was the notion that noise-shaped dither (which is available in any competent sampling app these days) is specifically designed to avoid equally spreading digitization errors across the audible frequency range, and specifically designed to push such errors above the audible range (or at least to less audible areas per Fletcher-Munson curves) as much as possible. As a result of using noise-shaped dither, dynamic range issues can be avoided and in many cases dynamic range can be enhanced relative to not using dither across much of the audible frequency range when resampling.

    But I quibble – this quiz really demonstrated your depth of knowledge of many of the keys to high quality audio. Great job!

    Reply
  • Alex S

    I enjoyed this a lot. I’m ashamed to say I got a B, missing questions 5 and 8. Especially 8…after we just finished going over this…I’d like to have a test again in a few months.

    Reply
  • Wayne Smith

    I’m trying to understand Q4. “The mastering process used on commercial release reduces the overall dynamic range of virtually all CDs to less than 10-12 bits or less than 4-bits.”
    Was that meant to read to less than 10-12 dB or less than 4 bits?

    Reply
    • You’re right…I mistyped

      Reply
  • Dave Griffin

    This is the problem with the current education system: you presuppose that all answers would come from analytical judgement. A modern student would simply search your history on the “web” to get all the “correct” answers and thus an “A”.

    Reply
  • Kenrick Harry

    Professor,

    My score was a “D”, 6 correct out of 10. I had a lot of fun doing the exercise and look forward to more of the same in the near future.

    Reply
  • Got an A – was a busy bee :-))

    Reply
  • Édouard Trépanier

    Thank you for the opportunity to qualify my “not so good” result. So, with all due respect Professor:

    2. Yes, DSD 64 will roll off high frequencies to cut off noise, and this may or may not be good enough for HRA. We will see how HRA is defined on the market. However, the high frequency potential and mostly the dynamic range (including SNR) are better than a CD’s potential. That said, I would have all my music in UHRA (PCM 192/24) if I could.

    4. I guess this one is a pushback. The question is about capturing the music from the band. That is the « pick up » stage. Whatever goes out on the market, an audio engineer always seeks the best dry master. If I had the chance to record « Babe I’m Gonna Leave You » with Led Zeppelin, I would want 24 bits: 18 for Mr. Page’s acoustic guitar, a few more bits for Mr. Bonham (son) and 4 more for the post-prod including dither.

    6. Have you ever heard a well-protected CD (kept in its jewel case) sound better after some gel application? If a CD needs to be cleaned, I believe one does not need any “audiophile “gels” or other “enhancement” liquids” but only distilled water. Furthermore I would think that “pits” have to be badly dirty to cause the 0s and 1s not to be read correctly.

    Reply
    • True…for the recording side. But the delivery side post mastering is where we lose it.

      Reply
  • Bob Walters

    Because of the nature of sigma-delta converters, one cannot make a direct comparison between DSD and PCM. An approximation is possible, though, and would place DSD in some aspects comparable to a PCM format that has a bit depth of 20 bits and a sampling frequency of 96 kHz.[22] PCM sampled at 24 bits provides a (theoretical) additional 24 dB of dynamic range. – Wikipedia “DSD”

    I have no axe to grind, but equating DSD with PCM44 seems a stretch. Maybe 24/88?

    Bob

    Reply
    • You might convince me that the dynamic range is improved with DSD over CD…but the fact remains that no recordings are delivered with lots of dynamic range. The frequency thing, I’m holding tight. The high frequency shifted noise is present at 23 kHz.

      Reply
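The 24 dB figure in the Wikipedia quote above follows from the rule of thumb that each extra bit of PCM word length adds about 6.02 dB of theoretical dynamic range. A minimal Python sketch of that arithmetic, assuming an ideal quantizer and a full-scale sine wave (the function names are illustrative, not from any library):

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit


def extra_dynamic_range_db(bits_a, bits_b):
    """Theoretical dynamic range gained going from bits_a to bits_b."""
    return (bits_b - bits_a) * DB_PER_BIT


def ideal_sine_snr_db(bits):
    """Textbook SNR of an ideal N-bit quantizer driven by a full-scale sine."""
    return DB_PER_BIT * bits + 10 * math.log10(1.5)  # 6.02 * N + 1.76 dB


gain_20_to_24 = extra_dynamic_range_db(20, 24)  # roughly 24 dB
cd_snr = ideal_sine_snr_db(16)                  # roughly 98 dB
```

These numbers are ceilings for the format, not for any given release. As the replies in this thread point out, what survives mastering on a delivered disc is usually far less.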
  • I have a quibble regarding Q9. While a proper “flat” triangular dither will bring the measured dynamic range of a CD down to 93 dB, a noise-shaped dither will improve the perceived dynamic range (as in how the auditory system “hears” it) by some amount resulting in a dynamic range greater than 96 dB depending on the dither method–while at the same time worsening the measured dynamic range.

    Also regarding the answer to Q9, you say “dither is a … technique for removing or masking…” But “removing” and “masking” are different things. Dither does not “mask” quantization noise at all, but it certainly does remove quantization error. “Masking” implies that the error is still there but covered up. But really, a proper dither will remove the quantization error entirely at the expense of the added low-level (inaudible under standard listening conditions) noise. I’ve seen audiophiles argue that dither “simply masks” the quantization error by burying it in noise, but nope. It actually removes the error.

    Reply
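The "removes rather than masks" point in the comment above can be illustrated with a toy experiment. This is only a sketch under stated assumptions: a mid-tread rounding quantizer with a step of 1, flat TPDF dither spanning plus or minus one step, and a 1 kHz sine whose amplitude is 0.4 of a step (all values chosen for illustration, none taken from the post). Without dither the tone rounds to silence; with dither its amplitude can still be recovered from the quantized samples, at the cost of added noise.

```python
import math
import random


def quantize(x, step=1.0):
    """Round x to the nearest multiple of step (mid-tread quantizer)."""
    return step * math.floor(x / step + 0.5)


def tpdf_dither(rng, step=1.0):
    """Triangular-PDF dither spanning +/- one step (sum of two uniforms)."""
    return (rng.random() - 0.5) * step + (rng.random() - 0.5) * step


def recovered_amplitude(samples, freq, rate):
    """Estimate the amplitude of a sine at freq by correlating against it."""
    n = len(samples)
    acc = sum(s * math.sin(2 * math.pi * freq * i / rate)
              for i, s in enumerate(samples))
    return 2.0 * acc / n


rng = random.Random(0)          # fixed seed so the run is repeatable
rate, freq, n = 48000, 1000, 48000
amplitude = 0.4                 # in units of one quantization step (assumed)
signal = [amplitude * math.sin(2 * math.pi * freq * i / rate) for i in range(n)]

plain = [quantize(s) for s in signal]
dithered = [quantize(s + tpdf_dither(rng)) for s in signal]

plain_amp = recovered_amplitude(plain, freq, rate)        # tone is gone
dithered_amp = recovered_amplitude(dithered, freq, rate)  # tone survives
```

The sketch uses flat dither only; noise-shaped dither, as discussed in the comments, additionally moves the added noise toward frequencies where the ear is less sensitive and is not modeled here.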
  • Mark,

    Regarding Q8. How do you define resolution? If we reduce the noise floor, don’t we gain “clearer” data? hence resolution?

    Reply
  • Can you please explain a bit more no. 8? For me it’s counterintuitive that increasing the number of available bits does not improve the fidelity of the recording. Great post anyway.

    Reply
    • The benefit from increasing the number of bits in a PCM digital word is a lower noise floor or, put another way, an increase in the potential dynamic range. I have authored several posts and my friend John Siau from Benchmark Media has provided great insight into this topic as well. Click here to read my post.

      Reply
  • Roger Jones

    DYNAMIC RANGE

    Hello Mark,

    I have been following your posts for a while now and found them very interesting and informative. I think we are all after the same thing, better sound. The issue remains how to get it, regarding both technical and marketing issues.

    The main quibble I have with what you write is your use of the term dynamic range. Correct me if I am wrong; you define the dynamic range of any specific recording as the difference between the maximum and minimum of the amplitude envelope during the time of the recording.

    Since I was an audio equipment designer in the 1960s, I defined dynamic range as the difference between the loudest and softest tones the human ear can detect in a recording. For example, a person can hear a violin playing very softly in the presence of very loud, sustained percussion. I contend this definition is the one that is important in determining how many bits it takes to make a faithful recording. The soft violin with its even lower harmonics must be accurately reproduced even in the presence of other loud instruments.

    When talking about types of recording and playback, I use the term signal-to-noise ratio to quantify the difference between the maximum signal before overload and the noise floor. The dynamic range can be somewhat greater if the minimum detectable signal is below the noise floor, the nature of the noise being random (Gaussian) and broadband. Perhaps I nitpick because these distinctions are important to instrumentation engineers like me. Depending on how low (or high) the noise floor of a recording is, I have found the ear can often hear instruments that are playing below the noise floor. A very fine playback system can spatially separate an instrument from the background noise.

    Roger

    Reply
    • Interesting Roger. I’m running this morning but will get back to this comment soon.

      Reply
    • I’ll be interested in Mark’s thoughts on this of course, but I think you bring up several insightful points, Roger. In the area of music measurement, I don’t believe there is a universally accepted definition for “dynamic range” yet, so I would recommend that anyone using such a term or similar define it when used. I also agree with (or perhaps prefer) a dynamic range definition for music that reflects audibility impacts instead of just signal amplitude. I believe that’s consistent with perhaps the best source of international standards on the concepts of loudness and the like – ITU BS.1770, which includes EBU R128 measures for “peak”, “loudness” and “loudness range” (aka LRA). There is no BS.1770 measure (yet) for dynamic range, but a few credible people use something called “PLR” (peak to loudness ratio), which is essentially a crest factor measurement that uses the BS.1770 definitions for peak and loudness, and is thus audibility-based instead of just signal amplitude-based like TT DR.

      Mark, hopefully you recall I pontificated in a similar manner when wondering if or how your upcoming HRA database would define and measure dynamic range-related info that, at least for me, is perhaps the most critical factor in deciding to buy HRA versions of CD quality music I already own. I realize that any measure that can’t be efficiently performed for a large number of songs is not very practical, so there are tradeoffs to consider in this area. Fascinating topic.

      Reply
      • Thanks for the input…I can see that I’m going to have to look into this more.

        Reply
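To make the "crest factor" idea behind PLR concrete, here is a deliberately simplified Python sketch. It is not a BS.1770 implementation: it uses the plain sample peak and unweighted RMS as stand-ins, whereas the standard's peak and loudness measures involve oversampling, frequency weighting and gating that are omitted here. The function name and the test signals are assumptions for illustration.

```python
import math


def crest_factor_db(samples):
    """Sample peak divided by unweighted RMS, in dB (simplified stand-in for PLR)."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(peak / rms)


rate, freq = 48000, 1000
sine = [math.sin(2 * math.pi * freq * i / rate) for i in range(rate)]

# Hard-clip at half scale as a crude model of heavy limiting.
clipped = [max(-0.5, min(0.5, s)) for s in sine]

sine_crest = crest_factor_db(sine)        # about 3 dB for a pure sine
clipped_crest = crest_factor_db(clipped)  # smaller: peaks sit closer to the average
```

A lower value after clipping is the kind of change a crest-factor style measure is meant to expose, regardless of how many bits the delivery format offers.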
    • Separate from dynamic range definitions, my understanding of “noise floor” and the audibility of sounds below the “noise floor” is different from yours, Roger, but I’m no expert in this area. In the past when friends had claimed that they could hear music below the noise floor, I was able to trace down the cause of that to different definitions of “noise floor”. More specifically, many people tend to define “noise floor” as an A-weighted or RMS or similar spectrum-wide measure, and I define “noise floor” as a plot of frequency and decibels. In every case I encountered when someone claimed that they could hear music below the noise floor, what they really meant was that the frequency content of what they were hearing was above the noise level at those same frequencies, even though the A-weighted or other spectrum-wide measure showed a higher dB than the music sounds in question. In short, I don’t believe anyone can discern music content below the noise level at that same frequency. It gets tricky when you might hear harmonics of a fundamental tone in music: the noise floor can sit above the fundamental at the fundamental’s frequency yet below the harmonic(s) at theirs, so one could hear the harmonic content even though the fundamental is buried.

      Reply
      • I’m with you on this…no one can perceive or hear music elements that are lower than the noise floor.

        Reply
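The two meanings of "noise floor" in this sub-thread can be compared numerically. The sketch below is a measurement illustration only and says nothing about audibility. It assumes white Gaussian noise, an 8192-sample analysis window and a tone placed exactly on one DFT bin (all choices are illustrative). The tone is well below the noise as a spectrum-wide RMS figure, yet well above the noise measured in its own frequency bin.

```python
import math
import random


def bin_amplitude(samples, k):
    """Amplitude of DFT bin k, scaled so a sine of amplitude A reads about A."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


rng = random.Random(1)          # fixed seed so the run is repeatable
n, tone_bin, tone_amp = 8192, 400, 0.2
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]
tone = [tone_amp * math.sin(2 * math.pi * tone_bin * i / n) for i in range(n)]
mix = [a + b for a, b in zip(noise, tone)]

# Spectrum-wide view: tone RMS relative to total noise RMS (negative dB).
broadband_ratio_db = 20 * math.log10(rms(tone) / rms(noise))

# Per-frequency view: tone bin relative to the average of nearby noise-only bins.
neighbours = [k for k in range(tone_bin - 10, tone_bin + 11) if k != tone_bin]
noise_level = sum(bin_amplitude(mix, k) for k in neighbours) / len(neighbours)
per_bin_ratio_db = 20 * math.log10(bin_amplitude(mix, tone_bin) / noise_level)
```

Note that the per-bin noise level depends on the window length: a longer window spreads the same noise power over more bins, so the per-frequency figure is only meaningful alongside the analysis settings used.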
  • 6. If you accept that some of these products can improve reading of the pits, then all are true. If the laser sled has less hunting to do, then its servo is producing less electrical noise. It may also reduce jitter. In a one box CD player, both have the potential to improve the sound. That’s the theory, anyway; this comment should not be taken as advocating such tweaks.

    8. I’m still waiting to read your definition of resolution that causes this question to be false. Every reference I’ve been able to find – whether related to audio or analog to digital conversion in general – is consistent about more bits equaling greater resolution. Being able to discriminate smaller fluctuations within the same range – e.g. -10 to +10 V – is consistent with the usual description of resolution in a measurement system. (Noise shaping isn’t free. Increasing resolution in one frequency range costs you resolution elsewhere. Adding dither has a number of benefits, but it doesn’t increase resolution either.)

    Your definition of HRA is different than that of the CEA and DEG, but we can still have intelligent conversations about it, because you have clearly defined what you mean by it. If you are going to use a nonstandard definition of resolution, you should define it equally clearly. Until you do, I think you’ll continue to get pushback on this assertion.

    Reply
    • I would point you to John Siau’s paper at the Benchmark website. The addition of more bits doesn’t increase the resolution of a sampling system; they lower the noise floor. This is different than having more amplitude levels within the same amplitude range.

      I’m currently traveling in Grand Rapids, Michigan for the weekend…and have limited time.

      Reply
  • Andrea

    Increasing the number of bits does give you the ability to record more amplitude levels in the same amplitude range within a given bandwidth. If you think this isn’t true, I suggest you go back to the beginning on learning how analog-to-digital converters work. In many cases, the least significant bits are just recording random fluctuations in the analog noise floor of the system. That is still an increase in resolution of the digital system.

    I find Mr. Siau’s white papers and application notes to be generally well-written and informative. I would also put more faith in what he writes than in what is put out by other audio designers. However, in the article to which you are referring, he doesn’t define resolution either, and cites no references. That makes the article far from authoritative.

    Reading through the comments over all of your related posts, I see I am not the only person with a technical background who has trouble with your statement. Perhaps if you, or he, can be more specific as to why you believe the ability to track smaller signals within the same overall amplitude range does not constitute higher resolution, we’d find that we all agree.

    Reply
    • I think we’re discussing different ideas here. I trust John on this point that increasing the number of bits doesn’t give you additional resolution…at least how we define it.

      Reply
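For readers following the "resolution" exchange, the measurement-system definition the commenters cite comes down to the size of one quantization step over a fixed range. A small Python sketch of that arithmetic, assuming an ideal uniform quantizer and the -10 to +10 V range used as an example in the comments (the function name is illustrative):

```python
def step_size_volts(bits, v_min=-10.0, v_max=10.0):
    """Smallest amplitude step of an ideal uniform quantizer over [v_min, v_max]."""
    return (v_max - v_min) / (2 ** bits)


step_16 = step_size_volts(16)  # about 305 microvolts
step_24 = step_size_volts(24)  # about 1.19 microvolts
ratio = step_16 / step_24      # 2 ** 8 = 256 times finer
```

Whether those finer steps count as added "resolution" when the lowest bits mostly track analog noise is exactly the definitional question in dispute here; the sketch only shows the arithmetic both sides start from.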
