Dr. AIX

Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, a BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site, iTrax.com, in 2007. A frequent speaker at audio events and the author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

20 thoughts on “A Guest Post About High Sample Rates”

  • December 15, 2014 at 6:00 pm
    Permalink

    Interesting. I was talking about this topic with a friend the other day, and we began wondering if DACs actually do approximate the sinc function when filtering. I think it’s time for me to read those referenced papers and read up on some theory (and measurements) of DACs.

    Reply
  • December 15, 2014 at 9:04 pm
    Permalink

    At last an industry person, albeit an engineer rather than a marketing man, joins in to tell the truth. I’ve been trying to convey the same since my first comment in this blog, a few posts ago. The so-called ‘time domain considerations’ are a red herring, joining the ‘not enough samples to produce a smooth line’ one in the quest for more $$ for the industry. The sinc function is the mathematical brickwall low-pass filter, btw.
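To make the sinc remark concrete, here is a minimal sketch (not taken from the post or the cited papers) of Whittaker-Shannon reconstruction: each sample is weighted by a sinc pulse and the pulses are summed to recover the value between samples. The 8 kHz rate, the 1 kHz test tone and the function names are assumptions chosen for illustration, and the sum is truncated to a finite window, so the result is only approximate.

```python
import math


def sinc(x):
    """Normalized sinc: sin(pi*x) / (pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(samples, first_index, fs, t):
    """Whittaker-Shannon interpolation at time t (seconds).

    samples[k] is the sample taken at time (first_index + k) / fs.
    """
    return sum(x * sinc(t * fs - (first_index + k))
               for k, x in enumerate(samples))


# Assumed test setup: a 1 kHz sine sampled at 8 kHz (well below Nyquist).
fs = 8000.0
f = 1000.0
half = 2000  # samples kept on each side of t = 0 (finite window)
samples = [math.sin(2 * math.pi * f * n / fs) for n in range(-half, half + 1)]

# Evaluate the reconstruction 30% of the way between two samples.
t = 0.3 / fs
approx = reconstruct(samples, -half, fs, t)
exact = math.sin(2 * math.pi * f * t)
error = abs(approx - exact)
print(f"reconstructed={approx:.6f} exact={exact:.6f} error={error:.2e}")
```

The residual error comes from truncating the infinite sum, not from the sampling itself; widening the window shrinks it. A real DAC approximates this filter with a finite oversampling filter rather than evaluating it directly.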

    Reply
    • December 15, 2014 at 9:18 pm
      Permalink

      PS. There ARE well-known and much-discussed practical engineering problems (i.e. introduction of distortion) in implementing the filter in electronic DA converters, which can be overcome or mitigated to very high frequencies by internally oversampling to very high rates, but this should be taken neither as a limitation of the theory nor as justification that higher original sampling rates are required to avoid information loss below the Nyquist frequency.

      Reply
    • December 15, 2014 at 10:03 pm
      Permalink

      PS2. I just read this http://www.hifiplus.com/articles/mqa-its-about-time/?utm_campaign=Hi-Fi%2B+Weekly+Emails&utm_medium=email&page=2&utm_source=email-323, an attempt from someone to provide a plausible explanation of what MQA is all about. I’m not commenting on all the neuroscience stuff, because it is IMO unimportant in the context of our discussion here. The important bit to notice is the quote ‘…In overly-simplistic terms, when we sample a piece of music in PCM, we work to the frequency domain and bring the time domain along for the ride. Traditionally this has been no problem, because the response time for a human brain to process tones is slower than any potential inter-sample timing errors…’ Once again the author demonstrates an inexcusable lack of understanding of how the A-D-A conversion chain operates, when he talks about inter-sample timing errors… I’m sure we are going to hear much more of this mumbo-jumbo in the very near future.

      Reply
      • December 16, 2014 at 3:27 pm
        Permalink

        I’ll have to take a look. Running right now.

        Reply
    • December 15, 2014 at 11:30 pm
      Permalink

      PS3. Below quote is from Dan Lavry’s excellent paper http://lavryengineering.com/pdfs/lavry-sampling-theory.pdf
      summarizing what I have been pointing out all along, namely that the distinction between frequency and time domains is wrong since they are one and the same thing. The only difference is the ‘window’ used to look at it. People pointing to time domain considerations which are not addressed by frequency domain processing either do not understand signal processing theory and Fourier transforms or do it to intentionally mislead.

      ‘So if going as fast as say 88.2 or 96KHz is already faster than the optimal rate, how can we
      explain the need for 192KHz sampling? Some tried to present it as a benefit due to narrower impulse response: implying either “better ability to locate a sonic impulse in space” or “a more analog like behavior”. Such claims show a complete lack of understanding of signal theory fundamentals. We talk about bandwidth when addressing frequency content. We talk about impulse response when dealing with the time domain. Yet they are one of the same. An argument in favor of microsecond impulse is an argument for a Mega Hertz audio system’ (i.e. an audio system capable of delivering MHz frequency response…)

      Now pls. tell me to finally shut up and do something more constructive, like going to work. It’s morning here.

      Reply
  • December 15, 2014 at 9:06 pm
    Permalink

    Great post!
    If I understood a word of it, it would be even better I’m sure. LOL

    Reply
  • December 15, 2014 at 11:26 pm
    Permalink

    Hej Mark,

    Once again, greetings from the far north of Sweden, where all 9 million of us read your daily posts with joy in our hearts!

    Just a quick note… I have commented a few times how important it is to us readers that your message is disseminated to us hoi polloi. I know you fear repeating the facts of sound reproduction, but this stuff cannot be repeated often enough! This particular post seems to me a breakthrough of some sort… a quote from one of the insiders of international commercial music who would have every reason to remain silent, but who instead not only openly supports your mission, but engages scientifically to help dispel the myths of personal opinion or, worse, subjective truth in recorded sound reproduction.

    We in Sweden regard this as a sign that you are indeed being heard world-wide and that your work matters a lot and to many! Keep it up, fight the power and let us all bask in the sufficient warmth of 96 kHz!!

    vänliga hälsingar,
    bill

    Reply
  • December 16, 2014 at 2:34 am
    Permalink

    I believe this article is explaining why sub-10ms time shifting adds nothing.

    I must admit to not understanding the points being made.

    If the ear can detect 5ms difference as to when sound arrives, surely a higher sampling rate must help? Logically, this should suggest at least 200KHz – so, surely anything around this or a greater sampling rate must be good?

    Perhaps, someone can explain this using non-technical language?

    Thanks

    Reply
    • December 16, 2014 at 3:28 pm
      Permalink

      Julian…I will write about this and try to get it down in plain English. The fact is that we don’t need any higher than 96 kHz/24-bits.

      Reply
    • December 16, 2014 at 9:08 pm
      Permalink

      Julian,
      the essence is that what you call time shift (in the time domain) is the same as a phase shift in the frequency domain. What one may call an impulse in the time domain is the same as a rectangular wave of very high frequency in the frequency domain. There are NO phenomena in the time domain which are not reflected in the frequency domain. The theory goes that ALL information, including phase shifts, impulses or whatever have you that lies BELOW half the sampling frequency can be accurately captured and represented. There are no strange phenomena that happen below that frequency that are not captured. So the question remains the same. How high a frequency can be heard or sensed by a person? Asking for a 200 kHz sampling rate is exactly equal to suggesting that humans can hear frequencies up to 100 kHz. One cannot have it both ways (i.e. agree that about 20 kHz is the limit of human hearing, which we generously expand to about 40 kHz, while at the same time suggesting that there are ‘phenomena’ which happen at 100 kHz which need to be captured because they can be ‘heard’). Now it is easier to convince me that you are a bat or you have radar ears than to convince me that you aren’t but for some reason such a high sampling rate is required. One cannot have his pudding and eat it too.

      Reply
      • December 16, 2014 at 9:20 pm
        Permalink

        Just to add, in case this is still misunderstood, that ANY ‘time shift’ of a wave, which means any phase shift, however short that may be (meaning however small the phase shift) is ACCURATELY captured as long as the wave itself can be captured (i.e. is below half the sampling rate). Translating a time shift to frequency is NOT correct. Time shifts should be translated to phase shifts. Sampling provides CONTINUITY of capturing as long as what is captured is below half the sampling frequency.

        Reply
        • December 17, 2014 at 1:56 pm
          Permalink

          This site is a wonderful resource for countering the spurious claims out there in the hi-end audio world, not only for Mark’s yeoman work in generating his daily columns (SO appreciated!), but also for such intelligent responses as Nik’s. My only question is, how can so many in the hi-end world be so deluded? – this DSD worship (not least because of its “superior transient response” – a claim you encounter often on places such as SA-CD.net) is sheer madness! (Again, this is not to say that some great DSD recordings haven’t been produced, but jeez. . . !)

          Reply
          • December 17, 2014 at 5:16 pm
            Permalink

            The truth is out there but many refuse to acknowledge it.

    • December 18, 2014 at 4:19 am
      Permalink

      Julian, if I may have a try in plain English. If you sample a signal at 10 kHz or at 100 kHz, the latter may get the *shape* of the signal more accurately (because some of the signal is at frequencies too high for the 10 kHz sampler), but it won’t get the *position* of the signal on the time axis any more accurately — and that position is the ‘timing’. In fact, even at 10 kHz or even 1 kHz sampling, the timing of a digital sampling and reconstruction system is within a few nanoseconds.

      Humans are actually very poor at detecting the timing of a single channel of audio, but we are good at detecting the relative timing of two channels. Digital audio can *get* that relative timing very, very accurate no matter what the sampling rate.

      cheers

      Reply
  • December 18, 2014 at 3:05 am
    Permalink

    Nik, Are you saying that there is a link between the freq. and signal delay between our two ears?

    That is, although we can recognise a sub-10ms delay, this is irrelevant as it only applies to frequencies of 100Khz (1000/10).

    If so, given most people can’t hear above 15-20 kHz, then ignoring all other issues, does this mean that having a sampling rate of more than, say, 40 kHz doesn’t make sense when it comes to signal delays only?

    How do we then know that the brain can recognise 5ms delays, if we cannot hear the source?

    Linked Comment:
    I have read ‘music energy’ has been measured to around 100khz. Surely then, a 96khz sampling is too low, unless you are okay with the principle that some of the music is removed?

    I understand that such an imposed ceiling is one of the main concerns regarding DSD – although I suppose, this argument begins to dissipate with DSD256.

    I do not mind if a recording is in DSD or PCM. Actually, most of the time my choices are determined by the recording (classical piano).

    However, previously, I would typically buy a 192khz rather than 96khz versions because of the higher ‘music energy’ ceiling and the improved resolution of the signal delay (which appears to be untrue?).

    (I also admit to buying 384khz tracks from 2l… and yes, they sound magnificent…)

    Mark, I also welcome your non-technical response, if at all possible, regarding our spatial awareness of the sound source/the brain’s capacity to recognise signal delays.

    Thank you.

    Reply
    • December 18, 2014 at 10:35 am
      Permalink

      Check out today’s post. I purchased a new DXD recording from Promates and posted the spectrogram. It’s very interesting and shows the idiocy of 352.8 kHz audio…everything above 45 kHz is noise.

      Reply
      • December 18, 2014 at 12:05 pm
        Permalink

        Julian, as I tried to convey, the signal delay is manifested as a phase shift, not a frequency change. Such a phase shift is represented in a plot against time by the wave being displaced to the right along the x-axis (time axis). It has absolutely nothing to do with frequency in this case. A pulse, on the other hand, is manifested as a short square wave. If this pulse is below the Nyquist frequency then it is going to be sampled correctly; otherwise it is not, as with any wave.
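A small sketch of what "not sampled correctly" means above the Nyquist frequency: a tone above half the sampling rate yields exactly the same samples as a lower-frequency alias, so the two cannot be told apart afterwards. The 44.1 kHz rate and the 30 kHz tone are assumptions chosen for illustration, as is the helper name.

```python
import math


def sample_cosine(f, fs, count):
    """Return `count` samples of a unit cosine at frequency f, sampled at fs."""
    return [math.cos(2 * math.pi * f * n / fs) for n in range(count)]


fs = 44100.0            # assumed sampling rate; Nyquist is fs / 2 = 22050 Hz
f_above = 30000.0       # tone above Nyquist
f_alias = fs - f_above  # 14100 Hz, the frequency it folds down to

above = sample_cosine(f_above, fs, 64)
alias = sample_cosine(f_alias, fs, 64)

# The two sample sequences agree to floating-point precision.
max_diff = max(abs(a - b) for a, b in zip(above, alias))
print(f"{f_above:.0f} Hz and {f_alias:.0f} Hz differ by at most {max_diff:.2e}")
```

This is why converters low-pass filter the input before sampling: content above half the sampling rate has to be removed, or it reappears as a spurious in-band tone.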

        Reply
    • December 18, 2014 at 5:08 pm
      Permalink

      Hi Julian, the way they tested to find that we can detect a timing change of as low as 5 micro seconds (5 us), was to play an audible signal through 2 channels so that a phantom source was located between the speakers (or headphone drivers), then delay one channel very slightly until the subject can barely detect if the phantom source has moved. It has nothing to do with high frequencies; one can run the test with any frequency that we are good at locating — pretty much anything over about 500 Hz.

      And, as I explained yesterday in another comment above, digital audio gets the timing of the two channels right to within a few nanoseconds (a thousand times faster than 5 us), no matter what sampling rate is used. So, it is a non-issue, i.e. it is not a reason to up the sampling rate.
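The claim can be checked numerically with a rough sketch like the one below. It samples a 1 kHz tone on two channels, delays one channel by 5 µs (far less than one sample period), quantizes both to 16 bits, and recovers the delay from the phase difference between the channels. The tone frequency, amplitude, window length and function names are all assumptions for illustration, and the phase-comparison method is a swapped-in measurement technique, not how a listener or a DAC works.

```python
import cmath
import math


def tone_bin(samples, f, fs):
    """Single-bin DFT: correlate the samples with a complex exponential at f."""
    return sum(x * cmath.exp(-2j * math.pi * f * n / fs)
               for n, x in enumerate(samples))


def estimate_delay(fs, f=1000.0, delay=5e-6, periods=10, bits=16):
    """Estimate the inter-channel delay (seconds) of a sampled, quantized tone.

    Assumes periods * fs / f is a whole number, so the analysis window
    covers an integer number of cycles.
    """
    n_samples = round(periods * fs / f)
    amp = 0.9 * (2 ** (bits - 1) - 1)  # leave a little headroom below full scale

    def channel(shift):
        # Sample the tone delayed by `shift` seconds and round to integers.
        return [round(amp * math.sin(2 * math.pi * f * (n / fs - shift)))
                for n in range(n_samples)]

    left = channel(0.0)
    right = channel(delay)
    # A delay d appears as a phase shift of -2*pi*f*d between the channels.
    phase = cmath.phase(tone_bin(right, f, fs) / tone_bin(left, f, fs))
    return -phase / (2 * math.pi * f)


for rate in (8000.0, 44100.0, 96000.0):
    est = estimate_delay(rate)
    print(f"fs={rate:.0f} Hz: estimated delay {est * 1e6:.4f} us")
```

In this sketch the 5 µs offset is recovered to well within a fraction of a microsecond at every rate tried, even at 8 kHz where one sample period is 125 µs. The limit on timing accuracy here is quantization, not the spacing of the samples.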

      cheers

      Reply
  • December 18, 2014 at 11:44 pm
    Permalink

    Grant, Mark, Nik – thanks!

    Reply
