Dr. AIX's POSTS

Coming to Terminology: Defining Quality Not Sources

The discussion of high-resolution terminology requires a common set of terms, assumptions and basic concepts. It should really mean something when you say something is high-resolution or ultra high-resolution (I saw that one yesterday associated with a release from Reference Recordings…a PCM recording released at 176.4 kHz/24-bits).

Over the course of 10 years or so, I’ve talked about high-resolution terminology and definitions with hundreds of people, from audio engineers to audiophiles to individuals at consumer electronics companies. Thus far, I’ve been unable to get a consensus even among professional audio engineers. Unlike the working groups that found common ground on defining the various formats in video delivery (standard definition, high-definition and now ultra high-definition), music and audio engineers and other interested parties have not been able to establish meaningful and absolute definitions. This is a problem.

I believe that the terminology has to be universal, convenient and simple, and has to communicate the essential quality associated with a particular recording. Yesterday, I dismissed the “Studio Master” term embraced by some as being meaningless. Don’t we want the descriptive term or a logo to mean something qualitatively? I think so. And the term “Studio Master” doesn’t do that…it is more closely aligned with the provenance of a given recording. Unless you know the history of audio and the equipment and procedures that were employed in the production of individual projects, you can’t know what level of quality to expect with a generic term like “Studio Master”. There is value to this descriptive/historical designation…I’ll come back to it tomorrow.

Ideally, we want a term that helps us to identify the quality of the experience that we’re likely to have when listening to a track, not some reference to the output of the studio that mastered the recording. Additionally, we can’t rely on the specification numbers or the format to have any meaning with regard to fidelity. Just because something is delivered in a 384 kHz/32-bit data bucket doesn’t mean that it meets the potential of a PCM recording of those specifications.

So I start from a different place than most advocates for High-Resolution Audio. At the present time I believe there should be three quality designations and within them categories for both source and delivery formats and specifications. Here are the three major categories:

High-Definition or High-Resolution – Recording and playback systems that have the potential to meet or exceed the capabilities of natural human hearing. New recordings done at 96 kHz/24-bits or better with the intent to maximize fidelity.

Standard-Definition or Standard Resolution – Recording and playback systems that have the potential to capture and reproduce a frequency range of 20 kHz and 60-90 dB of signal to noise ratio. Think analog tape, vinyl LPs and compact discs.

Low-Definition or Low Resolution – Recording and playback systems that have the potential to capture and reproduce a frequency range of 15-18 kHz and 20-40 dB of signal to noise ratio. Here we’ll have 128 kbps MP3, HD-Radio, AAC and other heavily compressed (data compression) formats.
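As a rough illustration of how the three categories above could be applied, here is a minimal Python sketch. The bandwidth and signal-to-noise figures come from the descriptions above, but the exact cut-offs are assumptions made for the example: 48 kHz (the Nyquist bandwidth of a 96 kHz recording) and more than 90 dB as the floor for the top category, and anything that falls short of the standard figures counted as low. The names `Tier` and `classify_potential` are hypothetical, and the example figures are illustrative rather than measured.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Quality tiers, ordered so that a higher value means higher potential."""
    LOW = 1
    STANDARD = 2
    HIGH = 3


def classify_potential(bandwidth_khz: float, snr_db: float) -> Tier:
    """Classify the *potential* of a recording or playback system.

    Cut-offs are assumptions for illustration only:
      HIGH:     bandwidth >= 48 kHz and SNR > 90 dB
      STANDARD: bandwidth >= 20 kHz and SNR >= 60 dB
      LOW:      everything else
    """
    if bandwidth_khz >= 48.0 and snr_db > 90.0:
        return Tier.HIGH
    if bandwidth_khz >= 20.0 and snr_db >= 60.0:
        return Tier.STANDARD
    return Tier.LOW


# Illustrative (bandwidth in kHz, SNR in dB) pairs, not measurements.
examples = {
    "96 kHz/24-bit PCM": (48.0, 120.0),
    "compact disc": (20.0, 90.0),
    "128 kbps MP3": (16.0, 40.0),
}

for name, (bandwidth, snr) in examples.items():
    print(f"{name}: {classify_potential(bandwidth, snr).name}")
```

The function only describes what a system could deliver; it says nothing about whether a given recording actually uses that potential.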

I’m being very careful to describe the “potential” of each level…because it is possible to use an HD-Audio container to deliver a low-definition/resolution recording. In fact, it happens all the time…most of the commercial recordings that are made available through iTunes fall into this category.

An analogy would be the quality of a family 8mm movie shot back in the late 1950s after it has been telecined to an ultra HD-Video format container. The quality of the source video is still that of the 1950s-era film.

We have to maintain the distinction between the potential fidelity of the source and the potential of the delivery format. It makes little sense to transfer a DSD 64 recording with all of its inherent “out of band” noise to a 192 kHz/24-bit PCM file…although there are plenty of vendors doing exactly that.
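One simplified way to picture the source/delivery distinction is to label the two separately and treat the lower of the two as the best the listener can get. The sketch below assumes that "lower of the two" rule as a shorthand; the names `Tier`, `effective_tier` and `label` are hypothetical and not part of any standard.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Quality tiers, ordered so that min() picks the weaker one."""
    LOW = 1
    STANDARD = 2
    HIGH = 3


def effective_tier(source: Tier, delivery: Tier) -> Tier:
    """Assumed rule: a container cannot add fidelity the source never had."""
    return min(source, delivery)


def label(source: Tier, delivery: Tier) -> str:
    """Build a 'source/delivery' label such as 'STANDARD/HIGH'."""
    return f"{source.name}/{delivery.name}"


# A standard-definition source (for example, an analog tape master)
# delivered in a high-definition container.
source, delivery = Tier.STANDARD, Tier.HIGH
print(label(source, delivery), "->", effective_tier(source, delivery).name)
```

Keeping both halves of the label visible is the point: the delivery format alone would suggest more than the source can provide.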

To be continued…

Dr. AIX

Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site in 2007 called iTrax.com. A frequent speaker at audio events and author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

9 thoughts on “Coming to Terminology: Defining Quality Not Sources”

  • Bill Brandenstein

    Mark, you’re doing everyone a huge service by exposing the bad engineering and faulty sales logic behind so much of this junk, when in fact audio should consistently sound and test out better than anything in history. You are bringing new facts and challenges to light pretty much daily, for which I am grateful. Last night I discovered something that illustrates your cause perfectly.

    I never invested in DVD-Audio equipment, and have only a pair of discs in my collection playable to me as Dolby Digital only. So I installed DVD Audio Extractor and gave it a try, and opened a handful of the resultant multichannel wave files in Adobe Audition CC to see what there was to see. I was not pleased. For example:
    – Teldec’s Beethoven Ninth Symphony is labeled as high resolution 24/96 5.0 audio. That’s the delivery format only, however, as the complete lack of information above 24 kHz indicates that it was recorded at 48 kHz. Speechless!

    – I found one “demo” track for other recordings with incorrect channelization, so the center and sub channel were duplicates of the surround left/right.

    – Chanticleer’s Palestrina disc was at least recorded at 96 kHz; the spectrum proves it. Yet, on one of the tracks a sampling error (bad D/A setup, perhaps?) occurred on only one channel (front left) so that all the content above 24 kHz was a perfect full-volume alias of the lower audio spectrum. Wow.

    – Another demo track has bizarre ultrasonic bands of noise, low level enough to be insignificant, but obvious enough on a spectral view. More odd, the frequencies of the noise shift up and down over time, making the edit points of the recording quite obvious. I guess that could happen to anyone, but it’s an engineering question nonetheless.

    And that was just from looking over a fraction of the content of those discs.
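The kind of check described above (looking for where the spectral content stops) can be approximated in code. The Python sketch below is a toy version under stated assumptions: it uses a naive DFT on short synthetic test tones that fall exactly on DFT bins, an arbitrary threshold of 80 dB below the peak, and hypothetical function names. A real analysis would use a windowed FFT averaged over the whole track, as a spectral view in an audio editor does.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(N^2) DFT magnitudes for bins 0..N/2, scaled by 1/N."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) / n)
    return spectrum


def highest_active_frequency(samples, sample_rate_hz, floor_db=-80.0):
    """Return the highest frequency whose level is within floor_db of the peak."""
    spectrum = magnitude_spectrum(samples)
    peak = max(spectrum)
    if peak == 0.0:
        return 0.0
    threshold = peak * 10 ** (floor_db / 20.0)
    highest_bin = max(k for k, m in enumerate(spectrum) if m >= threshold)
    return highest_bin * sample_rate_hz / len(samples)


def tone_mix(freqs_hz, sample_rate_hz, n):
    """Synthetic test signal: a sum of equal-level sine tones."""
    return [sum(math.sin(2 * math.pi * f * i / sample_rate_hz) for f in freqs_hz)
            for i in range(n)]


rate = 96_000   # container sample rate; Nyquist is 48 kHz
n = 480         # 200 Hz per bin, so the test tones land exactly on bins

# Stand-in for an upsampled 48 kHz recording: nothing above 24 kHz.
suspect = tone_mix([1_000, 20_000], rate, n)
# Stand-in for a genuine 96 kHz recording: content above 24 kHz.
genuine = tone_mix([1_000, 40_000], rate, n)

for name, signal in (("suspect", suspect), ("genuine", genuine)):
    top = highest_active_frequency(signal, rate)
    verdict = "nothing above 24 kHz" if top <= 24_000 else "content above 24 kHz"
    print(f"{name}: highest active frequency {top:.0f} Hz ({verdict})")
```

An empty band above 24 kHz in a 96 kHz file is only a hint that the source was sampled at 48 kHz; quiet music or deliberate filtering can produce the same picture.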

    Keep up the good work.

  • Blaine J. Marsh

    Mark,

    Maybe something akin to the first generation CDs where we had AAD, ADD and DDD referring to the format of the recording, the mixing and the physical media. I’d be happy with HD/HD, SD/HD and SD/SD. The first designation would be the source material and the second would be the playback format. SD would be the bandwidth of human hearing and a SN ratio of > 60 dB and <= 96 dB. HD would be a bandwidth that extends at least an octave beyond human hearing and SN ratio greater than the threshold of pain. Obviously, the source quality is something else entirely.

    Blaine

  • I think your three categories are certainly a fair start considering they’re all existing formats. It certainly should be recognizable to audio engineers across the board as an attainable value for each level. The studios and the artists should find it acceptable as it makes total sense to record within a format that provides the full dynamics and listenable hearing range their music can achieve. The battle really begins when it reaches the labels and the audio manufacturers that sit in their pockets….or is it the other way around? It’s hard to tell! The Far East contingents have very deep pockets and a lot of influence when it boils down to the chipsets that are ultimately needed to make any audio enhancement available to the mainstream. Companies like Sony will take chances at higher resolution formats simply because they can (and they own labels). Small, independent companies that explore the capabilities of higher resolution beyond 96/24 are truly amazing innovators, but none can come even close to the movers and shakers that dominate the industry. Or more to the point, have any influence on the CEA. This is certainly not a commentary against the big boys because at least they are realistic when it comes to bringing a format to market to appease the masses. If higher resolutions are to be mandated, it’s the big guns who are going to do it. You can snake oil all you want, flex your intellect and concoct fascinating stories about resonant frequencies beyond the range of any living organism but until you get real, you’re wasting your time.

  • Gerald

    Mark, I like this and wish it was a standard now…

    High-Definition or High-Resolution – Recording and playback systems that have the potential to meet or exceed the capabilities of natural human hearing. New recordings done at 96 kHz/24-bits or better with the intent to maximize fidelity.

    Standard-Definition or Standard Resolution – Recording and playback systems that have the potential to capture and reproduce a frequency range of 20 kHz and 60-90 dB of signal to noise ratio. Think analog tape, vinyl LPs and compact discs.

    Low-Definition or Low Resolution – Recording and playback systems that have the potential to capture and reproduce a frequency range of 15-18 kHz and 20-40 dB of signal to noise ratio. Here we’ll have 128 kbps MP3, HD-Radio, AAC and other heavily compressed (data compression) formats.

    I would love it if I knew the BD and SACD recordings (HD) I buy were recorded 96/24. I would expect to hear differences between recordings, but it would be nice to know that the recordings at least have the potential to be good if the recording is done correctly.
    I doubt that will happen because it will be too tempting for some to put Standard resolution audio on the High resolution standard. You have shown us that this is happening now. Without a spectrograph to analyze the recording, it would be impossible for the masses to know for sure. I would be able to tell by listening but I have spent a great effort on my system and training my ears what to listen for (like the mechanic who doesn’t have so-called Golden ears and has hearing like everyone else, but still knows what to listen for). So I am as pessimistic as you are about hoping for a bright future. Gerald Pratt

    • I’m going to try and flesh these categories out a little more. And I may actually start a database of spectrograms of every available “High-Resolution” download so that people could actually check before purchasing.

  • A few thoughts:

    1) A severely compressed, high background noise or heavily clipped song meets my definition of low fidelity, regardless of the song’s recording or playback resolution in bit depth or sampling rate. I don’t think this is a nit, because senseless compression is one of the most pervasive and destructive acts in the music production industry. So I’ll be interested to see how these considerations are factored into your emerging definitions.

    2) Not that you have yet, but going forward please don’t lump all lossy files together as automatically low definition or fidelity, because the range of fidelity offered is extremely wide. For example, a song encoded with LAME 3.99.5 at V0, V1, or V2 will be audibly transparent to its lossless counterpart except in extremely unusual circumstances. AAC is similarly capable. Perceptual masking is a real phenomenon that allows high fidelity file compression and enables “portable high fidelity”, and I think that’s good.

    In short, I look at the combined effects of A) recording, mixing and mastering quality, B) bit depth and sampling rate, C) codec fidelity and D) playback system and environment – and the weakest link in this chain essentially determines the resulting fidelity at my ears.

    As a consumer, my ability to efficiently and quickly evaluate the true fidelity offered by A) on any particular song is essentially nil right now. That is what holds me back the most from making new music purchases, and it is where I hope for the most advancement in helping customers find the highest fidelity music.

    • Admin

      You make some really valid points. It’s true that I’ve been focused on the potential of a given technical level of recording and playback. When we also consider issues such as clipping and overuse of compression, there is another level of things to consider. For example, some styles of music require heavy compression. And while clipping or excessive levels of background noise could be considered serious flaws, they might be part of an aesthetic that an artist or producer wants. I’ve had the experience of adding noise and distortion to a recording!

      Your second point about lossy compression is actually troubling for me. As good as a “perceptually” encoded file of any type might sound, I would not bring them into the high-resolution audio category. Why accept something that throws away a portion of the audio to meet a bandwidth requirement?

      This is obviously a multidimensional issue. But your input is greatly appreciated. The CEA just formed a working group to define “high Resolution audio”…keep your fingers crossed.

  • I agree with your artist intent perspective. For that reason if not others, I suggest separate tasks for objectively and accurately measuring key music statistics like compression, clipping and noise floor (perhaps others in addition to or instead of these three), distinct from definitions of low and high fidelity, to separate objective measures from more judgmental definitions. For the objective measures, I think ease of measurement, standardization and resistance to inappropriate manipulation are important, and I suggest pinging DAW app makers for their thoughts on how apps can perform these measures automatically in the production process.

    I also agree that high quality lossy should not be muddled together with lossless. Maybe a category/term like “near hifi” or “hifi lossy” or similar to meaningfully distinguish them without losing the concept that high quality songs are still very possible and valued by consumers in portable and mobile environments? I think it would be insightful to have something like this, if for no other reason than to defeat any logic that calls for dynamic range crushing in mixes and masters in the name of mobile and portable listening needs (and without muddling data compression with dynamic range compression, of course).

    Keep up the good fight!

    P.S. One more thought: I think to the extent that many people have many different perspectives, education, experience and goals in the music industry, perhaps borrowing from other industries in how they tackled defining “quality” may help. Perhaps consider facilitating sessions with key industry players on how to define “quality” by borrowing materials and tools from Deming, the Toyota Production System, and other quality-focused efforts. In other words, the music industry isn’t the first industry to struggle with the definition and pursuit of quality, and that’s good news – you can shamelessly borrow lessons and techniques. I could see how conducting a few workshops at conferences would be very helpful in surfacing key ideas and key obstacles. In this light, part of the definition of quality for mobile uses will include ease of use, for which good lossy codecs will be invaluable.

    I’d love it if you could get major label participation at any workshop, since they seem to be the “keylog” of widespread, rapid adoption of anything.

    • Admin

      It’s going to be a real challenge to come up with a meaningful set of definitions that everyone can agree on. I appreciate your input…very well thought out.

