The High-Res Mafia: Part III

I received a reply from Bob Katz regarding his position on being denied inclusion on HDtracks for some of his productions. Yesterday, I talked about the quality evaluations that HDtracks does to ensure that every track they offer meets their definition of “hi-res” music (they include transfers of older standard-res music from analog tape…I don’t). They verify that the files have 24 “active” bits. I’m not sure what this means when the recordings delivered to them use less than 16 bits of dynamic range…but at least you’ll get all of the low-level hiss and rumble.
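
As a rough illustration of what a check for 24 “active” bits might involve, here is a minimal Python sketch. It is not HDtracks’ actual procedure, which isn’t described here; the function name `active_bits` and the sample values are hypothetical. It only reports how many low-order bits are zero in every sample, which would catch 16-bit audio padded into a 24-bit container but says nothing about whether the low bits carry music or just noise.

```python
def active_bits(samples, container_bits=24):
    """Return how many bits of the container are actually exercised.

    `samples` is an iterable of signed integers already decoded from a
    PCM file (decoding is assumed to happen elsewhere).
    """
    combined = 0
    for s in samples:
        # OR together the bit patterns; mask to the container width so
        # negative two's-complement values are handled uniformly.
        combined |= s & ((1 << container_bits) - 1)
    if combined == 0:
        return 0  # digital silence
    # Count the trailing zero bits shared by every sample.
    unused_low_bits = (combined & -combined).bit_length() - 1
    return container_bits - unused_low_bits


# Hypothetical data: 16-bit values shifted up into a 24-bit container.
padded_16_bit = [v << 8 for v in (-32768, -1, 0, 1, 12345, 32767)]
# Hypothetical data: values that use the full 24-bit word.
true_24_bit = [-8388608, -1, 0, 1, 1234567, 8388607]

print(active_bits(padded_16_bit))  # 16
print(active_bits(true_24_bit))    # 24
```

A file can pass this kind of test simply because its lowest bits contain tape hiss or dither, which is the concern raised above.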

They do three other checks, too. They check the frequency spectrum, they do a real-time frequency check, and they take a spectrogram of the tracks to see if there is “activity in the frequency ranges in decibels and map the dynamic range of the frequency responses”. This all sounds great, but it doesn’t establish whether the source tracks are high-res. It simply verifies that the tracks came from an analog source and were converted using a 96 kHz/24-bit ADC or better.
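
To make the spectrum check concrete, here is a small Python sketch that measures how much of a signal’s energy sits above the CD band. It is only an assumption about what such a check could look like, not the tool HDtracks uses; the function names, the 22.05 kHz cutoff, and the synthetic test tones are all hypothetical, and a naive DFT is used so the example needs nothing beyond the standard library.

```python
import cmath
import math


def ultrasonic_energy_fraction(samples, sample_rate, cutoff_hz=22050.0):
    """Fraction of spectral energy above `cutoff_hz`, via a naive DFT."""
    n = len(samples)
    total = 0.0
    above = 0.0
    # Only bins 1 .. n/2 are needed for a real-valued signal (DC skipped).
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        energy = abs(coeff) ** 2
        total += energy
        if k * sample_rate / n > cutoff_hz:
            above += energy
    return above / total if total else 0.0


def tone(freq_hz, sample_rate, n, amplitude=1.0):
    """Synthetic sine wave used as stand-in audio."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


RATE = 96000  # hypothetical delivery rate
N = 480       # 5 ms of audio; keeps the naive DFT quick

audible_only = tone(1000, RATE, N)
with_ultrasonics = [a + b for a, b in
                    zip(tone(1000, RATE, N), tone(30000, RATE, N, 0.1))]

print(ultrasonic_energy_fraction(audible_only, RATE) < 1e-6)       # True
print(ultrasonic_energy_fraction(with_ultrasonics, RATE) > 0.005)  # True
```

A result near zero is what upsampled standard-resolution material would be expected to show. A non-zero result only says something is up there; it could be music, tape bias, or converter noise.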

So why does Bob Katz refer to the HDtracks QA team as the “HRA Mafia”? This is what he wrote:

“Regarding HD tracks, there is no supersonic information in material which I master at a higher rate by upsampling it. So the ‘police’ would catch it! My argument is simply that the upsampled material, processed at the higher rate and then NOT downsampled and/or wordlength-reduced — sounds better in that way.

So the audience is missing out by having to listen to two “generations down” from my original master. Simply because they’re looking at an FFT instead of listening. It clearly sounds better for a few reasons:

1) The DACs sound better when they are not doing the upsampling themselves. Weiss Saracon is a MUCH better upsampler

2) Processing at a higher rate produces fewer aliasing and distortion products

3) It’s fewer ‘generations’ that the sound goes through

4) The sound goes through fewer sharp low pass filters

All you have to do is listen…it sounds warmer, wider, better for all of the above reasons.

Nevertheless, as I said in the interview, we can acknowledge that material which was ORIGINATED at the higher rate potentially sounds better than the material which I am mastering from the lower rate…but the listener and the producer and the artist should not be penalized by the mafia which just looks at the FFT with their eyes instead of listening with their ears.

In a perfect world, this material which I am discussing should have its own category…maybe not charge as much as the ‘officially higher res material’…

That’s my take on it!”

So there it is. Bob acknowledges that these particular productions were not recorded at high-res specs. However, he insists that by upconverting them and doing all of his mastering magic at the higher sample rate, the results are better than if he had maintained the original standard resolution rates. And I don’t doubt that the final files sound terrific.

But I agree with David Chesky. They don’t belong on a site that is claiming it only accepts and sells recordings that were captured on analog tape and then converted to high-res specs. I think PonoMusic would embrace the tracks.

The problem remains. We don’t have a uniform definition of what is and what isn’t a high-res track. And I seriously doubt that we ever will.


Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site in 2007 called iTrax.com. A frequent speaker at audio events and author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

28 thoughts on “The High-Res Mafia: Part III”

  • Jochen Semler

    It would be advisable if HD Tracks would also include a listening evaluation of the albums that they offer for download.
    I downloaded already a few albums from HD Tracks that sound so poor that I just can’t listen to them.
    If HD Tracks would follow the specifications of JAS from June 2014 (when the HighRes Logo was still meant for software also), listening evaluation would be mandatory.

    • They could add a lot of additional information…DR rating as well as a subjective assessment. But it all takes resources. They do a reasonable job but the labels are the ultimate source for the information.

      • Yes it takes resources but it takes all of about 10 seconds at most to run a tmeter on a complete album. There can be no excuse for a company with quality standards not to at least report that the Jackson Browne – Running On Empty album they are charging $19.98 for has had the DR squashed to a 7 when the Red Book CD measured DR13. Whoever is responsible is irrelevant; don’t point the finger at the label. Selling the squashed file as a sonically improved HDA product at a premium price without reporting the fact, I find bordering on criminal.

  • Soundmind

    THERE IS NO SUCH THING AS HIGH RESOLUTION AUDIO! It’s a made up marketing term that makes no sense. Look it up in Wikipedia, if it says it in Wikipedia it must be right. 🙂 But the truth is that it is not a scientific term that describes the products called Hi Rez Audio at all. Here’s why. Resolution as a scientific term refers specifically to optics and related images in photography that has to do with how closely two objects can be perceived as being separate before they become a blur to being seen as one. Therefore, lenses, telescope mirrors, film, even CCD imaging devices can be rated according to their resolution, usually in arc seconds. So if one telescope shows the observer two separate stars while another shows them as one, the first has higher resolution or greater resolving power. If one photographic film can photograph not just visible light but ultraviolet and infrared as well and can photograph images clearly in light ranging from a fraction of one candlepower to looking directly into the noonday sun but produces grainy fuzzy images all the time, its resolving power is poorer than ordinary film that produces sharp images but only in the visible light range and only in normal lighting conditions. So while what is called high resolution audio may have wider dynamic range and greater frequency range than ordinary CDs, it does not have greater resolution. It is not an extrapolation of this scientific definition related to light into the field of sound.

    What would increased resolving power mean in audio? In amplitude it would have to be able to increase loudness in smaller increments than RBCD that are audible. But over about a 95 db range RBCD gives you over 16,000 loudness levels already. Even if your hearing is as sharp as Atkinson claims his to be, able to hear in 0.1 db increments, over a 95 db range that would be less than 1000 loudness levels. What about frequency resolution? My own resolution is about 1/4 to 1/8 of a halftone (a halftone is about a 5% frequency shift.) So my hearing has a frequency resolution of around 1%. Some people’s are somewhat better. This is the ability to tell if a note is sharp or flat. RBCD beats that by a country mile so there’s no discernible difference there. Vinyl by contrast is much poorer, especially for rim driven turntables with worn idler wheels and belt driven turntables with worn belts. That results in slippage and heaven help you if you get even one drop of oil on the belt or idler wheel.

    So the so called high resolution audio product has wider frequency response capability if it’s in the signal but you can’t hear it, and greater dynamic range but you almost never need it. In fact most pop music has dynamics so compressed even RBCD is unnecessary. Compressed as that music is, recording engineers compress them even more and boost the gain to where they overload the RBCD system so they will be louder on a radio. They don’t care. For their target market compression and distortion are not a concern. For that kind of recording IMO there’s only one correct loudness for me….OFF.

    From above “Regarding HD tracks, there is no supersonic information in material which I master at a higher rate by upsampling it.”

    Supersonic means faster than the speed of sound. Ultrasonic means above the frequency range of human hearing.

    • Thanks Mark for your informed comment. I always pushed for High-Definition Audio/Music but the consensus went the other way. However, I’m not with you on the “16,000 loudness” levels breakdown. The number of bits isn’t a direct measure of decibels (or fractional portions of decibels) as you assert. Going from 16 bits to 24 bits does increase the number of discrete values that can be used by the sampling system, but digitizing an analog signal is far more complex than a straight conversion from bits to decibels (ADCs do heavy oversampling and operate in the delta-sigma domain).

      The same is true of sample rates and the Just Noticeable Difference with regard to pitch. Higher sample rates have nothing to do with increased accuracy of individual pitches.

      There are recordings that measure better than CDs and it is important. But you’re right that HRA as currently marketed is complete BS.

      • Soundmind

        I was sure this posting would get you angry, maybe even get me thrown off this site. Seems I can’t get a rise out of anyone anymore no matter what I say. That’s what happens when you get old, you just get ignored like some crazy fool.

        I wish my posting was better informed. I tried to do a little research about RBCD standards. I wasn’t sure if the actual data for the amplitude signal was 14 bits with two reserved as a control track or the full 16 bits. I’ve read a lot of misinformation in my life. Here’s what I was able to glean;

        “Each audio sample is a signed 16-bit two’s complement integer, with sample values ranging from −32768 to +32767. The source audio data is divided into frames, containing twelve samples each (six left and right samples, alternating), for a total of 192 bits (24 bytes) of audio data per frame.

        This stream of audio frames, as a whole, is then subjected to CIRC encoding, which segments and rearranges the data and expands it with parity bits in a way that allows occasional read errors to be detected and corrected. CIRC encoding also interleaves the audio frames throughout the disc over several consecutive frames so that the information will be more resistant to burst errors. Therefore, a physical frame on the disc will actually contain information from multiple logical audio frames. This process adds 64 bits of error correction data to each frame. After this, 8 bits of subcode or subchannel data are added to each of these encoded frames, which is used for control and addressing when playing the CD.

        CIRC encoding plus the subcode byte generate 33-bytes long frames, called “channel-data” frames. These frames are then modulated through eight-to-fourteen modulation (EFM), where each 8-bit word is replaced with a corresponding 14-bit word designed to reduce the number of transitions between 0 and 1. This reduces the density of physical pits on the disc and provides an additional degree of error tolerance. Three “merging” bits are added before each 14-bit word for disambiguation and synchronization. In total there are 33 × (14 + 3) = 561 bits. A 27-bit word (a 24-bit pattern plus 3 merging bits) is added to the beginning of each frame to assist with synchronization, so the reading device can locate frames easily. With this, a frame ends up containing 588 bits of “channel data” (which are decoded to only 192 bits music).

        The frames of channel data are finally written to disc physically in the form of pits and lands, with each pit or land representing a series of zeroes, and with the transition points—the edge of each pit—representing 1. A Red Book-compatible CD-R has pit-and-land-shaped spots on a layer of organic dye instead of actual pits and lands; a laser creates the spots by altering the reflective properties of the dye.”


        I’m sure everyone understood it. It’s the Wikipedia kindergarten version of the story for tyros (like me in this particular area.)

        There’s a lot more to it. Philips wanted a 14 bit format because they had just developed a 14 bit D/A converter. Sony insisted on 16 for better quality. Philips figured out how to combine and upsample to get 16 bit quality from their 14 bit chips. So the number of loudness levels is not 2^14, which is about 16,000, but 2^16, which is about 65,500. Encoding is in linear PCM format, which still doesn’t explain the wide discrepancy between 6.5 × 10^4 available loudness levels and the 4 × 10^9 levels required for 96 db. If the levels are too widely spaced to accommodate that dynamic range, considerable distortion would result at lower levels but in reality both THD and IM + Noise are extraordinarily low and input output linearity is excellent. Clearly I’m missing something about how this works but quite frankly it has never interested me enough to find out even though I could if I dug deep enough.

        BTW I’m not an audiophile, at least not in the current sense of the term’s usage. Perhaps I should just have one of the audiophile experts out there explain it to me. There certainly are plenty of them out there all too ready to “inform” me.
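
One way to reconcile the figures in this exchange is to note that decibels applied to sample values are amplitude ratios (20·log10), while a figure near 4 × 10^9 for 96 dB is the corresponding power ratio (10·log10). The following Python sketch works through that arithmetic. It deliberately ignores dither, noise shaping, and oversampling, which change the practical picture, and the function names are just illustrative.

```python
import math


def levels(bits):
    """Number of distinct sample values in a linear PCM word."""
    return 2 ** bits


def dynamic_range_db(bits):
    """Ratio of full scale to one quantization step, in decibels."""
    return 20 * math.log10(levels(bits))


print(levels(14))                      # 16384
print(levels(16))                      # 65536
print(round(dynamic_range_db(16), 2))  # 96.33
# The same 96.33 dB expressed as a power ratio is the amplitude ratio squared.
print(levels(16) ** 2)                 # 4294967296
```

So 65,536 amplitude steps and a power ratio of about 4.3 × 10^9 describe the same 96 dB span; there is no shortfall of levels to explain.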

  • Camilo Rodriguez

    Hi Mark,

    Happy New Year!!!

    I hope that 2016 will reward your much appreciated efforts in speaking the truth and setting the record straight about digital audio technology, High Resolution Audio, and the fraudulent business practices of the recording and music industry. I’m sure these efforts, along with your upcoming book, will see positive consequences and will eventually make for change. Change doesn’t happen in one day, but it also doesn’t happen if we give up and lose faith.


    I was really impressed – and frankly a bit disappointed – to read Bob Katz’ passage, I wouldn’t have expected a guy of his reputation to come up with magic mastering concoctions that sound “warmer, wider, better”. I’m just as certain as you are, that the tracks sound great, but the qualities he describes have nothing to do with High-Resolution audio. On the other hand, we know Chesky’s High-Resolution standards are pretty much as flawed as those of Qobuz’, as they sell any content as HRA, as long as it is in a 24bit bucket, even though we know the provenance of many albums that they sell as HRA can’t correspond to HRA standards, just by looking at the year they were recorded.


    • Happy New Year to you as well. Bob is a very competent engineer and mastering guy…but his positions and technique are at some variance with my own.

  • Fazal Majid

    I now deeply regret having paid for Katz’ “Mastering Audio – the art and the science, 2nd edition” (did not get around to reading it yet).

    He is either a crook, or completely clueless and in either case the book is not worth the paper it is printed on.

    • Bob is not a crook…but he does have activist views on producing great audio recordings. I’ve read his book…it’s clear and very informative. The art of mastering is a creative process and each person will have their own take on it.

  • Fazal Majid

    HDTracks’ commitment to quality (or at least truth in advertising), is belated, if welcome. I have an allegedly 24/192 album of Miles Davis “Kind of Blue” from them, lured by the sycophantic review in Stereophile. As could be expected, there is just lovingly transcribed noise.

    That title is clearly 16-bit (and presumably 44kHz as well) in a 24/192 container.

    • Actually, the “Kind of Blue” album was not derived from a CD. This was one of the few lucky projects worthy of Plangent processing and a new transfer. But it still is a standard resolution project and using 192/24-bits is unnecessary and overkill.

  • Many, many engineers do what Bob does, which is upsample 24/44.1 mixes so they can use their favorite plugins during the mastering stage. The idea is that at 24/96 some plugins are able to process the waveform with more accuracy than at 24/44.1 (read: the math works out a little better with respect to round-off). I don’t deny what Bob is saying in that regard, however that doesn’t make it HRA by any stretch of the imagination.

    There is a difference between capturing high-frequency content at the recording phase (truly HRA) and the sample rate chosen during the post-processing, mastering phase. Mastering at high-res is nice, but that is just a choice and NO PLUGIN he uses will add high-frequency content. He even admits that.

    Moreover, he is just FLAT out wrong when it comes to DACs:

    “1) The DACs sound better when they are not doing the upsampling themselves. Weiss Saracon is a MUCH better upsampler”

    Hate to break the news to you, Bob, but almost all DACs upsample when they reconstruct. Other than a few esoteric brands, the overwhelming majority of DACs in the studio upsample internally. In fact, you mention Weiss Saracon, and their DAC1 product I believe automatically upsamples to 384k no matter the input (within the DSP). So whether or not you use a sample rate converter during post processing is immaterial to what the DAC may actually do at reconstruction. Also note, delta-sigma is REALLY popular these days with audiophiles, so keep in mind, that means your upsampled PCM signal is going to be converted into a 1-bit delta-sigma bitstream anyway.

    Bottom line: The bit-depth and sample rate you process in is most likely not going to be preserved during playback due to the current state-of-the-art of DAC chipsets these days.

    And here is where Bob just isn’t really thinking this through:

    “Nevertheless, as I said in the interview, we can acknowledge that material which was ORIGINATED at the higher rate potentially sounds better than the material which I am mastering from the lower rate…but the listener and the producer and the artist should not be penalized by the mafia which just looks at the FFT with their eyes instead of listening with their ears. ”


    I (the consumer) don’t want to trust Bob’s ears that this low-res material is now high-res “like” or “equivalent” because he threw a few plugins operating at 24/96 at it. That’s ridiculous. That goes back to the days where now anyone can just upsample the 24/44.1 master by padding zeros on it and call it PCM DXD. Those folks, who are much less talented than Bob, can now make the SAME claim Bob does but basically “cheat” the system without the “high-res mafia” in play.

    Anyway, who wants to start a Kickstarter project to raise more high-res mafia “protection” money? Who’s with me?

    • I would tend to agree with you. I don’t endorse the upconversion approach to mastering or releasing audio. Any conversion…even the Weiss Saracon…diminishes the quality of the original.

      • I’m a bit confused. If “bits are bits”, why would a conversion degrade the sample?

        • Because the data stream is changed whenever it is converted to another format or specification.

          • Thanks.

          • So given that downsampling degrades the signal, this would indicate that all of the ABX tests, where a hi-res sample is changed to 16/44 or lower, for instance, are bogus. Correct? In that case, how can you make a valid ABX comparison without knowing the true provenance of each of the original recordings at each resolution (and they MUST be the same for a valid test)?

          • You’re correct! It is very challenging to get superior results with existing hardware and software tools. Even simultaneously recording at 44.1 and 96 kHz would be flawed because the converters are different. I even heard Bob Stuart say that every pass of identical data produces different output because of the randomness of the dither. What are we to do? I think null tests come close and I wouldn’t rule out ABX testing.

  • Michael Faulkner

    Mark: I don’t understand the technicalities of your post as “Joe Blow” listener. Sounds like they are trying to “genetically engineer” old recordings and create a pansy from a potato. I think it all has to do with the original master, as I think you have said.

    For example, I started to rip my CDs onto my cheap FiioX1, using an old pair of Grado 225s to listen with. I picked a track at random last night (Christie Baron) and was totally immersed in the ripped WAV CD. I wondered why the recording sounded so magnificent, and then saw it was a Chesky recording. Isn’t that label about simple and direct recording of artists with good miking for a master that really expresses the artist’s music? Perhaps a high res recording would be better, but I was quite satisfied musically with the CD recording rate. Give me a well recorded master, true to life, and I’m happy.

    • The production process is where the fidelity of a track is established…right.

  • Soundmind

    I give a lot of credit to audio engineers like Mark, Bob Katz, Cookie Marenco who have to create recordings the market desires enough to buy. It’s as much art as it is science. Experience matters a lot. They learn what works and what doesn’t. Bob Katz’s beef is that he can’t use the latest marketing hyperbole with the endorsement of a self appointed exclusive club. I’ve suggested he start his own club and invent his own marketing hyperbole. BTW, none of that speaks at all to the merits or faults of particular recordings. Even if the new technology is demonstrably better, that does not mean it can’t be used to make bad recordings. The old rules still apply.

    I think if there’s one thing that sets off alarm bells for me immediately it’s the use of this kind of advertising hyperbole as a substitute for substance to claim superiority. The hobbyist magazines for playback equipment are only too ready to aid and abet this effort by reviewing the greatest preamplifier, speaker, DAC, or what have you in the world for this month in exchange for advertising revenues or hoped for advertising revenues. The reviewers themselves get special concessions from manufacturers and are no more qualified to pass judgment than Joe Blow off the street is.

    I have to give Mark Waldrep credit as an engineer for tempering his enthusiasm for his projects by recognizing the limitations as well as the potential of his different technology. That’s intellectual honesty you don’t often see in this industry much anymore. Most of the claims for superiority where each new variant is proclaimed as a major technical breakthrough are at best relatively insignificant and often merely slightly different from the prior art. Waldrep acknowledges RBCD is an excellent format but does not satisfy the full range of possibilities of human hearing within its purview. This gap is where this new technology claims to fill in although its necessity is dubious.

    Are substantially better electronic musical sound recording/reproduction systems than the best we have now possible? I think they are but not by continuing down the direction most people trying to explore the limits of the boundaries of current systems to extend them are going to find. I think the current concepts have been taken about as far as they can go. Any real improvement will come from an entirely different and unexpected direction. Just my opinion.

  • I have been critical of HDT in the past because they did nothing to combat the problem of upsampled items with a redbook provenance. I am glad they are doing something even if not 100% effective. Bob Katz was wrong to call them mafia. Their simple process caught him at his game. Shame on him. He is a fine mastering fellow, but this is just wrong as he was trying to pull the wool over everyone’s eyes.

  • John Chase

    “The problem remains. We don’t have a uniform definition of what is and what isn’t a high-res track. And I seriously doubt that we ever will”

    Unfortunately, a standard would leave room for the possibility of pricing determinations, perhaps even to the point of dare I say it, regulation.

    I think the fog allows for creating magic definitions for HRA, and pricing things as the market will bear, which is always the most profitable.

    Not that my outlook bends towards what some folks might call cynical.

    Happy New Year to every one, and wishes for a great start with the new additions at the studio!

  • john P robichaud

    ok…so I am just a humble man with a humble listening set-up. but this is my take on it. And anyone can correct me on my opinion. I am a humble man. The ORIGINAL SAMPLING RATE of any music recorded in a studio…THAT RATE is the rate the music can be played back with all of its sonic integrity intact. The problem for me is when you take a studio recording sampled at say…the 44.1/16 CD sampling, and then start converting THAT RATE to a higher “UPSAMPLED” rate. You are compromising the sonic integrity of the music. How??… Because to upsample the original sampling rate, you are “STRETCHING” the music out to ‘fit’ that higher sampling rate. Remember you are not ADDING ANY NEW INFORMATION to the original music, you are simply “ALTERING” it to fit the new sonic signature or environment or whatever one wants to call it. So….what is happening by STRETCHING the music is that you are stretching the notes further apart in the music. So someone listens to the new “improved” version and they go “Wow! there is so much better resolution!”.. Because the musical notes are more isolated from one another. HOWEVER…you are at the same time compromising the Sonic Integrity of the music. This is because each musical note has several harmonic envelopes which give the music a tonal beauty and richness to it. When you STRETCH the music in upsampling, you are STRETCHING the HARMONIC ENVELOPES of the notes…and by STRETCHING I mean DISTORTING. The distortion of all the TONAL HARMONICS (and possibly the complete loss of some of the harmonics) in the music is ruinous to the subtle but oh so tangible (at least to me) LAYERED ‘Richness’ & ‘Depth’ of the Sound. IF YOU CAN’T HEAR THE DIFFERENCE…WELL TO EACH THEIR OWN LISTENING EXPERIENCE..BLESSINGS BE

    • There are many misconceptions in your comments.

