Dr. AIX's POSTS TECH TALK — 25 September 2014


The holiday record that I’m preparing for iTrax was recorded by Shawn Murphy at 96 kHz/24-bits. He mixed the project to stereo at 88.2 and then downconverted to 44.1 for the CDs. The mastering was done on the 44.1 CD version. Then he mixed the project in 5.1 at 96 kHz/24-bits. It was also mastered and transferred to a hard drive. It’s been waiting for someone to offer it as a download and Jim Self, the producer of the project, found me. But I want to offer it at 96 kHz/24-bits in stereo as well…or at least at 88.2 kHz/24-bits.

Jim checked with Shawn and was told that there isn’t a high-resolution stereo mix. So I have to create one. The options are as follows: go back to the original Pro Tools multitrack masters and remix the project to stereo at 96 kHz/24-bits. This would open up a whole can of worms…matching the CD mix, artistic choices, additional costs, and the mastering would have to be redone. Another choice would be to upsample the 44.1 mastered CD into a high-resolution version. This happens more than you might imagine and despite what others may tell you it’s a completely useless exercise. The fidelity doesn’t change after the original recording. The third and final method would be to downmix the 5.1 masters into stereo. That’s the road we’ve decided to go down.

What is downmixing? It’s the process of taking a multichannel track and creating a stereo (or mono) version of that track by reallocating the CENTER, LEFT and RIGHT SURROUND and LFE channels to either the LEFT or RIGHT FRONT channels. It’s a process that happens all the time in your optical disc player. When I prepare a project for commercial release, I have to include downmix “coefficients” for the Dolby encoded audio just in case the 5.1 mix has to be played back through a 2-channel stereo system. Once these coefficients are dialed into the metadata of the associated file, the player will automatically detect them and mix and distribute the audio channels according to these parameters.

The CENTER channel won’t have a center speaker in a stereo rig, so it gets equally divided and sent to the LEFT and RIGHT FRONT speakers. But because it’s now coming out of two channels instead of a single speaker, the level has to be attenuated by 3 dB. The 3 dB drop is necessary to keep the perceived volume the same. This attenuation is built into traditional PAN POTs on recording consoles as well…we want the same volume as we pan a signal between the left and right. The LEFT and RIGHT SURROUND channels are reduced by about 10 dB and combined into the corresponding LEFT or RIGHT FRONT channel. And finally, the LFE is sent to the stereo channels and attenuated by a few dB. These parameters are adjusted to personal taste. The amount of information in the LEFT and RIGHT SURROUNDs does affect how much you want directed to the front channels but the numbers I’ve mentioned are pretty typical.
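
For readers who like to see the arithmetic, here is a rough sketch of that fold-down in Python. It is only an illustration under stated assumptions, not the code a disc player or a mastering tool actually runs: the function names and the channel order are made up for this example, and the LFE value of -3 dB is a stand-in for "a few dB". A level change in dB becomes a linear multiplier via 10^(dB/20), so -3 dB is a gain of about 0.708 and -10 dB is about 0.316.

```python
def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


# Attenuations taken from the post. The LFE figure is an assumption:
# the text only says "a few dB".
CENTER_DB = -3.0
SURROUND_DB = -10.0
LFE_DB = -3.0


def downmix_frame(fl, fr, c, lfe, sl, sr,
                  center_db=CENTER_DB,
                  surround_db=SURROUND_DB,
                  lfe_db=LFE_DB):
    """Fold one 5.1 sample frame down to a (left, right) stereo pair.

    The argument order (front left, front right, center, LFE,
    left surround, right surround) is a hypothetical layout chosen
    for this sketch; real files may order channels differently.
    """
    g_center = db_to_gain(center_db)
    g_surround = db_to_gain(surround_db)
    g_lfe = db_to_gain(lfe_db)

    # Center and LFE go equally to both sides; each surround goes
    # only to the front channel on its own side.
    left = fl + g_center * c + g_surround * sl + g_lfe * lfe
    right = fr + g_center * c + g_surround * sr + g_lfe * lfe
    return left, right


def downmix(frames, **coefficients):
    """Apply downmix_frame to a sequence of 6-channel frames."""
    return [downmix_frame(*frame, **coefficients) for frame in frames]
```

Because several channels are summed into each output, the result can exceed full scale on loud passages, so a practical version would also leave headroom or scale the sum. The coefficients are keyword arguments precisely because, as noted above, they get adjusted to taste for each mix.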

So that’s what I’m doing to the tracks of the “Tis The Season TUBA Jolly” project. The stereo mix will be exactly the same timbre as the 5.1 surround mix and maintain the same balance of instruments. And I get to hum along with my favorite Christmas music once again.



About Author


Mark Waldrep, aka Dr. AIX, has been producing and engineering music for over 40 years. He learned electronics as a teenager from his HAM radio father while learning to play the guitar. Mark received the first doctorate in music composition from UCLA in 1986 for a "binaural" electronic music composition. Other advanced degrees include an MS in computer science, an MFA/MA in music, BM in music and a BA in art. As an engineer and producer, Mark has worked on projects for the Rolling Stones, 311, Tool, KISS, Blink 182, Blues Traveler, Britney Spears, the San Francisco Symphony, The Dover Quartet, Willie Nelson, Paul Williams, The Allman Brothers, Bad Company and many more. Dr. Waldrep has been an innovator when it comes to multimedia and music. He created the first enhanced CDs in the 90s, the first DVD-Videos released in the U.S., the first web-connected DVD, the first DVD-Audio title, the first music Blu-ray disc and the first 3D Music Album. Additionally, he launched the first High Definition Music Download site in 2007 called iTrax.com. A frequent speaker at audio events and author of numerous articles, Dr. Waldrep is currently writing a book on the production and reproduction of high-end music called, "High-End Audio: A Practical Guide to Production and Playback". The book should be completed in the fall of 2013.

(8) Readers Comments

  1. That is an awful lot of work, but I’m glad you take the time and effort. I’m sure it will be appreciated by all that hear it.

  2. A related issue is “how good are conversions?” If the original had been mastered at 88.2, I would naturally prefer it over it being converted to 96/24. This is a reason why I prefer 88.2 being the cutoff for HD-Audio; but I’m a follower of JAS now.

    This does matter; consider Transparent’s release of Haydn: http://transparentrecordings.downloadsnow.net/haydn-in-america Does their comment make sense:

    “Provenance: Haydn in America was originally recorded to 8824 PCM (8824 is our short hand for 88.2kHz and 24-bit sampling). The 8824 WAV files are the original digital file generation sent to us. The DSF and FLAC files are considered second generation and made from conversions using our Blue Coast conversion methods. DSF and FLAC will offer the convenience of metadata that the WAV files will not.

    “After several blindfold tests, it is our opinion that the 8824 wav files sound the best, followed by DSF and after that the FLAC 8824. The difference is minimal. We suggest you purchase files for your best performing home DAC. The DAC will make more difference than the file type.”

    I purchased FLAC 8824, then converted them to AIFF.

    • I have no problem with 88.2 kHz/24-bit PCM as a capture and release format. The only reason to choose these specs is because the project is headed for a CD release…which is not important to me. I would prefer to capture at 96 kHz/24-bits and then release FLAC files with the metadata. This algorithm is lossless and is not a “second generation” copy. It is a losslessly encoded, metadata-rich encode. You can always decode it back to AIF or WAV if you prefer.

      Downconverting to DSD in a DSF file format is a bad idea. You’d be throwing away information as compared to the PCM version…but there’s a sound to DSD that some people like. I concur with the statement that the DAC is more important than the format…as long as you stay with PCM.

  3. Mark, as always your engineering makes absolute complete sense. And Shawn Murphy is one of my favorite engineers, so this is a project I’ll be watching for.

    I’m glad you mentioned that in the downmixing process there’s some latitude for taste and interaction with each individual mix. However, I can’t imagine a time when downmixing the rear channels at -10dB would even be conceivable without substantially altering the effect of the total mix. Maybe that’s desirable from time to time, but not with anything I can think of! Several years ago I mixed a 5.1 master of classical music for video, and the mastering engineer applied the coefficients for me (thank you very much), reducing the rear channels by merely 3dB. The end result was a regrettably dry mix because ambiance and reverb were no longer in the “sweet spot” of balance. (Maybe I should’ve had them louder in the surround mix.) If 3dB could do that, I can only imagine that 10 would be akin to dumping them altogether! So I’m curious: where have you encountered rear channels mixed so aggressively to justify so much reduction?

    • In my experience, it depends on the individual tracks and the amount of musical material in the rear speakers. I’ve set up coefficients for downmixes of the Allman Brothers, Bad Company, and others (live concert DVDs) and the -10 dB works. It means that the audience cheering and applause doesn’t drown out the music.

      In the case of a studio recording like the Christmas project, Shawn prepared a “surround light” style 5.1 mix. The only thing in the rear channels is room ambiance…no tubas. If I had moved those two channels into the front speakers the level of reverberation would be much too high. This was all digital reverb…not the actual sound of the room. I tried to match the sound of the stereo recording but using real 96 kHz/24-bits.

      • Thanks, Mark, that makes sense and is good to know!

  4. The process on how the music was recorded and mastered in the first place is what is confusing to me.
    Why is everything downgraded BEFORE the mastering and mixing is done?
    Why not just master and mix everything at the highest quality possible,
    then after that it can be downconverted to match the needs of CD and streaming codecs?
    I really don’t know much about these things, but I do try to follow what is happening in the HRA industry.
    It just baffles me why so many people couldn’t care less about the fidelity and quality of their own music,
    as well as the studios and producers responsible for the process that makes the product from beginning to end.

    • I asked the very same question. Why did they capture all of the sessions at 96/24, mix and master the 5.1 version at 96/24 but follow a completely different signal path for the stereo CD? I can only imagine that the mastering engineer had a preferred piece of equipment or capability that wouldn’t work at full 96 kHz/24-bits or 88.2. Their procedures were surprising.
