As you can imagine, I get a lot of email. I always appreciate it when readers post their questions and comments on the RealHD-Audio.com site, but that hasn’t been convenient for the nearly 3,000 recipients of the daily emails. So today, I’ve collected a few of the comments and questions.
Let’s start with an email that I received from my friend John Siau, the principal designer and all-around digital guru at Benchmark Media. He was responding to my post on “Perfect Sound Forever”, that much-maligned marketing slogan that Sony and Philips coined back with the introduction of the CDP-101, the first compact disc machine (it’s not a coincidence that it was launched on October 1…10/1!).
John wrote to endorse the potential quality that a CD can deliver AND has always been able to deliver. It’s obviously not, as TAS editor Robert Harley wrote, a “quaint notion that there’s no need for improvement over the CD”. Although I support high-resolution audio, CDs are not going to be obsolete anytime soon.
“I fully agree that the CD format is an excellent distribution format. Noise-shaped dither can extend the apparent dynamic range at least 12 dB, so we are not even limited to 96 dB. Anyone who doubts the effectiveness of noise-shaping should look at DSD. The 1-bit DSD format proves that noise shaping works!
Obviously the 16-bit format falls short in production applications. Longer word lengths (24-bit, 32-bit, or 64-bit) are now used in the studio to allow many cascaded and summed DSP operations without loss of SNR. Early DAW systems were limited to 16-bit storage, and were notorious for producing poor results. All newer DAWs are high-resolution systems and it is now possible to produce outstanding CDs if the conversion to 44.1/16 is the last step in the mastering process. Few recordings fully exploit the capabilities of the CD. Obviously there are some high-resolution recordings (such as those produced by AIX) that exceed the capabilities of the CD.
Nevertheless, it is still difficult to configure a playback system that is significantly better than CD-quality. Benchmark is addressing this issue with the new AHB2 high-resolution power amplifier. The AHB2 has a bandwidth that exceeds 200 kHz, and an SNR that approaches 130 dB.”
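For anyone wondering where John’s 96 dB figure comes from, here is a quick back-of-the-envelope sketch in Python. It assumes the usual rule of thumb that the dynamic range of linear PCM is the ratio of full scale to one quantization step, or about 6 dB per bit. The 12 dB noise-shaping gain is simply John’s number taken at face value, and the function name is made up for this illustration.

```python
import math

def pcm_dynamic_range_db(bits: int) -> float:
    """Ratio of full scale to one quantization step, in dB (about 6.02 dB per bit)."""
    return 20 * math.log10(2 ** bits)

cd = pcm_dynamic_range_db(16)
studio = pcm_dynamic_range_db(24)

# 12 dB is the noise-shaping figure quoted in John's email, not a computed value.
noise_shaping_gain_db = 12.0

print(f"16-bit PCM: {cd:.1f} dB")
print(f"16-bit PCM plus noise-shaped dither: {cd + noise_shaping_gain_db:.1f} dB")
print(f"24-bit PCM: {studio:.1f} dB")
```

These are theoretical ceilings for the number format only. Real converters and playback chains land somewhere below them.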
Another email from reader Paul S. tries to get at some related issues of PCM encoding:
“The dots in my scenario turn out to be XY coordinates of the digital samples of the music, with X being the moment in time and Y being amplitude. The dots must be connected to create the AC flow that we call music. I get that low pass filters and the mass of the drivers smooth off the digital edges caused by quantization. High sampling isn’t important in terms of frequencies that are too high to hear. It’s vital, if we want our highs we can hear to sound like the original in terms of amplitude and phase. Music doesn’t start and stop to give our sampling an easy time. Even 192k only starts to accommodate the accurate reconstruction of our audible highs no matter what phase they are in when sampled. With too few samples, like CD, there is not enough info to reconstruct the highs accurately. If you only have two samples of a 20khz sine wave, you better pray it is sampled at 90 degrees in. Otherwise, it will have lower amplitude and the peak will happen out of phase with the original wave. Your offer of 96k samples with 48khz frequencies suffers the same fate. Only the 90 degree sine wave will be acceptable.”
There are a number of misunderstandings in his email. We’ve been going back and forth a little. First, it is not the “mass of the drivers” that smooths the discrete levels in the “battleship” grid of samples and amplitudes back into the continuously variable AC voltage that is sent to the amp and speakers. It’s true that the instantaneous amplitude changes that occur when moving from one discrete amplitude level to another level (up or down) produce partials or overtones in the output. But according to the natural overtone series, the closest they can come to the fundamental frequency is a factor of two…because an octave has a numerical ratio of 1:2. Then a high-quality low-pass filter (LPF) removes the objectionable HF partials and we’re back to the original waveform. If the highest frequency is 20 kHz, then a sample rate of 44.1 kHz is more than enough to capture and reproduce it.
High-frequency sampling is important for more than just “frequencies” that we can’t hear (although, as I have argued, I think they do matter). The higher the sampling rate, the easier it is to build a great filter AND the higher the frequency response. And the waveforms being sampled don’t have to be at 90 degrees at the sample times to ensure accurate capture and playback.
The Nyquist Theorem works whether the samples are “aligned” at the sample points or not. It assures us that an analog continuous audio waveform can be recreated accurately and in phase when the sampling rate is twice the highest frequency component. I used to understand this to mean “at least twice as high”…but the reality is that you only need two times that frequency for it to work “perfectly”.
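For readers who want to check the phase question numerically, here is a minimal Python sketch. It samples a 20 kHz sine at 44.1 kHz with an arbitrary 37 degree phase offset (deliberately not 90 degrees) and then rebuilds the waveform between the sample instants with a sinc interpolation sum, which is the textbook idealization of the reconstruction filter. The test tone sits strictly below half the sampling rate, which is the case the sampling theorem covers. The phase value, the window of 2,000 samples on each side, and all of the names are assumptions chosen for illustration, and because the sum is cut off at a finite window a small residual error is expected.

```python
import math

FS = 44_100.0                 # CD sample rate in Hz
F = 20_000.0                  # test tone in Hz, strictly below FS / 2
PHASE = math.radians(37.0)    # arbitrary phase, deliberately not 90 degrees
M = 2_000                     # samples kept on each side of t = 0 (assumed window)

def tone(t: float) -> float:
    """The original continuous sine wave."""
    return math.sin(2.0 * math.pi * F * t + PHASE)

def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x), with the x = 0 case handled."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

# The only information kept: sample values at t = n / FS for n = -M..M.
indices = range(-M, M + 1)
samples = [tone(n / FS) for n in indices]

def reconstruct(t: float) -> float:
    """Truncated sinc interpolation of the stored samples at time t."""
    u = t * FS
    return sum(s * sinc(u - n) for n, s in zip(indices, samples))

# Evaluate 20 points per sample period, covering about ten sample periods.
times = [k / (20.0 * FS) for k in range(200)]
recon = [reconstruct(t) for t in times]

max_error = max(abs(r - tone(t)) for r, t in zip(recon, times))
peak = max(recon)

print(f"largest deviation from the original: {max_error:.4f}")
print(f"reconstructed peak amplitude: {peak:.4f}")
```

If the sketch behaves as intended, the rebuilt waveform tracks the original closely in both amplitude and timing, even though no sample was taken at the peak of the wave.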
This probably deserves another detailed post…but I thought I would pass my response to Paul back to the entire readership.