John Siau is one of the principals and Director of Engineering at Benchmark Media, makers of both professional and consumer audio equipment. He has almost 40 years of experience designing high-end analog and digital circuits for use in a variety of applications. One of his recent projects was upgrading the Benchmark DAC1 HDR to what is now the DAC2 HGC, which includes DSD conversion. I spoke to him about DSD technology after reading a few of the white papers on the topic. The illustrations below were taken from an article entitled “DSD – the new Addiction” by Andreas Koch, a principal in Playback Designs (a hardware company that manufactures DACs and advocates for the DSD format). They got me thinking about the DSD format.
Transcription of John Siau Interview April 3, 2013
MW: So let’s start with the basics of conversion technology shown in Figure 1 of the DSD document. What’s going on there?
JS: The block diagram shows a 1-bit oversampled ADC feeding PCM data to a 1-bit DAC. This topology was typical in the 1990s but does not apply to most PCM converters manufactured after about the year 2000. Virtually all of today’s PCM converters use oversampled 4-bit conversion. Oversampled 1-bit converters are a relic from the past. The additional 3 bits improve the SNR of the overall conversion system while greatly reducing the amount of noise shaping that is required.
Figure 1 – A block diagram of the conversion from analog to digital in PCM and DSD
MW: Can you explain the change to multi-bit oversampled converters?
JS: In the ’90s the music industry began to mix and process audio in digital audio workstations (DAWs). The first DAWs used 16-bit processing, and recording engineers quickly recognized that this 16-bit processing introduced noise and distortion. Manufacturers responded with 18, 20, 22, and 24-bit systems. It was the move to 24-bit conversion that ultimately made the 1-bit delta-sigma converter obsolete. Benchmark’s first step beyond 1-bit delta-sigma conversion was the DAC2004, a 20-bit DAC that was introduced in 1997. It had two 1-bit delta-sigma DACs wired in parallel. This configuration produced a 3 dB improvement over existing systems. With 4-bit delta-sigma converters we can now achieve a 130 dB SNR. This is a full 10 dB better than the channel capacity of 64x DSD. A 1-bit system simply doesn’t have enough room in the format for both dither noise and the audio signal. DSD is limited to a 120 dB SNR over the audio band. You can pass an audio signal that’s partially dithered or an audio signal that has no dither, but there’s not enough room to pass a fully dithered audio signal. You need more than 1 bit in order to be able to do that.
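The 3 dB figure for two paralleled converters matches a standard result: if the converters carry the same signal but their noise is uncorrelated, the signal sums in amplitude while the noise sums in power. The Python sketch below works through that arithmetic. The function name is illustrative, and the uncorrelated-noise assumption is an idealization that real hardware only approximates.

```python
import math

def parallel_snr_gain_db(num_converters: int) -> float:
    """Ideal SNR gain from summing identical converters with uncorrelated noise.

    Signal power grows as N**2 (amplitudes add) and noise power grows as N
    (powers add), so the SNR improves by a factor of N, i.e. 10*log10(N) dB.
    """
    if num_converters < 1:
        raise ValueError("need at least one converter")
    return 10.0 * math.log10(num_converters)

for n in (1, 2, 4, 16):
    print(f"{n:2d} converters: +{parallel_snr_gain_db(n):.2f} dB")
```

Under these assumptions each doubling of the converter count is worth about 3 dB.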
MW: OK, so DSD may have a reduced SNR, but the simplicity of the data path must offer some sonic advantages.
JS: The dotted lines in Figure 1 show how DSD can create a bypass path to eliminate several DSP blocks in the ADC and DAC (but this bypass assumes 1-bit conversion at each end). If DSD were designed today, we would probably consider using 4 bits instead of 1. The problem is that 1-bit DSD is nearly impossible to process. In the studio, DSD is processed as 8-bit data (or wider) at DSD sample rates, in a format known as DSD-wide. The additional DSD-wide data bits reduce the amount of noise shaping that must be applied in each processing step. Bottom line, the 1-bit DSD bypass shown in the diagram doesn’t really exist except in the very simplest direct-to-disk DSD recording.
MW: DSD recordings are not mixed in DSD?
JS: No, DSD signals are not 1 bit wide in the DAW. DSD becomes 4 bits wide or 8 bits wide or 16 bits wide depending on what word length the DAW can handle.
MW: Really. So is it true to say as a complement to what they say in here that every DSD file or project actually goes through a PCM stage as well?
JS: Shh! Please don’t use the PCM word! But OK, yes, it is probably safe to say that for about 99% of the DSD projects that have been done. There may be a few exceptions where somebody set up a microphone for a direct-to-disc DSD recording and did nothing else with it: no processing, no level changes, no filtering, no mixing. Or they used it for archiving an analog tape or a vinyl disc.
MW: Those are rare.
JS: Yes. We’re going to do multichannel recording, we’re going to have to do mixing, we’re going to have to do EQ, we’re going to have to do various effects and all that has to be done in PCM. Now the PCM can all be done at DSD sample rates, and that’s fine. That’s well and good. The conversion from DSD to PCM is really an automatic thing, a very benign conversion. If you don’t change the sample rate, there is no loss of quality when expanding the word length. As soon as you do any mathematical operation on the DSD, the word length expands. And so DSD goes from being 1-bit to being multiple bits. The DSD DAW manufacturer chooses how many bits of precision they wish to preserve in the processing – the more the better.
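The word-length growth described here can be seen with a few lines of Python. This is only a toy illustration with made-up sample values, not a model of any particular DAW: mixing two 1-bit streams, or attenuating one, produces values that no longer fit in a single bit.

```python
def levels(stream):
    """Return the distinct sample values in a stream, in ascending order."""
    return sorted(set(stream))

# Two hypothetical 1-bit streams; each sample is +1 or -1.
a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, -1, -1]

mixed = [x + y for x, y in zip(a, b)]   # simple two-channel mix
attenuated = [0.5 * x for x in a]       # gain change of about -6 dB

print(levels(a))           # two levels: still 1-bit
print(levels(mixed))       # three levels: needs more than 1 bit
print(levels(attenuated))  # two levels, but neither is a valid 1-bit code
```

Getting either result back into a 1-bit stream requires requantizing, which is the step the interview identifies as costly.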
MW: Is that an example of what Pyramix or the Sonoma system is doing? I mean productions that are actually happening in that domain, are they doing all these PCM steps and just keeping it under the covers?
JS: That’s exactly what they’re doing. Yeah. It’s PCM at the DSD sample rate. But, nothing bad happens up until that point. The loss of quality only comes when multi-bit PCM is dithered back down to 1-bit DSD. When you dither down to 1-bit, you’re adding huge amounts of quantization noise. Any 1-bit DSD signal has a 6 dB signal to noise ratio – at best (when the ultrasonic noise is included in the measurement). The noise situation gets worse when we have two cascaded multi-bit to 1-bit conversions (once in the ADC, and once in the DAW).
JS: And so you have to apply very, very aggressive noise shaping to keep the in band noise down but this comes at the expense of a tremendous amount of out of band noise.
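To make the trade-off concrete, here is a toy first-order 1-bit delta-sigma modulator in Python. It is a sketch under simplifying assumptions: real DSD encoders use higher-order modulators, and the playback filter is stood in for by a plain average. It shows that every individual sample is far from the input while the average of the stream still tracks it.

```python
def delta_sigma_1bit(samples):
    """First-order 1-bit delta-sigma modulator (toy model).

    Each output is +1.0 or -1.0. The integrator accumulates the difference
    between input and output, so the running average of the output follows
    the input even though each individual sample is far from it.
    """
    integrator = 0.0
    out = []
    for x in samples:
        y = 1.0 if integrator >= 0.0 else -1.0
        out.append(y)
        integrator += x - y
    return out

n = 10_000
target = 0.25  # constant input at a quarter of full scale
bits = delta_sigma_1bit([target] * n)

average = sum(bits) / n
error_power = sum((y - target) ** 2 for y in bits) / n

print(f"average of 1-bit stream:   {average:.4f}")
print(f"mean squared sample error: {error_power:.4f}")
```

The large per-sample error is the wideband quantization noise. Noise shaping does not remove it; it only moves it away from the low frequencies that the average (or a real low-pass filter) keeps.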
MW: Yeah, that’s the purple haze you see at the high end of the spectrum in DSD recordings.
JS: The spectrum analysis of DSD shows a huge amount of noise at high frequencies. You’ve got 6 dB of signal to noise ratio, at best, if you measure the signal to noise ratio of that whole wide-band signal before you go through the analog low-pass filter. That’s something they like to conveniently ignore, but it’s there in all the DSD systems.
MW: Back up on that a second. You mean the end result, if you include the ultrasonic noise, maxes out at a 6 dB signal to noise ratio?
JS: Yep, it can’t be any better than that.
MW: And so what a mastering engineer or an equipment manufacturer would do is simply apply a low pass filter to take the “purple haze” out of the equation.
JS: Unfortunately, the ultrasonic noise cannot be removed in mastering unless the DSD source is being transferred to PCM. The ultrasonic noise is always present in a DSD signal; it cannot be removed until the DSD signal is converted to analog or to PCM. This means that the noise must be removed in the playback hardware. If the DSD DAC is equipped with a well-designed analog low-pass filter, we can achieve signal to noise ratios that start to rival some of the better PCM systems. DSD doesn’t approach the 144 dB SNR performance of a 24-bit system, but it certainly exceeds the 96 dB SNR performance of the CD format. With a well-designed filter, DSD can achieve a 120 dB signal to noise ratio, roughly equivalent to a 20-bit PCM system.
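The 96, 120, and 144 dB figures line up with the common rule of thumb of about 6.02 dB of dynamic range per bit. A minimal Python sketch of that arithmetic follows. The constant and function names are illustrative, and the rule gives a theoretical ceiling for the format, not the measured performance of any converter.

```python
DB_PER_BIT = 6.02  # common rule of thumb for ideal PCM quantization

def approx_snr_db(bits: int) -> float:
    """Approximate theoretical SNR ceiling of a PCM word length, in dB."""
    return DB_PER_BIT * bits

for bits in (16, 20, 24):
    print(f"{bits}-bit PCM: about {approx_snr_db(bits):.0f} dB")
```

The more complete formula for a full-scale sine wave adds about 1.76 dB to each of these values.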
MW: They talk about making 20-20 kHz just stellar and they don’t really worry about anything higher than that because of this whole noise-shaping dilemma.
JS: Yeah. The problem is that DSD marketing materials often show a nice, well-formed high frequency square wave. But this waveform only exists before the analog low pass filter. It looks very different after the analog low pass filter. To his credit, Andreas Koch didn’t show the square wave in his paper, but it’s something that does appear in many DSD marketing materials.
MW: Yeah. I’ve seen it in the standard Sony DSD white paper.
JS: So as far as that Figure 1 is concerned, the conversion from DSD to PCM is a very benign conversion. The conversion from PCM back to DSD is where all the problems occur. If you can avoid ever going back to 1 bit, you’re much better off. For this reason, all modern DACs avoid dithering all the way down to 1 bit. They usually stop at 4 bits. The modulators will modulate down to 4 bits and not to 1 bit, so the noise shaping doesn’t have to be nearly so aggressive. With 4 bits, there is also adequate space for the required dither.
MW: Why would they continue to make the claim that the bandwidth, just like in analog systems, goes up to 100 kHz?
JS: Well, it does before the analog low pass filter. Unfortunately the low pass filter is an absolute necessity.
MW: Because of all the noise that’s been shifted up there, right?
A DSD spectragraph without the low pass filter to remove the HF noise. Notice the butterfly plot on the right.
JS: They like to conveniently ignore the fact that a 50 kHz low pass filter is required in any practical DSD system and that it is, in fact, a requirement of the specification.
MW: Is it really?
JS: Yeah. The high-frequency noise is a disaster if it reaches power amplifiers and speakers. 128x DSD offers some improvements which allow expanding the usable bandwidth above the 50 kHz limit of 64x DSD.
MW: But then the files are going to get huge.
JS: Yes, but file size is less of an issue these days. In my opinion, DSD and PCM are both good distribution formats. They’re both perfectly adequate for distributing the final product to the consumer. PCM is a little bit easier for the consumer to work with and PCM simplifies the playback hardware. It’s a lot easier to do PCM volume control. It’s a lot easier to do soft fades, crossfades, or any other processing that is required for playback. All of these processing functions are far easier to do with a PCM source than with a DSD source. But, if you put the processing issues aside, DSD is adequate for conveying the entire signal to noise ratio and bandwidth captured in any of today’s best recordings.
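For a rough sense of the sizes involved, the raw per-channel data rates can be computed directly. The Python sketch below uses the standard definition of DSD rates as multiples of 44.1 kHz and assumes uncompressed data with no container overhead. The function names are illustrative.

```python
CD_RATE_HZ = 44_100  # DSD rates are defined as multiples of the CD sample rate

def dsd_bitrate(multiple: int) -> int:
    """Bits per second per channel for 1-bit DSD at multiple x 44.1 kHz."""
    return multiple * CD_RATE_HZ

def pcm_bitrate(sample_rate_hz: int, bits: int) -> int:
    """Bits per second per channel for uncompressed PCM."""
    return sample_rate_hz * bits

formats = {
    "CD 16/44.1": pcm_bitrate(44_100, 16),
    "PCM 24/96": pcm_bitrate(96_000, 24),
    "PCM 24/192": pcm_bitrate(192_000, 24),
    "DSD 64x": dsd_bitrate(64),
    "DSD 128x": dsd_bitrate(128),
}

for name, bps in formats.items():
    print(f"{name:>10}: {bps / 1e6:.4f} Mbit/s per channel")
```

On these numbers, 64x DSD sits between 24/96 and 24/192 PCM, and 128x DSD is somewhat larger than 24/192.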
MW: Is that inclusive of any of the things that I do? I’ve got Wallace Roney and the spectragraphs that I look at that and other things that we’ve done exceed 40-45 kHz. And my justification is that I don’t really care whether the speakers and the rest of the hardware can actually reproduce the increased fidelity. But if there was a musical sound in the room when they were performing, I want to be able to capture it and preserve it through the entire production chain. And because Blu-ray and DVD-Audio can deliver frequencies higher than the traditional human limits, I say let’s try to reproduce everything. Given the situation with SA-CD and DSD and this whole noise shaping thing, it doesn’t sound like that is an option for them.
JS: Right. These frequencies are above the playback capability of DSD. Remember, you’ve got a 50 kHz low pass filter, which means you haven’t got a chance of accurately reproducing anything over about 47 kHz in DSD. The filter introduces phase distortion, amplitude errors, and ringing as we approach the 50 kHz cut-off frequency. In contrast, 96 kHz PCM will capture your ultrasonics just fine.
MW: Yeah. That’s what I use when I’m recording.
JS: And we’re not very good at capturing anything that’s much above 48 kHz at this point.
JS: If you look at the paper here and look at Figure 2, that is a very, very misleading figure cause it’s really…what you have is an FFT of the DSD and you have a straight line that’s drawn based on 6.02 dB per bit at approximately 144 dB representing the 24-bit PCM. That’s not what the 24-bit PCM will look like on an FFT.
Figure 2 – A DSD 64 FFT plot vs. a PCM (96/192/384 kHz) line chart.
MW: What would be different?
JS: Oh, it will be way quieter, way lower level than that DSD that you see there. You gotta be comparing FFTs to FFTs, not a line that’s drawn on there based on a calculation that’s not valid in this case. I have been rather outspoken about this from time to time. I have a product that supports DSD playback and I support DSD playback because…the theory being that if you have DSD material that you want to play back, I want to give you a way to do it. It shouldn’t be absolutely necessary to convert it first. If you want to play it back directly, we’ll give you a way to do that.
JS: We do not recommend it at all for any kind of studio production work. It’s just completely unsuitable for professional applications…for any production work. The only way it should exist, if it exists at all, should be as the final output from a mastering room, where for whatever reason we want to distribute this in a DSD format. Okay, let’s create a master in a DSD format that we can distribute.
MW: So what goes on earlier in the production process could be analog tape or anything else including PCM.
JS: Right. And PCM is a wonderful format to do all your production work. Do all your mixing, all your EQ, all your processing that you’re going to do, everything that you’re going to do in the mastering process and then the very final output can be DSD. There will be some loss in quality when you do the PCM to DSD conversion, but this loss is just because of the limitations of the DSD and not due to any limitations of the conversion process. You’ll have a better result doing that than trying to do all the processing in DSD.
JS: We make what a lot of people have called the best D to A converter for DSD playback. They’re thrilled with the way our converter sounds for DSD playback. We worked hard to make sure that we did it all natively but DSD is not a format that I think is a great idea. It’s not.
MW: But it’s out there.
JS: It’s out there and we want to support it because it’s out there, but not because we want to encourage the proliferation of it.
MW: How do you handle volume control in that final output stage? Do you convert to analog and then turn it up and down?
JS: We actually don’t. We do process that at the high sample rate and we have multiple 1-bit converters that are available to us. So the increase in word length that we get as a function of that volume control makes use of the redundant 1-bit converters that we have running in parallel.
MW: I see.
JS: So we’re not converting it…in a way you could look at that as if it’s PCM because there’s multiple 1-bit converters summed together in the analog domain. But that’s what you have to do to get volume control to work. The good thing is we don’t take it from 1-bit to multi-bit and back to 1-bit before we convert it to analog.
MW: Yep, as you were saying before.
JS: Instead of sending identical DSD signals to sixteen balanced 1-bit converters that are wired in parallel, we start sending different DSD signals to reduce the signal amplitude. All summing occurs in the analog domain. It is very cool!
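The interview does not say how the different DSD signals are generated, so the Python sketch below is only a toy model of the summing idea and not Benchmark’s implementation. It assumes sixteen ideal 1-bit outputs of +1 or -1 and a hypothetical `drive_pattern` helper. The point is that identical signals give only the two extreme levels, while differing signals make the intermediate levels reachable.

```python
NUM_CONVERTERS = 16  # converter count mentioned in the interview

def analog_sum(bits):
    """Sum the outputs of parallel 1-bit converters (each +1 or -1)."""
    return sum(bits)

def drive_pattern(num_positive: int):
    """Hypothetical pattern: num_positive converters output +1, the rest -1."""
    return [1] * num_positive + [-1] * (NUM_CONVERTERS - num_positive)

# Identical signal on every converter: only the two extreme levels appear.
full_scale = {analog_sum(drive_pattern(k)) for k in (0, NUM_CONVERTERS)}

# Different signals per converter: every intermediate level becomes reachable.
all_levels = sorted(analog_sum(drive_pattern(k)) for k in range(NUM_CONVERTERS + 1))

print(sorted(full_scale))
print(all_levels)
print(len(all_levels), "distinct output levels")
```

In this idealized model the summed output has seventeen levels, slightly more than a 4-bit word can express, without the signal ever being requantized to 1 bit in the digital domain.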
MW: There are a lot of varying positions on the validity of DSD and I appreciate your frank assessment and experienced input.
JS: And Stanley Lipshitz and John Vanderkooy did a lot of work on this. They wrote a lot of papers and a lot of it fell on deaf ears.
MW: I’ve read those and actually met Stanley at an AES meeting in the UK some years back.
JS: I actually had a SONY engineer say to me one time, and this is quite a few years ago…he said, “we realized after we got a ways down the road that DSD was kind of a mistake but we had too much invested in it”.
MW: Wasn’t archiving their whole reason for coming up with it in the first place? It was going to be used to take the analog masters in their vault and put them in a format that they thought would preserve the most fidelity, right?
JS: Yeah. And conceptually it looked like a simple approach. And, DSD significantly outperformed the 16-bit PCM systems that were common at the time. As a distribution format, DSD is definitely a big step above 44/16 CDs, and we want to give people the best possible playback of the wonderful DSD recordings that already exist.
MW: And they tried to put it in as the successor to the CD, and that’s where we got a format war.
JS: Yep. Moving forward, we should focus on 24/96, and 24/192 downloads as these formats offer the best quality available.
I would like to thank John Siau for sharing his expertise on this topic. Using DSD 64/128 for production work is clearly not a viable option for high-end music and it is doubtful that moving forward with DSD for downloading will have any benefit for music lovers. In fact, it may just confuse things all the more.