Audiophile Midterm Answers

October 2, 2014 Dr. AIX

Yesterday, my post was a mini test. It consisted of 10 questions…a few True and False questions and a bunch of multiple choice questions. The students at the university would love to see me prepare a test full of multiple choice questions…but I required them to write out their answer. Consider yourselves lucky.

Here are the answers:

Question 1: The correct answer is letter “B” or 60 dB. Analog tape machines are not very good at capturing dynamics.

Question 2: None of the formats listed in this question qualify as high-resolution. DSD 64 comes the closest but because of the excessive noise that exists in the ultrasonic frequency range, I leave it out. The correct answer is letter “E”.

Question 3: False. Analog recordings don’t gain dynamic range just because they’ve been transferred to a large digital bit bucket.

Question 4: The correct answer is 4 bits or letter “D”. The mastering process used on commercial releases reduces the overall dynamic range of virtually all CDs to less than 10-12 dB, or less than 4 bits.

Question 5: The answer is letter “A”. The Dolby Digital or AC-3 encoding scheme is a lossy algorithm and therefore doesn’t reproduce all of the fidelity that was contained in the original source recording.

Question 6: The correct answer is “D”. If the substance that you use on your optical discs cleans the surface and is completely removed from the surface after use, then it might result in better transmission of data from the “pits” of the disc to the optical pickup. The rest of the items on the list were actually taken from published statements about these products, from magazine reviews, or from customer testimonials.

Question 7: False. This notion is very widely believed by analogophiles. It’s patently false. Both statements are false. We don’t measure analog systems with terms like resolution AND there are no stair steps in a PCM digital recording when it’s converted back to analog (in fact the data isn’t stair steps either).

Question 8: False. Moving from 16 bits to 24 bits lowers the noise floor and provides a safer margin from overages and distortion for recording engineers during sessions.
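
To put rough numbers on that lower noise floor, here is a minimal Python sketch of the textbook approximation for an ideal N-bit PCM quantizer driven by a full-scale sine wave: about 6.02 dB per bit plus 1.76 dB. The function names are made up for illustration, and real converters and real recordings fall short of these theoretical figures.

```python
import math

DB_PER_BIT = 20 * math.log10(2)        # about 6.02 dB for each added bit
SINE_OFFSET_DB = 10 * math.log10(1.5)  # about 1.76 dB for a full-scale sine


def theoretical_dynamic_range_db(bits):
    """Ideal signal-to-quantization-noise ratio of an N-bit PCM word, in dB."""
    return bits * DB_PER_BIT + SINE_OFFSET_DB


def equivalent_bits(dynamic_range_db):
    """Inverse of the formula above: how many ideal bits span a given range."""
    return (dynamic_range_db - SINE_OFFSET_DB) / DB_PER_BIT


for bits in (16, 24):
    print(f"{bits}-bit: {theoretical_dynamic_range_db(bits):.1f} dB")
# 16-bit: 98.1 dB
# 24-bit: 146.3 dB
```

Under these assumptions the extra 8 bits add about 48 dB of theoretical headroom below the signal, which shows up as a lower noise floor.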

Question 9: The correct answer is “E” or all of the above. Dither is a very useful technique for removing or masking quantization noise and spreading it across the entire frequency spectrum.
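
As a rough illustration of what dither does, the Python sketch below quantizes a sine wave whose peak is only 0.4 of one quantization step. It assumes a simple rounding quantizer and flat triangular (TPDF) dither spanning plus or minus one step; the names, the seed, and the signal parameters are all arbitrary choices for the example, not anything taken from a particular converter or mastering tool.

```python
import math
import random

LSB = 1.0        # quantizer step size (assumed unit)
N = 20000        # number of samples (arbitrary)
PERIOD = 100     # sine period in samples (arbitrary)
AMPLITUDE = 0.4  # peak level in LSBs, deliberately below half a step


def quantize(x):
    """Rounding quantizer with a step of one LSB."""
    return math.floor(x / LSB + 0.5) * LSB


def tpdf_dither(rng):
    """Triangular-PDF dither: sum of two uniform values, spanning +/-1 LSB."""
    return (rng.random() - 0.5 + rng.random() - 0.5) * LSB


def recovered_amplitude(samples, reference):
    """Project the samples onto the reference sine to estimate its level."""
    return 2.0 * sum(s * r for s, r in zip(samples, reference)) / len(samples)


rng = random.Random(0)  # fixed seed so the run is repeatable
reference = [math.sin(2 * math.pi * n / PERIOD) for n in range(N)]
signal = [AMPLITUDE * r for r in reference]

plain = [quantize(x) for x in signal]                        # no dither
dithered = [quantize(x + tpdf_dither(rng)) for x in signal]  # TPDF dither

plain_level = recovered_amplitude(plain, reference)
dithered_level = recovered_amplitude(dithered, reference)
error_power = sum((y - x) ** 2 for x, y in zip(signal, dithered)) / N

print(f"undithered level: {plain_level:.3f} LSB")   # every sample rounds to 0
print(f"dithered level:   {dithered_level:.3f} LSB")
print(f"dithered error power: {error_power:.3f} LSB^2")
```

Without dither every sample rounds to zero and the tone is simply gone. With dither the tone survives at close to its original level, at the cost of a total error power near one quarter of a squared step, which is about 4.8 dB more noise than the one twelfth of a squared step usually quoted for undithered quantization.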

Question 10: False. Although this statement has been widely used on websites and in promotional brochures that advocate for DSD, the fact is that DSD 64 recordings or SACDs have about the same specifications and fidelity as a standard CD.

So how did you do? I’m expecting to receive some push back from some of you on several of these questions. So go ahead and let’s discuss.

As for grading…here’s how I would score things:

9-10 correct deserves an A
8 correct earns a B
7 gets a C
6 out of 10 lets you squeak through with a D
and anything less means you fail.

I hope you enjoyed the Audiophile Midterm. Keep studying and we’ll have the final exam in a few months.

29 thoughts on “Audiophile Midterm Answers”

stewart kiritz

October 2, 2014 at 12:01 pm

I’m wondering about this assertion regarding SACDs. To me there is an enhancement. Are you saying this is just a result of my expectations and that SACDs do not have any greater fidelity than ordinary CDs?
- Admin
  
  October 2, 2014 at 12:12 pm
  
  Yep…SACDs are standard fidelity…not much better than CDs in terms of specs.
- Todd
  
  October 3, 2014 at 7:50 am
  
  Also, when an album is re-released as an SACD, it often is remastered in the process. So on the few SACDs I own, I attribute their superior sound almost exclusively to the improved ADC and mastering choices of the engineer. And on the flip side I have listened to SACDs that sound worse than the CD release; again, that comes down to personal preference in mastering.
lazy

October 2, 2014 at 12:37 pm

First, I repeat that this test is my favorite article you’ve posted yet, among many good articles. The questions revealed deep insight and thoughtfulness. Outstanding!

That stated, I only got #9 incorrect according to the answer key and as is my rep from my school years many years ago, I’m going to state my case for my answer 🙂

I selected “b”, and the answer key says “e”. I think what drove me to what I selected was the notion that noise-shaped dither (which is available in any competent sampling app these days) is specifically designed to avoid equally spreading digitization errors across the audible frequency range, and specifically designed to push such errors above the audible range (or at least to less audible areas per Fletcher-Munson curves) as much as possible. As a result of using noise-shaped dither, dynamic range issues can be avoided and in many cases dynamic range can be enhanced relative to not using dither across much of the audible frequency range when resampling.

But I quibble – this quiz really demonstrated your depth of knowledge of many of the keys to high quality audio. Great job!
Alex S

October 2, 2014 at 12:46 pm

I enjoyed this a lot. I’m ashamed to say I got a B, missing questions 5 and 8. Especially 8..after we just finished going over this….I’d like to have a test again in a few months.
Wayne Smith

October 2, 2014 at 12:48 pm

I’m trying to understand Q4. “The mastering process used on commercial release reduces the overall dynamic range of virtually all CDs to less than 10-12 bits or less than 4-bits.”
Was that meant to read to less than 10-12db or less than 4-bits?
- Admin
  
  October 2, 2014 at 6:56 pm
  
  You’re right…I mistyped
Dave Griffin

October 2, 2014 at 1:03 pm

This is the problem with the current education system, you pre-suppose that all answers would be from using analytical judgement. A modern student would simply search your history on the “web” to get all the “correct” answers and thus an “A”.
Kenrick Harry

October 2, 2014 at 1:15 pm

Professor,

My score was a “D”, 6 correct out of 10. I had a lot of fun doing the exercise and look forward to more of the same in the near future.
FB

October 2, 2014 at 2:08 pm

Got an A – was a busy bee :-))
Édouard Trépanier

October 2, 2014 at 2:19 pm

Thank you for the opportunity to qualify my “not so good” result. So, with all due respect Professor:

2. Yes, DSD 64 will roll off high frequencies to cut off noise and this may or may not be good enough for HRA. We will see how HRA is defined on the market. However, the high frequency potential and mostly the dynamic range (including SNR) are better than a CD’s potential. This being said, I would have all my music UHRA (PCM 192/24) if I could.

4. I guess this one is a push back. The question is about capturing the music from the band. That is the « pick up » stage. Whatever goes out on the market, an audio engineer always seeks for the best dry master. If I had the chance to record « Babe I’m Gonna Leave You » with Led Zeppelin, I would want 24 bits: 18 for Mr. Page’s acoustic guitar, a few more bits for Mr. Bonham (son) and 4 more for the post-prod including dither.

6. Have you ever heard a well-protected CD (kept in its jewel case) sound better after some gel application? If a CD needs to be cleaned, I believe one does not need any “audiophile “gels” or other “enhancement” liquids” but only distilled water. Furthermore I would think that “pits” have to be badly dirty to cause the 0s and 1s not to be read correctly.
- Admin
  
  October 2, 2014 at 6:57 pm
  
  True…for the recording side. But the delivery side post mastering is where we lose it.
Bob Walters

October 2, 2014 at 4:47 pm

Because of the nature of sigma-delta converters, one cannot make a direct comparison between DSD and PCM. An approximation is possible, though, and would place DSD in some aspects comparable to a PCM format that has a bit depth of 20 bits and a sampling frequency of 96 kHz.[22] PCM sampled at 24 bits provides a (theoretical) additional 24 dB of dynamic range. – Wikipedia “DSD”

I have no axe to grind, but equating DSD with PCM44 seems a stretch. Maybe 24/88?

Bob
- Admin
  
  October 2, 2014 at 6:58 pm
  
  You might convince me that the dynamic range is improved with DSD over CD…but the fact remains that no recordings are delivered with lots of dynamic range. The frequency thing, I’m holding tight. The high frequency shifted noise is present at 23 kHz.
SteveC

October 2, 2014 at 5:47 pm

I have a quibble regarding Q9. While a proper “flat” triangular dither will bring the measured dynamic range of a CD down to 93 dB, a noise-shaped dither will improve the perceived dynamic range (as in how the auditory system “hears” it) by some amount resulting in a dynamic range greater than 96 dB depending on the dither method–while at the same time worsening the measured dynamic range.

Also regarding the answer to Q9, you say “dither is a … technique for removing or masking…” “removing” and “masking” are different things. Dither does not “mask” quantization noise at all, but it certainly does remove quantization error. “Masking” implies that the error is still there but covered up. But really, a proper dither will remove the quantization error entirely at the expense of the added low-level (inaudible under standard listening conditions) noise. I’ve seen audiophiles argue that dither “simply masks” the quantization error by burying it in noise, but nope. It actually removes the error.
- Admin
  
  October 2, 2014 at 6:59 pm
  
  Uncle.
Ran

October 2, 2014 at 7:44 pm

Mark,

Regarding Q8. How do you define resolution? If we reduce the noise floor, don’t we gain “clearer” data? hence resolution?
Paolo

October 3, 2014 at 1:46 am

Can you please explain a bit more no. 8? For me it’s counterintuitive that increasing the number of available bits does not improve the fidelity of the recording. Great post anyway.
- Admin
  
  October 3, 2014 at 5:34 am
  
  The benefit from increasing the number of bits in a PCM digital word is a lower noise floor or, put another way, an increase in the potential dynamic range. I have authored several posts and my friend John Siau from Benchmark Media has provided great insight into this topic as well. Click here to read my post.
Roger Jones

October 3, 2014 at 12:26 pm

DYNAMIC RANGE

Hello Mark,

I have been following your posts for a while now and found them very interesting and informative. I think we are all after the same thing, better sound. The issue remains how to get it, regarding both technical and marketing issues.

The main quibble I have with what you write is your use of the term dynamic range. Correct me if I am wrong; you define the dynamic range of any specific recording as the difference between the maximum and minimum of the amplitude envelope during the time of the recording.

Since I was an audio equipment designer in the 1960s, I defined dynamic range as the difference between the loudest and softest tones the human ear can detect in a recording. For example, a person can hear a violin playing very softly in the presence of very loud, sustained percussion. I contend this definition is the one that is important in determining how many bits it takes to make a faithful recording. The soft violin with its even lower harmonics must be accurately reproduced even in the presence of other loud instruments.

When talking about types of recording and playback, I use the term signal-to-noise ratio to quantify the difference between the maximum signal before overload and the noise floor. The dynamic range can be somewhat greater if the minimum detectable signal is below the noise floor, the nature of the noise being random (Gaussian) and broadband. Perhaps I nitpick because these distinctions are important to instrumentation engineers like me. Depending on how low (or high) the noise floor of a recording is, I have found the ear can often hear instruments that are playing below the noise floor. A very fine playback system can spatially separate an instrument from the background noise.

Roger
- Admin
  
  October 4, 2014 at 5:23 am
  
  Interesting Roger. I’m running this morning but will get back to this comment soon.
- lazy
  
  October 4, 2014 at 1:26 pm
  
  I’ll be interested in Mark’s thoughts on this of course, but I think you bring up several insightful points, Roger. In the area of music measurement, I don’t believe there is a universally accepted definition for “dynamic range” yet, so I would recommend anyone using such a term or similar to define it when used. I also agree with (or perhaps prefer) a dynamic range definition for music that reflects audibility impacts instead of just signal amplitude. I believe that’s consistent with perhaps the best source of international standards on the concepts of loudness and the like – ITU BS.1770, which includes EBU R128 measures for “peak”, “loudness” and “loudness range” (aka LRA). There is no BS 1770 measure (yet) for dynamic range, but a few credible people use something called “PLR” (peak to loudness ratio), which is essentially a crest factor measurement that uses the BS 1770 definitions for peak and loudness, and is thus audibility based instead of just signal amplitude-based like TT DR.
  
  Mark, hopefully you recall I pontificated in a similar manner when wondering if or how your upcoming HRA database would define and measure dynamic range-related info that, at least for me, is perhaps the most critical factor in deciding to buy HRA versions of CD quality music I already own. I realize that any measure that can’t be efficiently performed for a large number of songs is not very practical, so there are tradeoffs to consider in this area. Fascinating topic.
  - Admin
    
    October 4, 2014 at 1:38 pm
    
    Thanks for the input…I can see that I’m going to have to look into this more.
- lazy
  
  October 4, 2014 at 1:41 pm
  
  Separate from dynamic range definitions, my understanding of “noise floor” and the audibility of sounds below the “noise floor” is different than yours Roger, but I’m no expert in this area. In the past when friends had claimed that they could hear music below the noise floor, I was able to trace down the cause of that to different definitions of “noise floor”. More specifically, many people tend to define “noise floor” as an A-weighted or RMS or similar spectrum-wide measure, and I define “noise floor” as a plot of frequency and decibels. In every case I encountered when someone claimed that they could hear music below the noise floor, what they really meant was that the frequency content of what they were hearing was above the noise level at those same frequencies, even though the A-weighted or other spectrum-wide measure showed a higher dB than the music sounds in question. In short, I don’t believe anyone can discern music content below the noise level at that same frequency. It gets tricky when you might hear harmonics of a fundamental tone in music, such that the frequency profile of the noise floor at the fundamental frequency is higher for the fundamental, but below for the harmonic(s), so one could hear the harmonic content above the noise floor for the harmonic frequency(ies).
  - Admin
    
    October 4, 2014 at 2:26 pm
    
    I’m with you on this…no one can perceive or hear music elements that are lower than the noise floor.
Andrea

October 3, 2014 at 5:37 pm

6. If you accept that some of these products can improve reading of the pits, then all are true. If the laser sled has less hunting to do, then its servo is producing less electrical noise. It may also reduce jitter. In a one box CD player, both have the potential to improve the sound. That’s the theory, anyway; this comment should not be taken as advocating such tweaks.

8. I’m still waiting to read your definition of resolution that causes this question to be false. Every reference I’ve been able to find – whether related to audio or analog to digital conversion in general – is consistent about more bits equaling greater resolution. Being able to discriminate smaller fluctuations within the same range – e.g. -10 to +10 V – is consistent with the usual description of resolution in a measurement system. (Noise shaping isn’t free. Increasing resolution in one frequency range costs you resolution elsewhere. Adding dither has a number of benefits, but it doesn’t increase resolution either.)

Your definition of HRA is different than that of the CEA and DEG, but we can still have intelligent conversations about it, because you have clearly defined what you mean by it. If you are going to use a nonstandard definition of resolution, you should define it equally clearly. Until you do, I think you’ll continue to get pushback on this assertion.
- Admin
  
  October 4, 2014 at 5:30 am
  
  I would point you to John Siau’s paper at the Benchmark website. The addition of more bits doesn’t increase the resolution of a sampling system; it lowers the noise floor. This is different than having more amplitude levels within the same amplitude range.
  
  I’m currently traveling in Grand Rapids, Michigan for the weekend…and have limited time.
Andrea

October 4, 2014 at 11:32 am

Increasing the number of bits does give you the ability to record more amplitude levels in the same amplitude range within a given bandwidth. If you think this isn’t true, I suggest you go back to the beginning on learning how analog-to-digital converters work. In many cases, the least significant bits are just recording random fluctuations in the analog noise floor of the system. That is still an increase in resolution of the digital system.

I find Mr. Siau’s white papers and application notes to be generally well-written and informative. I would also put more faith in what he writes than in what is put out by other audio designers. However, in the article to which you are referring, he doesn’t define resolution either, and cites no references. That makes the article far from authoritative.

Reading through the comments over all of your related posts, I see I am not the only person with a technical background who has trouble with your statement. Perhaps if you, or he, can be more specific as to why you believe the ability to track smaller signals within the same overall amplitude range does not constitute higher resolution, we’d find that we all agree.
- Admin
  
  October 4, 2014 at 12:45 pm
  
  I think we’re discussing different ideas here. I trust John on this point that increasing the number of bits doesn’t give you additional resolution…at least how we define it.
