The NPR “How Well Can You Hear Audio Quality?” quiz is seriously flawed. The idea of comparing two levels of MP3 encoding against an “uncompressed” WAV has merit. However, the choice of material and the implementation of the test weren’t handled as well as they could have been. Yesterday, I introduced the NPR evaluations (click here to read the first post) and today I’m going to provide some analysis of a couple more of the 5 tracks. I’ll complete the process over the next couple of days.
I covered the Murray Perahia classical selection yesterday. Next up was the Coldplay segment. Their smash single “Speed of Sound” is an example of a highly processed, dynamically flat commercial rock tune. The mastering engineer did a real number on this track, making it practically devoid of sonic interest…at least for me.
Figure 1 – Coldplay from the NPR listening evaluation.
Here’s the spectrogram of the three presentations of the track:
Figure 2 – The Coldplay “Speed of Sound” spectra from the NPR test.
There are a number of things to notice about this illustration. The first is the absolute lack of dynamic range. The tune fades up and stays pretty flat. That means the mastering engineer used a lot of dynamic compression…even limiting. It is interesting to note that the lowest fidelity file has the most dynamic changes AND has the loudest peaks. To properly evaluate this track, the dynamic profiles should be identical. The people who prepared this file did a poor job of ensuring that the only factor being compared is the format.
It’s clear that the MP3 at 128 kbps has no frequencies above 16 kHz while the WAV file and the MP3 at 320 kbps have almost identical spectrograms. I don’t know what MP3 encoder the folks at NPR used but it created artifacts (the thin red lines) above 16 kHz in the 320 example. As you’ll see below, a good MP3 encoder will not introduce this type of artifact.
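For readers who want to check a cutoff like this without a spectrogram tool, here is a minimal sketch of the idea. It is not how the plots above were made. It assumes the audio has already been decoded to a list of floating-point samples, and it uses synthetic test tones in place of the real excerpts. The function names (`goertzel_power`, `highest_active_frequency`, `make_tones`), the 1 kHz probe spacing, and the -60 dB floor are all illustrative choices of mine.

```python
import math

def goertzel_power(samples, sample_rate, freq):
    """Power at `freq`, scaled so a full-scale sine on that bin gives 1.0."""
    n = len(samples)
    k = round(n * freq / sample_rate)              # nearest DFT bin
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s1 = s2 = 0.0
    for x in samples:                              # Goertzel recurrence
        s0 = x + coeff * s1 - s2
        s2, s1 = s1, s0
    power = s1 * s1 + s2 * s2 - coeff * s1 * s2    # squared DFT magnitude
    return power / (n * n / 4.0)

def highest_active_frequency(samples, sample_rate, step_hz=1000, floor_db=-60.0):
    """Highest probed frequency whose level exceeds `floor_db`, or None."""
    top = None
    freq = step_hz
    while freq < sample_rate / 2:
        power = goertzel_power(samples, sample_rate, freq)
        level_db = 10.0 * math.log10(max(power, 1e-30))  # guard against log(0)
        if level_db > floor_db:
            top = freq
        freq += step_hz
    return top

def make_tones(freqs, sample_rate=48000, n=4800, amp=0.2):
    """Synthetic stand-in for decoded audio: a sum of sine tones."""
    return [sum(amp * math.sin(2 * math.pi * f * i / sample_rate) for f in freqs)
            for i in range(n)]

if __name__ == "__main__":
    band_limited = make_tones([1000, 8000, 15000])        # nothing above 16 kHz
    full_band = make_tones([1000, 8000, 15000, 19000])    # content near 19 kHz
    print(highest_active_frequency(band_limited, 48000))
    print(highest_active_frequency(full_band, 48000))
```

On real music you would run this over many short windows and probe more finely, since a single coarse scan can miss narrow artifacts like the thin lines in the 320 kbps plot.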
You’ll note that I’ve captured all of these tracks at 96 kHz/24-bit PCM. The right-hand spectral plot shows the low level noise above 22 kHz. I’ve zoomed in to the spectra on the left to highlight the differences.
I also spent some time yesterday assembling a chart of the dynamic and loudness parameters of these tracks. Here’s the spreadsheet for the Coldplay “Speed of Sound” excerpt:
Figure 3 – The dynamics and loudness information about the Coldplay example.
The 128 kbps MP3 of the Coldplay excerpt has the loudest peaks, clipped samples, and the lowest RMS value (less amplitude compression). So when you’re listening, this one could easily fool you.
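The figures in a report like this (peak level, RMS level, clipped samples) are straightforward to compute yourself. The sketch below assumes samples are floats normalized so that full scale is ±1.0, and it counts any sample at or beyond full scale as clipped. The analysis tool behind the spreadsheet may use different definitions, so treat the numbers as comparable in spirit only. The names `dynamics_report`, `to_db`, and `sine` are hypothetical.

```python
import math

def to_db(value):
    """Convert a linear amplitude to decibels relative to full scale (1.0)."""
    return 20.0 * math.log10(value) if value > 0 else float("-inf")

def dynamics_report(samples, clip_level=1.0):
    """Peak, RMS, crest factor and clipped-sample count for a non-empty list."""
    peak = max(abs(x) for x in samples)
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    clipped = sum(1 for x in samples if abs(x) >= clip_level)
    return {
        "peak_dbfs": to_db(peak),
        "rms_dbfs": to_db(rms),
        "crest_db": to_db(peak) - to_db(rms),   # peak-to-RMS ratio in dB
        "clipped_samples": clipped,
    }

def sine(amp, freq=1000, sample_rate=48000, n=4800):
    """Synthetic test tone standing in for a decoded excerpt."""
    return [amp * math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]

if __name__ == "__main__":
    clean = sine(0.5)
    # Crude stand-in for heavy limiting: boost by 4x, then hard-clip at full scale.
    squashed = [max(-1.0, min(1.0, 4.0 * x)) for x in clean]
    print(dynamics_report(clean))
    print(dynamics_report(squashed))
```

The squashed version reports a higher RMS level, a smaller crest factor, and a nonzero clipped-sample count, which is the pattern to look for when comparing a heavily limited master against a more dynamic one.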
Next up during my trial was the Neil Young song excerpt. I got this one correct. The paragraph in the gray box talks about Pono, Neil’s “high-quality” download music service (it’s actually called PonoMusic…the device is called Pono). The clip, “There’s A World” from the 1972 album “Harvest” used on the NPR test, comes preloaded on the Pono Player.
Figure 4 – Neil Young’s “There’s A World” information on the NPR site.
Then comes a line that’s generous to say the least, “Many of the songs in the Pono store are offered as CD-Quality downloads, but much of Young’s catalog is available at the highest possible quality, as super-hi-def 24bit/192 kHz FLAC files”. You’ve got to love the lack of research and the new superlative terminology invented by the writers at NPR.
NPR may imagine that 99.9% translates to “many”. I would use something more like “virtually all of the content on the PonoMusic website is CD spec”, which I feel gives a more accurate comparison of “so-called high-res” vs. CD spec tracks. That Neil’s 1972 masterpiece “Harvest” gets elevated to “super-hi-def” status because of the delivery container’s specification is also off the mark. Strike two for the authors.
Here are the spectra of the three versions of “There’s A World”:
Figure 5 – Spectra of the three versions of “There’s A World” by Neil Young.
What do we see in this illustration? There is a LOT more dynamic range. The time vs. amplitude light green plot at the top left of the page shows the traditional timeline and waveform. There are quiet sections and loud sections. Bravo to Elliot Mazer (the engineer), Neil, and the record company for giving us an album with some of the fidelity left in.
The spectra show CD spec or standard resolution for the uncompressed WAV. The 320 kbps version is very close to the WAV, with the 128 kbps MP3 topping out at 16 kHz once again. I have to question the encoding tools or techniques used by the NPR engineers because when I did the conversion to 320 kbps MP3, the resultant file topped out at 16 kHz. There isn’t anything higher than that. Here’s the plot that I got:
Figure 6 – Spectra of “There’s A World” converted to 320 kbps MP3 from 192 kHz/24-bit FLAC.
I have the “Harvest” album in “super-hi-def 24bit/192 kHz FLAC”. I converted the same section of “There’s A World” to a 320 kbps MP3 file using Adobe Audition…and I don’t get the same high-frequency extension that the NPR guys got. I don’t know what tools they used. I’m going to write to them when I finish this series of posts and see what I can find out.
If the test isn’t prepared correctly, then the results are going to be unreliable. So far, I’m not feeling too confident in their selections or procedures.
Finally, here’s the dynamic analysis of the Neil Young tune:
Figure 7 – The dynamics and loudness report on “There’s A World” from the NPR quiz.
Once again, the loudest peak comes from the 128 kbps MP3 file. The uncompressed WAV has the lowest RMS loudness and the widest dynamic range. The ITU-R BS.1770-2 Loudness reading is a very good -20 LUFS! Compare that to the -10 associated with the Coldplay excerpt.
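To put that gap in perspective, a difference in loudness units maps directly onto decibels: applying x dB of gain to a program shifts its BS.1770 loudness by x LU. The short sketch below turns two readings into a gap and a linear amplitude ratio. It assumes the -10 figure for Coldplay is also in LUFS, and the function name `loudness_gap` is just an illustration.

```python
def loudness_gap(lufs_a, lufs_b):
    """Return (gap in LU, linear amplitude ratio) of program A relative to B."""
    gap_lu = lufs_a - lufs_b            # 1 LU corresponds to 1 dB of gain
    ratio = 10.0 ** (gap_lu / 20.0)     # dB to linear amplitude
    return gap_lu, ratio

if __name__ == "__main__":
    # Coldplay (about -10) versus Neil Young (about -20), per the readings above.
    gap, ratio = loudness_gap(-10.0, -20.0)
    print(f"{gap:.1f} LU louder, about {ratio:.2f}x the amplitude")
```

So the Coldplay excerpt would need roughly 10 dB of attenuation to sit at the same measured loudness as the Neil Young track.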
More tunes tomorrow.