Resolution And The Pixel Analogy

24 thoughts on “Resolution And The Pixel Analogy”

Chris Wright

September 1, 2015 at 7:05 pm
Permalink

Your later comments in this piece should be reassuring to those of us with large CD collections Mark. Contrary to the hipster trendiness of vinyl, I’m more in love with my CDs than ever. Now that the bugs have been ironed out of digital, I’m still learning how important the rest of the sound chain is to getting CD sounding at its best – a quest I’ve been on since 1985, one way or another.

The big lesson is that we often blame CD for other weak links. Another is that system synergy is way more important than spending mega bucks too. And when that stuff is right you don’t need to stress over things like cables either.

For those of us who most value legacy recordings from the 70s backwards, I truly be.ieve that, when done right, the CD is pretty much as good as it gets.
Reply
- Admin
  
  September 2, 2015 at 7:22 am
  Permalink
  
  Chris, CDs can sound absolutely great…and actually much better than tape or vinyl LPs in absolute terms of fidelity. But people will choose the recordings they like based on their own sonic preferences.
  Reply
  - Chris Wright
    
    September 3, 2015 at 7:08 pm
    Permalink
    
    Hi Mark, yes of course I agree people will choose on preferences. However I also think a lot of people are tending to believe that CD is a flawed format, which I think we would both agree it absolutely isn’t.
    Reply
    - Admin
      
      September 4, 2015 at 5:53 pm
      Permalink
      
      CDs are not flawed…in fact, they can be fabulous if done with care.
      Reply
Grant

September 1, 2015 at 9:16 pm
Permalink

I think this analogy is quite poor on a number of levels. (sorry, pun)

– increasing audio bit depth is like using bigger pixels in a camera, not more pixels. You can get more signal into each piece of information (byte).

– cameras don’t have a DAC process, so you are literally looking at the bytes, whereas digital audio recreates the analog signal and is fully analog when you experience it. There are no steps or chunks or joints in the user experience of digital audio.

– when you write ‘colors’ you should be writing ‘shades’ or ‘tones’ for the sake of the analogy. Audio has no equivalent of the way colours are compiled and filtered in a digital camera.

– “Moving from 16 to 24-bits in audio allows the system to identify more unique amplitude levels.” This quote is not part of the camera analogy, nor is it untrue, but you are perpetuating an anti-digital myth with this statement. Remember that the analog waveform post-DAC from a 16 bit waveform is an *exact* replica of the input waveform up to half the sampling frequency, plus noise. Therefore, all the ‘more unique amplitude levels’ in the 24-bit process are all lying below the noise floor of the 16-bit process, once you look at the final analog output. That’s an important qualifier to your statement, especially for playback purposes.
Reply
- Admin
  
  September 2, 2015 at 7:27 am
  Permalink
  
  Grant…I’m not with you on this. It’s not bigger pixels that you get with more bits, you get more discrete values for each pixel. As for the DAC, you’re right there isn’t a DAC per se, but there is a monitor that converts the digital information into something that can be viewed…and I see those as similar.
  
  The discrete values can be shades or colors. The point of the article is that using images and photographic resolution doesn’t get us to a better understanding of digital audio resolution.
  
  The increase from 16 to 24-bits does provide more discrete levels to use in the digitization process. The practical result is a lower noise floor…more dynamic range. They don’t lie below the noise floor…they establish it.
  Reply
  - Jay
    
    September 2, 2015 at 4:40 pm
    Permalink
    
    Actually, monitor doesn’t convert, it’s done by Video DAC inside a graphics card.
    Reply
    - Admin
      
      September 2, 2015 at 6:19 pm
      Permalink
      
      OK, then the video monitor is the amplifier and speaker?
      Reply
  - Grant
    
    September 2, 2015 at 6:58 pm
    Permalink
    
    I’m happy to be wrong, but I think you might be ‘with me on this’ soon, Mark, heh heh.
    
    16 pixels of a light sensor are not analogous to a 16-bit byte of audio (and 24 pixels a 24-bit byte of audio), because a pixel is not an on-off state, it is an analog bucket that fills with light. The bucket has an electrical noise level fixed by technology, and the max signal is fixed by the size of the opening on the top of the bucket, i.e. the pixel area, where photons pour in. Therefore, when you increase the pixel area (size), you increase the available SNR, which is analogous to more bits in an audio byte.
    
    I agree that 24 bits provides more discrete levels to use in the *digitization* process, but the point you want to make isn’t about capturing the sound in the studio with 16-bit technology (I mean, who does that anyway?), it is about packaging the media for distribution in 16 bits or 24 bits. The question becomes, what form do the ‘more unique amplitude levels’ take, when we look at the dithered post-DAC analog signal of a 16/96 download vs a 24/96 download? And the answer is that both have the same peak level, both have perfectly smooth analog waves with all the original information below 48 kHz and above -93 dB, but the 24-bit signal has further musical information below -93 dB and all the way down to -141 dB. So that is why I wrote that “all the ‘more unique amplitude levels’ in the 24-bit process are all lying below the noise floor of the 16-bit process”.
    Reply
Sal

September 1, 2015 at 10:27 pm
Permalink

What a shame it’s always been that those who make such decisions serve up the biggest bunch of crap to the most popular selling styles of music. Rock and country has supported the music industry (and high end audio) for the last 50 some years yet has always been treated like redheaded step children when it comes to giving them quality.
Reply
- Sal
  
  September 2, 2015 at 2:49 pm
  Permalink
  
  Interesting how fast things changed in a decade.
  Arron Tippin’s – Read Between The Lines, 1992, RCA Nashville, DR 14 DR Max 15 Pretty Darn Good
  Arron Tippin’s – People Like Us, 2000, Lyric Street Records, DR 8 DR Max 9 SAD
  Reply
Dave Griffin

September 2, 2015 at 3:10 am
Permalink

Mark wrote: “You may have already heard this one. It goes something like this. Imagine a photograph taken with a camera that has about 65,000 pixels because the camera that took the image has “16-bits of resolution”. If you switch to a camera that has 24-bit resolution, then the number of pixels will dramatically increase to almost 17,000,000 pixels. The realism and detail of the new “higher resolution” image will be much better than the previous one, right? In the world of digital photography…yes, it will, if all you’re doing is using 16 or 24 to generate a number of pixels.”

There’s confusion here between resolution and colour depth: resolution (the number of pixels) is independent of colour depth; 65000 pixels at 16-bit will still be 65000 pixels at 24-bit, only the number of colours per pixel will change. This is accurately mentioned later on, but the above paragraph would be confusing to the uninitiated.
Reply
- Admin
  
  September 2, 2015 at 7:28 am
  Permalink
  
  Your point is exactly why I wrote the piece. The person that pitches the pixel analogy confuses the number of pixels with the bit depth behind each pixel.
  Reply
- Jay
  
  September 2, 2015 at 5:15 pm
  Permalink
  
  Video resolution is determined by the frame rate, not the number of pixels, which is just zoom/pan. Video frame rate is exactly the same as audio [up/over]sampling rate!
  Reply
  - Admin
    
    September 2, 2015 at 6:20 pm
    Permalink
    
    I don’t think you’re going to get a lot of agreement on your assertion that video resolution is determined by the frame rate. When the CE companies pitch increased resolution of their new televisions, they brag about moving to 4K not 48 fps.
    Reply
    - Jay
      
      September 3, 2015 at 8:54 am
      Permalink
      
      And that’s the whole misguide as the very difference between HDTV and DVD-Video is not in that 1080 is bigger than 576, but it is due to MPEG4 being qualitatively superior to MPEG2. That’s it.
      Reply
      - Mans
        
        September 4, 2015 at 5:28 am
        Permalink
        
        At a sufficiently high bitrate, MPEG2 and MPEG4 are indistinguishable. Many early BluRay discs are actually encoded in MPEG2 with bitrates up to 40Mbps (the maximum bitrate regardless of codec). DVDs are limited to about 10Mbps including audio (most commercial DVDs use 6-7Mbps for video), so the number of compressed bits per pixel is roughly the same between the two formats. The improved picture quality of such a BluRay is thus entirely a result of having more pixels. Using a more advanced codec like H.264 (aka MPEG4 part 10) raises the picture quality at a given bitrate (or lowers the bitrate for a given quality), which of course also contributes to the improvement over DVD, though at the bitrates used on BluRay (typically 20-30Mbps), the differences are subtle and generally only visible in particularly challenging scenes.
Mans

September 2, 2015 at 2:55 pm
Permalink

There are a number of analogies one can make between audio and imaging.

To begin with, let us compare a monophonic audio recording with a monochrome image. A digital audio recording is made by sampling the continuous analogue waveform at fixed time intervals. Similarly, a digital image is made by sampling the continuous analogue image at fixed space intervals (pixels).

For each sample, the audio recording stores the air pressure level at the corresponding point in time, while the image samples store the light intensity at the corresponding points in space. In both cases, the number of bits per sample determines the accuracy with which we can record the air pressure and light intensity, respectively.

Increasing the audio sample rate allows us to record higher-frequency sounds. The image equivalent is to increase the number of pixels per unit length, allowing us to capture higher spatial frequencies (smaller details). Just like an audio ADC requires an anti-aliasing filter, so does a digital camera (and it has the same name). In front of the image sensor, there is a filter which blurs the image ever so slightly. Without it, the captured image would have spatial frequencies higher than half the pixel interval (Nyquist) show up as low-frequency aliases known as moire patterns.

In audio, we can use a stereo pair to reproduce an illusion of sound emanating from anywhere between the speakers. In imaging, a small number of colours (red, green, blue) provide an illusion of a full spectrum. Here a correspondence can be seen between the terms sound stage and colour gamut.

On the reproduction end, an audio DAC converts a digital signal back to an analogue electrical waveform, which is then turned into pressure waves by a speaker. A display device, similarly, converts the digital image into electrical levels which in turn produce light of the desired intensity (whether directly or by attenuating a backlight). VGA graphics cards actually have a part called a RAMDAC which converts a digital image to analogue electrical signals one pixel at a time.

Also in processing, many algorithms are shared across the domains. To resize an image, one uses a resampling filter very similar in design to audio resampling filters. If the bit depth is constrained, both audio and image processing use dithering and noise shaping to produce the best possible approximation.

This brings us to DSD. This manner of representing audio is in fact a close relative of the halftone printing technique. In the latter, the lightness of an area in the image is determined by the density of tiny ink dots present there. In other words, it is a form of pulse density modulation. A variant of halftone uses differently sized dots, corresponding to pulse width modulation in audio. As a reproduction method, halftone printing enables the printing of pretty good images using cheap equipment. However, nobody in their right mind would suggest such a representation for a primary storage format, let alone try to use it in image editing.

Similarities can further be found in lossy compression methods. Both in audio and imaging, these are based on the Fourier transform (usually in its cosine transform variant). Heavy JPEG compression introduces artefacts, ringing, especially around sharp edges (transients in audio).

This is the beauty of digital signal processing. The algorithms see only a progression of numbers with complete disregard to what those numbers represent in the material world.

Finally, audio/image analogies are not limited to the digital domain. Tape hiss and film grain are in fact very similar in nature, the former the result of magnetic domains in the tape, and the latter arising from crystals of photosensitive chemicals.
Reply
- Admin
  
  September 2, 2015 at 6:18 pm
  Permalink
  
  You make some very good points and thanks for taking the time to lay them out. My primary reason for the post I wrote was to hopefully point out that comparing a fuzzy compressed lo-res fuzzy image to an MP3 files is a bad analogy. And then to say that increasing the number of bits to 24-bits to improve the picture is analogous to improving the “resolution” of an audio recording fails on almost all counts. Bit depth vs. pixel count vs. amplitude levels and audio digitization don’t operate in the same space. I believe there is a better way.
  Reply
  - Jay
    
    September 3, 2015 at 9:31 am
    Permalink
    
    As you can both hear & see, timing transient characteristic dominates either audio & video. The number of pixels per unit length in video is equivalent to audio soundstage. And more colours in video is same as more frequencies in audio. Audio resolution depends on sampling only!
    Reply
- Phil Olenick
  
  September 2, 2015 at 8:30 pm
  Permalink
  
  Mans, as a digital photographer – who started out with film – and digital audio fan who started out with LPs and tape – I see your analogies as more intriguing than anything I’ve run across. Your comparison of SACD encoding to halftone printing is fascinating. And this is also the first time I’ve ever seen the analogy made between the anti-aliasing filter in DSLRs and the low-pass filters needed to comply with the Nyquist limit.
  
  Mindblowing.
  Reply
- Mark U
  
  September 3, 2015 at 11:17 am
  Permalink
  
  Mans – With some of the latest very high resolution cameras, such as the 50.6 MP Canon 5DS R, low-pass filter cancellation is used to tradeoff higher resolution for a low but real risk of moire. With digital audio I doubt that completely bypassing the low-pass filter would ever make sense.
  Reply
  - Admin
    
    September 3, 2015 at 11:43 am
    Permalink
    
    Most definitely not.
    Reply
  - Mans
    
    September 4, 2015 at 4:48 am
    Permalink
    
    Omitting the anti-aliasing filter in a high-resolution digital camera is feasible because the types of patterns that would cause visible problems (finely spaced lines/grids) are rare in nature. Brick walls would be a problem for a low-resolution camera, but at 50MP that is not the case. Aliasing from irregular patterns, such as vegetation, doesn’t tend to be visually very noticeable.
    
    In audio, frequency aliasing is much more disturbing, so the acceptable amount is much lower (to the extent such a comparison is meaningful at all). Now the microphone itself acts as a lowpass filter for some frequency determined by various factors including the mass its moving parts, impedance of analogue signal wires, etc. I don’t know what a typical figure might be, but I doubt it’s more than 1MHz. Supposing we could make an infinitely fast 24-bit flash converter, operating it at a frequency twice as high as the upper limit of the microphone, we would not need any additional analogue filters. In practice, we use oversampling low bit-depth sigma-delta converters, which increases the required physical sampling rate accordingly, so if we could sample at, say, 100MHz, there would be no need for explicit lowpass filters. As we’d obviously still need a digital filter to convert the raw samples into useful PCM data, there is nothing to be gained from going to such extremes.
    Reply

Resolution And The Pixel Analogy

Dr. AIX

24 thoughts on “Resolution And The Pixel Analogy”

Leave a Reply to Grant Cancel reply