A | B Testing At Home

February 24, 2015 Dr. AIX

I have a beautiful studio here at AIX Records. The main room was deliberately designed to be similar to a very nice home theater. The control rooms of most professional studios are smaller than mine. It’s 30 feet long, 24 feet wide, and 12 feet high with a custom diffuser located at the center of the ceiling. I have 5 B&W 801 Matrix III speakers mounted on anchor stands (filled with lead) positioned in a correct ITU 5.1 array. The distance from the listening position to the speakers is about 7 feet. The room sounds amazing.

It would not surprise me to learn that many of you have rooms that are similarly equipped, similar in size, and sound great as well. You may have better cables (although Audience and Cardas were kind enough to provide me with some of their best products when I built the place), more current model speakers, and more “audiophile” brands in your rooms but I’m pretty confident that you don’t have a large format all digital high-resolution console in your space. This is the piece of equipment that allows me to do things that are very difficult to accomplish in a home setup.

The recent discussion of expensive cables and the assertion by some readers that they clearly hear a difference between digital cables (S/PDIF, Ethernet, etc.) got me thinking about just how they are doing their tests. If I tried to do this type of comparison at home using my Oppo player, Yamaha AVR, and B&W FCM-8 speakers, I would have to play a track and then make changes to the setup (replace a cable, etc.) before listening again. In the studio, I can simply hit a button on the console and instantly switch between one stream and another. And those streams could be completely different formats (MP3 vs. CD vs. DVD-A etc.). There is a very expensive format converter in my machine room that makes this possible. I doubt whether too many home theaters or listening rooms have something similar.

So how are people doing comparisons at home? When someone lends you a set of expensive interconnects, how do you evaluate them against your current setup? If your optical disc player has two identical coaxial outputs and you can connect two different quality digital cables to two inputs of your external DAC, then you could switch between the inputs and do a reasonably good comparison. But which optical disc players have two coax outputs? On my console, I can digitally “mult” any digital signal as many times as I need, but at home this is much tougher if not impossible to do.

The same problem applies to the CAT-6 cable I wrote about the other day. Without identical systems from source to speaker that can easily switch from one cable to another without any re-plugging, I don’t think you can actually do a comparison that holds up. If I’m wrong, please explain your approach to me.

The recent “High Res” study that I was involved in that tried to compare 256 kbps MP3 files to high-resolution 96 kHz/24-bit ones had this problem. I was prepared to use my laptop and professional PT DAW to switch randomly between the two streams. As it turned out, I was forced to abandon that method and they simply played the files from another computer one after the other. The participants were expected to have sufficient sonic memory to be able to compare the two. They got to hear a minute of one track and then another minute of the other track. No one would be able to hear the difference and the results that we got bore that out. It’s hard enough to hear any difference when you have a great system setup.

I don’t doubt that many audiophiles believe they “hear” a difference when a new piece of amazing gear is inserted into their systems. But I’m not convinced that they are actually receiving anything different.

32 thoughts on “A | B Testing At Home”

Herb B

February 24, 2015 at 1:53 pm

Thanks for the sanity. I have walked out of Synergistic Research demos when the BS got too deep.
- Grant
  
  February 24, 2015 at 6:23 pm
  
  Waded out, you mean. 🙂
Eldon Doucet

February 24, 2015 at 2:00 pm

My friend and I have done many, many A/B cable comparisons simply by listening, changing the component, and re-listening. I know some say it’s hard to ‘remember’ what you heard 60 sec. ago, but we usually can.
We’ve done this for power cords, interconnects and speaker wires. And the better ones DO make a difference.
My most recent upgrade was from Morrow SP-5 speaker wires to SP-6. When the new ones were broken in a bit, we swapped back and forth (as well as with a lower model SP-4) and we both could hear the difference. We usually listen very closely to female vocals, such as Sarah McLachlan – Angel or Eva Cassidy – Fields of Gold. And these aren’t any high-res files. We can hear the ‘S’ sounds get clearer, less harshness in the voice, etc.
Every little bit helps. Same with a better DAC. My friend has the Wyred 4 Sound DAC-2 and I have the DAC-2 DSD SE. BIG difference, and also in price. So what the manufacturer does in the better DAC (“Enhancing the already stellar performance of our DAC-2 series, the DAC-2 DSDse includes an array of highly upgraded components: Vishay Z-Foil resistors, ultra-low noise discrete regulators, ultra-fast recovery Schottky diodes, premium grade inductors, green OLED display and a Rhodium-plated Furutech fuse. The culmination of these enhancements is refined audio performance down to the minutest nuances”) really makes the music sound better. But you pay double the price.
So from power cords to speaker wires to interconnects, from DACs to tube pre’s and amps, every piece makes a difference. The question is where do you invest and where do you start? (I could write a whole article on this topic, but I know it’s been done before).
To see where I’ve gone, check out my Home Theatre/Music page at : http://www.edoucet.com/hometheatre.htm
- Admin
  
  February 24, 2015 at 2:10 pm
  
  It’s pretty hard for me to imagine that you can do a real test this way….but I appreciate your sharing.
Robert McAdam

February 24, 2015 at 2:01 pm

Possibly the most reliable test is to change a cable or component only after spending some time listening to that system without changing anything. I mean anything from a month to years.

I recently upgraded my interconnects from Nordost Heimdall which I’ve had in place for 5 years to Heimdall MK2 late last year. These are between DAC and power amplifiers. The upgrade was immediately apparent upon first listen, more detail, more music and this fresh out of the box. I didn’t bother to go back to the originals having lived with them for 5 years.

The downside to this action is do I need or want to now upgrade all my power cords to Heimdall 2 from the earlier Vishnu’s. I suspect I will as by all accounts I will hear further upgrades in sound.

I have not tried to compare USB digital cables and still use the supplied Benchmark USB since buying the DAC2. Digital comparisons do seem fraught with difficulties but using my situation above using one cable over a period of time may help when changing to another.
- FB
  
  February 24, 2015 at 2:46 pm
  
  The placebo effect is a tough thing to handle.
Jay

February 24, 2015 at 2:20 pm

Just listened to folk: it was just a 56-times-oversampled, heavily noise-shape-dithered 44.1 kHz 16-bit track.

No “high-definition” equipment at all. The speakers go up to only 20 kHz. Normal volume level: 90° switch position.

At approx. 4 metres off the 10-watts speakers the VOCAL PEAKS WERE INTOLERABLE FOR EARS as during a live event ! ! !

The music itself was very transparent, refined, filling the whole room.
Dave

February 24, 2015 at 3:25 pm

The more I tried, the A/B experience was “different”. Louder, softer, dynamics, what’s the “difference”? Most philes really get caught up in staying the media-projected course. The rest really becomes purchase justification and the psychotherapy that comes with “belonging”. At the end of the day, everything becomes the new “norm”, not something profound. With iTrax files, I feel the organic recording is musically relaxed, not contrived or embellished. It is easy on the ears, not punctuated with enhancements. Yes, it simply flows. How’s that? No techno jargon.
Fazal Majid

February 24, 2015 at 3:56 pm

I wonder how much of the perceived effect of cables is just confirmation bias vs. differences in the impedance of the cables (especially for speaker cables) and failure to level-match. The quality of shielding on unbalanced cables may also make a difference, but should be negligible for typical runs.

The one time I did an A|B test of two DACs (Benchmark DAC1 vs. Anedio D2), I pulled out my Fluke 289 True RMS multimeter to level-match to within 0.1V, and played through Sennheiser HD800 headphones. Even then, it was not a perfect test as the voltage drop of the 2 amplifier sections against the impedance of the headphones would not necessarily be the same as against the much higher impedance of the multimeter.
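To put a voltage tolerance like the 0.1 V mentioned above into the units normally used for level matching, the ratio of the two RMS readings can be converted to decibels. The following is only a minimal sketch: the function name is illustrative, and the 2.0 V and 1.9 V readings are hypothetical, since the comment does not say what the actual output voltages were.

```python
import math


def level_difference_db(v_a: float, v_b: float) -> float:
    """Return the level of A relative to B in dB, given two RMS voltages."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("RMS voltages must be positive")
    return 20.0 * math.log10(v_a / v_b)


# Hypothetical meter readings for the same test tone through two DACs.
reading_a = 2.0  # volts RMS (assumed, not from the comment)
reading_b = 1.9  # volts RMS (assumed, 0.1 V lower)

print(f"A is {level_difference_db(reading_a, reading_b):.2f} dB louder than B")
```

The point of the conversion is that the same 0.1 V gap means a different dB mismatch depending on the absolute level, so a tolerance is easier to reason about when stated in dB.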

More relevant to your use case about HR audio, I once ran a test between the SACD and the Redbook version of the Chandos/LSO/Hickox Vaughan Williams Pastoral Symphony and Norfolk rhapsodies (for some reason I had both the CD CHAN 10001 and SACD CHSA 5002). I used a Marantz SA8260 SACD player and Sennheiser HD800 connected directly to the player’s headphone outs. The A|B procedure was to shuffle the discs with my eyes closed, insert them by touch and play. At first, I couldn’t reliably distinguish them, but after a period of careful listening, and focusing on the timbre of the instruments, I got to about 60%-70% accuracy. Now, as you point out, the mixes may not be the same, but my take-away is the differences are exceedingly subtle. It may also be more audible on vocal recordings or solo instruments.
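Whether a 60%-70% hit rate like the one described above means anything depends heavily on the number of trials, which the comment does not give. A rough sketch of the usual check, assuming each trial is an independent two-way guess with a 50% chance of being right (the function name and the trial counts are hypothetical):

```python
from math import comb


def guessing_p_value(correct: int, trials: int) -> float:
    """Chance of getting at least `correct` of `trials` right by coin-flip guessing."""
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # One-sided binomial tail with p = 0.5: count the favourable outcomes
    # and divide by the total number of equally likely outcomes.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


# Hypothetical sessions with the same 70% hit rate but different lengths.
for correct, trials in [(7, 10), (14, 20), (28, 40)]:
    print(f"{correct}/{trials}: p = {guessing_p_value(correct, trials):.4f}")
```

The same percentage becomes steadily harder to attribute to luck as the number of trials grows, which is why short sessions rarely settle the question either way.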

In any case, none of this matters given the dire state of record mastering and mixing practices. I am fortunate to listen mostly to classical music, a genre that has not been as hopelessly butchered as more popular music by heavy-handed dynamic range compression in the loudness wars. The general public is not so lucky and what we really need is not higher-bitrate recordings but a label that licenses rights to the music and releases more delicately mastered versions, sort of like a Criterion Collection of music. Pricing would have to be higher so the major labels would be willing to sublicense those rights, e.g. Criterion Music would pay the right holders the full retail price of a CD, then charge about 2x that to recover their own costs.
Dean

February 24, 2015 at 4:00 pm

Mark, there is a piece of Windows software called RightMark Audio Analyzer. From the brief try I had, you play a sound file with RightMark Audio Analyzer through your system and it records it; then you make a change to your system and play the file again. You can then compare, I think, 6-8 tests for things like Frequency Response, Noise Level, THD, IMD. There is a free version. It is probably easiest to check out the short User Guide at http://audio.rightmark.org/download.shtml
- Admin
  
  February 24, 2015 at 5:41 pm
  
  Thanks I’ll take a look.
Barry Kirby

February 24, 2015 at 4:22 pm

I have to agree that the quick change from one mode or device to another is the best way to compare. My Anthem Logic D2V has many inputs & outputs, that allow comparisons between digital & analog sources, but cables are another matter. I enjoy XLR over RCA cables, but maybe it’s the gain or volume, I’m interpreting as superior. I find that the difference between digital vs analog hookup comes down to which processor I prefer- The source processor or the D2V processor. I can synchronize my CD player & my DAC in analog & determine which is most enjoyable. Sometimes listening to the same music with a different emphasis is pleasant.
Dave G

February 24, 2015 at 4:24 pm

The problem with A/B testing is that if you know what the components are, you’re still influenced by what is known as expectation bias. So if you have just paid $1000 for a power cord, you’re not going to want to admit to yourself that it doesn’t make any difference in the sound. An AB/X box (such as that available from Audio by Van Alstine for about a grand) allows you to switch between source material, front end, power amp, interconnects etc. It removes any ‘expectation bias’ on the part of either the switcher or the listener. It can tell you whether the difference between those two power cords is really as ‘night and day’ as it was when you knew which cost $1K and which cost $7.99 from Ace Hardware. Unsurprisingly, such scientific methodology is shunned by today’s audiophiles.

For the record, I believe there are differences between components but that they are exaggerated all to hell by expectation-bias-polluted testing. An interesting article discussing the power of expectation bias is ‘What We Hear’ on the nwavguy blog (Google it, it’s a good read).
- Admin
  
  February 24, 2015 at 5:44 pm
  
  Thanks.
Édouard Trépanier

February 24, 2015 at 4:32 pm

I am probably just a « slow listener »!

One minute is not enough for me to assess subtle quality differences. My analytical listening requires first that I feel comfortable and at rest. This takes about 20 minutes of listening to an audio set up. And then, I prefer to listen to a complete piece of music so that I can appreciate the frequency responses (I plot a graph in my head), the dynamics with the quickness of the attacks, the instruments positions and size, the depth of field (2D is not involving), the synchronization between the transducers and finally, the non-descript “feel” I call emotion. This takes at least 4 or 5 minutes.

That is only to suggest that you allow your guinea pigs more than one minute for « the big test ».
- Admin
  
  February 24, 2015 at 5:45 pm
  
  When I do this in my studio, I let people take as much time as they like. I know I react the same way…it can take a whole day of listening to get the right feel.
Sal

February 24, 2015 at 5:10 pm

Boy, you’re really trying to stir the snake oil pot today, Mark. LOL

What I’d really like to see is a nice set of hi-rez photos of your complete studio, a virtual tour so to speak.

No subs to support the bottom of the B&Ws, especially when doing video with an LFE channel?
- Admin
  
  February 24, 2015 at 5:46 pm
  
  Yes, I do have a sub…a TMH Labs profunder. You can see photos of the studio at aixstudios.com
  - Sal
    
    February 25, 2015 at 8:38 am
    
    Thanks, beautiful setup!
    GOD I’d love to hear that JBL system, I’m a horny sort of guy. 🙂
Grant

February 24, 2015 at 7:02 pm

There are well-understood protocols for doing A:B testing. They cannot be ignored. They were developed because any single broken protocol allows for false readings, or reduces acuity.

At-home A:B comparisons almost always break *many* of the protocols. They are completely useless. They are worse than useless: they are highly destructive and misleading, if the goal is to know something true about the device under test.

On the other hand, many in-home A:B comparisons have a more humble goal, namely, to find out what the owner himself (or herself) ‘feels like he enjoys more’, taking into account all the placebo, expectation, and unconscious effects of prior knowledge. This is a perfectly valid test, in fact I use it myself and recommend it. After all, if a certain device gives you a certain feeling that you don’t like, why install it just because the feeling was caused by your mind internally and only falsely attributed to the device? Shades of cutting off one’s nose to spite one’s face, there! And, if you can’t help yourself and you get all happy at the sight or very knowledge of an electron tube in your circuitry, even though you know it isn’t improving things technically, for goodness’ sake get one! Or two dozen!! It’s reality that we have our individual reactions of this sort, and I say it’s common sense to build them into our equipment purchase decisions. Every time we use them at home, it will never be under blind conditions, so our ‘perceptual colourations’ will be in force and should have been taken into account in the purchase decision, not ignored.

It is just a little unfortunate that most audiophiles don’t understand the limits of the above typical in-home test, and mistakenly attribute all their perceptions to the ‘device under test’. They understand so little about what really happened in their demo or ‘open test’, that they become utterly convinced and embark on missionary quests to share this ‘objective knowledge’ they have with all and sundry, when in fact their procedure was completely wrong for such a conclusion.

As an aside, re:cables, there is no reason why cables can’t sound different if they have different R, L and C properties. That’s a common engineering fact. The ‘snake oil’ bit is to claim they sound different when their RLC properties have been matched, or when they are in an application where the signal transform is insensitive to RLC properties.
Ron Chervinski

February 24, 2015 at 8:18 pm

I find that I can remember and compare sounds much better in total darkness.
If I close my eyes, it is different than having them open in a totally dark room.
Warren

February 24, 2015 at 10:57 pm

Hi Mark,
My listening room is 30′ x 19’3″ x 10′. I wish it was 12′ high. Following “Golden Rules” does not always stand up well in a surround room. I would do things differently next time.
I cue up two pieces of equipment and A/B them with my “Audient” 5.1 surround controller. It’s great because it’s instantaneous.
Regards to all,
Warren
Paul Duggan

February 25, 2015 at 3:14 am

No matter how good is the methodology for an A/B test there will always be cracks that prevent you from being able to argue that it’s definitively, objectively conclusive which. IMHO there is very little mileage in spending a lot of effort on A/B tests beyond identifying gross faults or what is your subjective preference.

My feeling is that the only testing that is truly worthwhile is objective measurement combined with our current best understanding of physiology. Audiophiles hate this idea because it doesn’t back up what they hear, so they need to resort to magical-type thinking (easily spotted, but they won’t change their minds; leave them to it). High-end manufacturers hate it too because (certainly when we’re talking about digital technology and media) it tells us that any measurable differences are well beyond levels of human perception.
Kevon Manuel

February 25, 2015 at 6:56 am

Question, Mark: If you were to take one of your Blu-ray/DVD-A recordings, strip out all frequencies below 22 kHz, and play it through a system that is capable of playing signals beyond 60 kHz, what would people actually hear?
Alan

February 25, 2015 at 9:17 am

Dr.AIX,
My comments may lack some pertinence because I never audition cables that are out of my stretched price range, and they are usually analog. In any case I rarely AB cables in the usual sense (rapid switching back and forth). Instead I insert a cable or other component and listen to it for several days to try and get used to its sonic signature. That is, unless it is so different that I hear significant differences right away; even then I do not trust great immediate improvements. So often I have found that improvements that “blow me away” eventually have a negative impact over the long haul. My best experiences occur with equipment that doesn’t sound significantly different at first, but over the longer term doesn’t show any weaknesses, and I begin hearing improvements gradually; perhaps quiet sounds in the background of which I was previously unaware.
In short, except on rare occasion I’ve not been very good at quick comparisons of subtle differences. Even when I hear a difference, I cannot generally tell which is best, at least not right away. I can give you one example, however of one unmistakable difference between digital transfer. I own the PS Audio PWT (transport) and matching PWD (DAC); there are several ways to connect the two digitally. My comparison involves the coaxial connection vs something called I2S which sends the digital stream somewhat differently than through the coax. Nevertheless they arrive at the PWD and go through the conversion process. My point is, they sound distinctly different; I can only describe the differences as the I2S connection sounds less smeared and slightly drier, but more detailed (not always for the better, either).
- Admin
  
  February 25, 2015 at 10:05 am
  
  Thanks Alan…if you’ve got some of Paul’s gear, you’re a serious audiophile. His stuff is not inexpensive and is capable of producing great sound. In comparing two digital streams, I would simply capture the output stream from both formats and compare them bit for bit. If they are the same then they will sound the same…if one is substantially different from the other then a process was applied at some point in the transmission. I wouldn’t want someone messing with my bits through a “straight” digital transfer.
  - Andrea
    
    February 25, 2015 at 7:09 pm
    
    I2S maintains the clock separate from the data, whereas in S/PDIF the clock must be recovered from the data signal. That makes I2S less subject to jitter. Furthermore, the amount of jitter in the clock recovered from an S/PDIF datastream can be influenced by the quality of the cable used. (Many cables, from the very inexpensive to the astronomically priced, are far off from the desired impedance.) Jitter is both audible and measurable. Recording both data streams in the digital domain, and comparing them, however, completely decouples the data from the clock, so you will not be able to measure any difference. In fact, reading the data into a buffer is how some DACs eliminate the effects of jitter.
    - Admin
      
      February 26, 2015 at 10:18 am
      
      Thanks for the explanation. But don’t both methods of digital data transmission travel down a single cable? And don’t the DACs or subsequent devices (of a certain quality) reclock the signals and arrive at the same jitter spec (if the clocks are of equal quality)? Jitter is a problem in digital audio but it’s been reduced in quality hifi to a point that it doesn’t degrade the signals much at all.
      - Andrea
        
        February 27, 2015 at 4:42 am
        
        I2S requires three separate signal lines. If you want to use a single physical cable, that cable must have multiple conductors – e.g. HDMI. It was designed for use over very short distances on a circuit board – not between components. Used between components, it may or may not be better, but at least there are sound engineering reasons for believing it might perform differently than S/PDIF.
        
        It’s true that there are various methods of reducing jitter, but the only way, of which I am aware, to entirely eliminate transport and transmission generated jitter is to store the received data in a buffer and feed it to the DAC from an independent clock. On average, DACs’ jitter performance has improved in recent years, but browsing through John Atkinson’s measurements in Stereophile shows that good jitter reduction still can’t be taken for granted.
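The bit-for-bit comparison discussed in this thread can be sketched in a few lines. This is only an illustration under stated assumptions: both captures are taken as raw bytes that already start at the same sample, the function name is hypothetical, and the byte strings stand in for real captured audio. As noted in the thread, a comparison like this looks at the data only and says nothing about clock timing.

```python
from typing import Optional


def first_mismatch(capture_a: bytes, capture_b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical."""
    if capture_a == capture_b:
        return None
    for offset, (byte_a, byte_b) in enumerate(zip(capture_a, capture_b)):
        if byte_a != byte_b:
            return offset
    # No differing byte in the overlap, so one capture is a prefix of the other.
    return min(len(capture_a), len(capture_b))


# Stand-in data: three captures of the "same" stream, one with a changed byte.
via_coax = bytes([0x00, 0x10, 0x7F, 0x80, 0xFF, 0x01])
via_other_link = bytes([0x00, 0x10, 0x7F, 0x80, 0xFF, 0x01])
altered = bytes([0x00, 0x10, 0x7E, 0x80, 0xFF, 0x01])

print(first_mismatch(via_coax, via_other_link))  # identical captures
print(first_mismatch(via_coax, altered))         # offset of the changed byte
```

In practice the harder part is lining up the two captures so they start on the same sample; a mismatch at offset zero usually points to misalignment rather than to altered data.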
Leo

February 26, 2015 at 7:03 am

Mark, if I recall correctly you once tested a ‘snake oil’ product by switching the phase of the signal, and playing it together with the original signal. If there’s any difference at all, you’d at least see (on a scope) or hear some noise. If there’s no difference the result should be silence.
I thought this was a very elegant way of comparing A and B, avoiding all the pitfalls of psycho-acoustics. Shouldn’t it be possible to test digital cables this way?
- Admin
  
  February 26, 2015 at 10:26 am
  
  I have done this a number of times…most recently with a $3000 power cord and a standard $1.00 power cord. The output from my system was identical…meaning there was no effect…no improvement by investing in the expensive cable. I do have good power here and a proper ground…but then again most homes do as well. I’ve tested S/PDIF cables and USB cables the same way…again no differences.
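The polarity-inversion test described in this exchange is often called a null test, and its arithmetic is simple enough to sketch. This is a minimal illustration, not the procedure used in the studio: it assumes two captures that are already sample-aligned and level-matched, with samples as floats in the range -1.0 to 1.0. All names and the synthetic test tone are hypothetical.

```python
import math


def null_residual(capture_a, capture_b):
    """Sum capture A with a polarity-inverted copy of capture B."""
    if len(capture_a) != len(capture_b):
        raise ValueError("captures must be the same length and sample-aligned")
    return [a + (-b) for a, b in zip(capture_a, capture_b)]


def peak_dbfs(samples):
    """Peak level relative to full scale (1.0); -inf means digital silence."""
    peak = max(abs(s) for s in samples)
    return float("-inf") if peak == 0 else 20.0 * math.log10(peak)


# Synthetic stand-in for a captured signal: one cycle of a sine wave.
reference = [math.sin(2 * math.pi * n / 64) for n in range(64)]
identical = list(reference)
perturbed = list(reference)
perturbed[0] += 1e-4  # a tiny difference introduced on purpose

print(peak_dbfs(null_residual(reference, identical)))  # perfect null
print(peak_dbfs(null_residual(perturbed, reference)))  # residual level
```

A perfect null shows up as silence; anything that differs between the two captures remains in the residual, where its level can be read off or listened to directly.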
Dennis

February 26, 2015 at 2:55 pm

On parts of the signal chain where it is possible, I record at 96/24 into a good ADC. Then you can take the digital files and ABX them in Foobar or similar software. You are reduced to the resolution of your ADC. You of course have to be exceedingly careful about level matching, but should do that anyway even in sighted AB comparisons. Those disagreeing will always say the differences are smaller than your ADC is capable of recording. While true, some mighty small differences won’t escape my ADC. At worst you will have determined the difference is awfully awfully small. Using those same files to subtract one from the other works much more simply until you start seeing differences leftover.
