
How to do a double blind test (DBT) that is not to be doubted (TBD)



Recently I explored how blind tests are done in audio and the results were a bit sad. Basically if the test was done outside of a university or academic setting they might as well have just got fish and chips and sat on the beach cos it would have been just as useful and a whole lot more pleasant.

 

You can read the gory details here 

 

 

If there is one thing I would take out of that thread and put here, it would be this (basically how to do it): https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1116-3-201502-I!!PDF-E.pdf

It doesn't give you all the answers, but it's certainly a well-signposted path.

 

In that thread some were interested in how one might do it at home.  So rather than start the good bit on the 7th page I thought it best to start a new thread. 

 

I don't really want to drive the agenda (or do lots of typing), so have a read of the above and make suggestions or ask questions (but maybe don't tell me how bad sighted listening is, as I am still triggered from the last thread).

 

 

Just a very quick comment (paralleling a point I made in the other thread):-

 

If the audible difference seems to be very marked and very easy to hear, then before embarking on  the design and execution of a full-blown DBT "that is not to be doubted", simply try a brief informal blind test.  

 

If that gives a promising result, then by all means consider a longer, more formal, blind test.


Thank you Mark for taking the time to do this exceptional body of work. I expect few, if any, of those who have repeatedly used flawed DBT testing results to put down and ridicule others' experiences would agree.
 

Perhaps this thread showing what has to be done for a DBT to have any chance of accuracy could be made into a sticky to live in perpetuity at the top of this The Great Audio Debate page so it can be used as a reference for any future DBT debates.

 

cheers,

Terry

Edited by TerryO
2 hours ago, TerryO said:

I expect few, if any, of those who have repeatedly used flawed DBT testing results to put down and ridicule others' experiences would agree.

Precisely ?


4 hours ago, TerryO said:

Thank you Mark for taking the time to do this exceptional body of work. I expect few, if any, of those who have repeatedly used flawed DBT testing results to put down and ridicule others' experiences would agree.
 

Perhaps this thread showing what has to be done for a DBT to have any chance of accuracy could be made into a sticky to live in perpetuity at the top of this The Great Audio Debate page so it can be used as a reference for any future DBT debates.

 

cheers,

Terry

 

 

And that is one of the misinterpretations that I was afraid would come from the other thread. Hopefully this thread will show that meaningful and useful results can indeed come from less rigorous tests using simple logical methods.


When I was doing all my BT tests for months, I was reading a lot on the subject and found Alan Shaw and his Harbeth forum to be a wealth of information on the topic of DBT.

He has been doing them all his time in speaker building, so I would class him as a bit of an expert on the topic.

It opened my mind to why, when I did them, everything sounded the same. This was mainly when the differences were smaller ones.

Now as I try to find the info online, it seems the forum has changed since and the info is no longer there. I did post on here (SNA) about it and cannot find that either.

So from memory, here are some points I remember that seemed important to Alan. Take it for what it is, as I can no longer find the references. Treat it as hearsay, as it's just my recollection.

It was important that the changeover between listening to each be at most 2 seconds, no longer, but the quicker the better. Milliseconds is best. The closer to 2 seconds, the harder it is to hear differences.

State of mind is important: relaxed and rested. Alan mentioned he would walk away from the tests on many days as he was not in the correct mindset.

Fatigue can come on quickly too, and then it can all blur together again.

He said to concentrate on one aspect, like a voice or an instrument, and to make the changes quickly so as not to listen for too long in each segment. He mentioned he never listened to a whole song but kept it very short between swaps.

It was something to do with the mind being very good at filling in gaps, so hearing smaller differences is quite hard to do, even for him in his tests.

All the reading I did at the time led me to the realization that DBTs were not for me, as I was getting nowhere with my flawed tests. I would get mind fatigue very quickly. It was not like how I would normally sit there and listen to music; with these tests, you have to concentrate right in on a single aspect.

 

I'm sure there was a lot more info from Alan, but that was my main takeaway that I recall at the moment.

 

 

 

 


I would prefer to keep this thread very much on topic, and any gripes or squabbles associated with the other thread should be addressed there. I think all perspectives are already well represented above, so please, no more. If you think this thread is ridiculous or a waste of time, please argue your point on the other thread, as that is where the discussion is at. And of course if this topic does not interest you, there are many others for you to spend your time on. I won't hesitate to ask for the removal of any further comments which are not on topic, as it just wastes too much time and effort.


OK, so what do we think of the method sometimes used, where someone posts some audio tracks designed to enable the comparison of something they wish to examine? There is no doubt that this is a blind test, and unless the publisher of the files says something, or names them badly, there is little opportunity to bias the audience.

 

A trivial example might be  different types of interconnect cable, or different DACs, etc etc.  Maybe 3 or more different examples. 

 

You get to play them back and listen, synchronise and switch between them, whatever you like, via whatever equipment you like.   This allows a large group to participate, and would test across varying playback equipment in varying situations.

 

The required response might be to answer some question like "Is there a difference?" or "Which is better?", depending on what the tester is trying to determine.

 

 


My apologies Mark I started it so the blame rests with me. 
 

cheers,

Terry

3 hours ago, aussievintage said:

OK, so what do we think of the method sometimes used, where someone posts some audio tracks designed to enable the comparison of something they wish to examine? There is no doubt that this is a blind test, and unless the publisher of the files says something, or names them badly, there is little opportunity to bias the audience.

 

A trivial example might be  different types of interconnect cable, or different DACs, etc etc.  Maybe 3 or more different examples. 

 

You get to play them back and listen, synchronise and switch between them, whatever you like, via whatever equipment you like.   This allows a large group to participate, and would test across varying playback equipment in varying situations.

 

The required response might be to answer some question like "Is there a difference?" or "Which is better?", depending on what the tester is trying to determine.

 

 

 

Thanks for the comment. More generally, and not specific to the above comment, I fully appreciate any and all responses that are genuinely having a go. I will not be chiding anyone for that, no matter how "wrong" it may seem. I don't have an agenda with this thread and am learning along with you all, so I expect and encourage others to challenge my comments if it's not adding up for you.

I don't consider myself an expert in this and will be referring back to the literature for guidance.

 

Now to the above comments. 

It may be a reasonable way to perform a test, but there may be better ways: the standard talks about not having a long delay between samples, so that might be a good thing to try and build into the samples.

 

Not sure what you mean by trivial, I would not consider passing a blind test which examined one of these aspects as a trivial exercise.

Why 3 or more? Are you saying the differences are so easy to discriminate that we can throw more than 2 in?

 

I don't think Arnold Schwarzenegger started his weightlifting career by bench pressing 500 lbs on his first day, so I wonder if we can build our listening muscles by practicing the tests below over a period of time and see if we improve. Some I find easier than others on my first try.

 

https://www.audiocheck.net/blindtests_index.php

 

Interested in others you might know of.

 

 

 

3 minutes ago, frednork said:

the standard does talk about not having a long delay between samples

 

Yep, that's why I mentioned synching them and switching between them. It's trivial to do in free programs like Audacity.

 

4 minutes ago, frednork said:

Not sure what you mean by trivial, I would not consider passing a blind test which examined one of these aspects as a trivial exercise.

 

No, perhaps I should have said simple or easily defined and set up.    I also meant that it's trivial to understand the situation, just trying to keep it uncomplicated as an example.

 

6 minutes ago, frednork said:

Why 3 or more? Are you saying the differences are so easy to discriminate that we can throw more than 2 in?

 

No,  and two would be fine.   It's just that there are so many competing examples ( e.g. different brands of cables),  I wanted to allow that multi-way testing is still easy to do.   

 

8 minutes ago, frednork said:

I wonder if we can build our listening muscles by practicing the tests below over a period of time and see if we improve.

 

I believe training can help, but whether just repeating tests, maybe making the same mistakes over and over, will help, I have doubts.

23 minutes ago, frednork said:

I would not consider passing a blind test

 

Can I offer a suggestion that we stop using this expression, "passing"? It isn't about passing or failing. It's just a test that produces results which help us understand the situation better. All results are valid within the details of the test, and even if they are not valid because of some inadequacy of the test, the person participating in the test NEVER fails.

Edited by aussievintage

4 minutes ago, aussievintage said:

 

Yep, that's why I mentioned synching them and switching between them. It's trivial to do in free programs like Audacity.

So does that rely on the user to make sure they are synced properly? In the back of my mind I am wondering if we can conduct a panel training and testing exercise without having to go anywhere and have the benefit of our own chosen setup. For this to work, the samples must be indistinguishable, or somehow played without allowing analysis, to prevent people using analysers to do the listening for them. So something as easy as possible to set up would be good.

4 minutes ago, aussievintage said:

No, perhaps I should have said simple or easily defined and set up.    I also meant that it's trivial to understand the situation, just trying to keep it uncomplicated as an example.

Sure understand now

4 minutes ago, aussievintage said:

No,  and two would be fine.   It's just that there are so many competing examples ( e.g. different brands of cables),  I wanted to allow that multi-way testing is still easy to do.   

I think once you have reliably discriminated between 2 samples then you can try more samples if you want

4 minutes ago, aussievintage said:

I believe training can help, but whether just repeating tests, maybe making the same mistakes over and over, will help, I have doubts.

The standard does seem to say that doing tests improves one's ability to discriminate, and this is backed up by Toole's comments. Of course the closer the tests are to the difference you expect the better.

 

6 minutes ago, aussievintage said:

 

Can I offer a suggestion that we stop using this expression, "passing"? It isn't about passing or failing. It's just a test that produces results which help us understand the situation better. All results are valid within the details of the test, and even if they are not valid because of some inadequacy of the test, the person participating in the test NEVER fails.

Sure, what would you offer as an alternative to describe that situation?

12 minutes ago, frednork said:

So does that rely on the user to make sure they are synced properly. In the back of my mind I am wondering if we can conduct a panel training and testing exercise without having to go anywhere and have the benefit of our own chosen setup.

 

I think that's overthinking it.  Just as some participants will be better at hearing things than others,  other participants will differ in how they listen to them.  Doesn't matter.

 

14 minutes ago, frednork said:

For this to work the samples must be indistinguishable

 

Just a data file with a non-identifying name
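A minimal sketch of the "data file with a non-identifying name" idea, assuming the organiser has the sample files on disk. The function name `anonymise_samples`, the `sample_XXXXXX` naming pattern and the seed parameter are made up for illustration; nothing here comes from a tool mentioned in the thread.

```python
import random
import shutil
import string
from pathlib import Path


def anonymise_samples(sources, out_dir, seed=None):
    """Copy each source file to out_dir under a random, non-identifying name.

    Returns a dict mapping each new file name to the original path. The
    organiser keeps this key private until all answers are in.
    """
    rng = random.Random(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Shuffle so the copy order (and file timestamps) says nothing
    # about which original is which.
    order = [Path(s) for s in sources]
    rng.shuffle(order)

    key = {}
    for src in order:
        # Draw random six-letter labels until we get an unused one.
        while True:
            label = "".join(rng.choice(string.ascii_uppercase) for _ in range(6))
            name = f"sample_{label}{src.suffix}"
            if name not in key:
                break
        # copyfile copies contents only, not timestamps or permissions.
        shutil.copyfile(src, out / name)
        key[name] = str(src)
    return key
```

Renaming only hides the file name. File size, embedded tags and the waveform itself can still give a sample away, so this does nothing to stop someone opening the files in an analyser.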

 

14 minutes ago, frednork said:

without allowing analysis to prevent people using analysers to do the listening for them.

 

I thought we didn't believe measurements can hear better than humans :)   Let 'em analyse away

 

15 minutes ago, frednork said:

Of course the closer the tests are to the difference you expect the better.

 

I get worried about talk of expectations.   How can any expectation make a difference if the test is blind.  You have to HEAR the difference to identify something reliably.  If you lie, you will get caught out.

 

17 minutes ago, frednork said:

Sure, what would you offer as an alternative to describe that situation?

 

 

 They are participants and they provide a contribution to the study.  You might say  they are successful if they complete the test?   or that the test was successful if it was completed (no matter what the results).

16 minutes ago, aussievintage said:

I thought we didn't believe measurements can hear better than humans :)   Let 'em analyse away

 

Ummm. No. This does not control for the variable in question (i.e. sound). In allowing a visualisation of the waveform(s) you are allowing test subjects to form judgements based on a variable (the visual waveform) other than the one you are testing. This would introduce huge bias in your results.

 

16 minutes ago, aussievintage said:

I get worried about talk of expectations.   How can any expectation make a difference if the test is blind.  You have to HEAR the difference to identify something reliably.  If you lie, you will get caught out.

 

Most studies start with a declared expectation in their abstract.  This paves the way for the hypothesis to be formed and tested against a null hypothesis.  This flows into the methodology where an examiner might state to a test subject “I’m going to play you 2 passages of music, tell me if you hear a difference”.  This creates an implication of possible difference, and expectation bias.  Generally, in listening tests, training helps remove the expectation bias through repeated exposures to audibly different sounds and sometimes the same sounds.

 

16 minutes ago, aussievintage said:

 They are participants and they provide a contribution to the study.  You might say  they are successful if they complete the test?   or that the test was successful if it was completed (no matter what the results).


“Failure” to discriminate a difference (when a difference actually exists) is reported as a null result.  “Successful” discrimination between two different stimuli is usually referred to as a positive result.  It is not routine to indicate to a test subject anything about their performance.
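To put a number on the difference between a null and a positive result, one common approach (an assumption here, not something the post prescribes) is to ask how likely the score would be from pure guessing. The sketch below computes that one-sided binomial probability for a two-choice test; the function name `p_value_at_least` is illustrative.

```python
from math import comb


def p_value_at_least(correct, trials, chance=0.5):
    """Probability of scoring `correct` or more out of `trials` by guessing.

    `chance` is the per-trial probability of a lucky guess
    (0.5 for a two-alternative test such as "same or different").
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# 12 correct out of 16 trials: about a 3.8% chance of doing that by guessing.
print(round(p_value_at_least(12, 16), 3))  # 0.038
```

A small probability suggests the listener was hearing something. A large one is just a null result for that test, not proof that no difference exists.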

15 hours ago, Stereophilus said:

Ummm. No. This does not control for the variable in question (i.e. sound). In allowing a visualisation of the waveform(s) you are allowing test subjects to form judgements based on a variable (the visual waveform) other than the one you are testing. This would introduce huge bias in your results.

 

Yeah I know.  I just get pissed with people who just HAVE to cheat.  It's not in the spirit of what is trying to be achieved here.  The only option I see is some form of DRM or encryption, and if that's deemed  necessary I give up.

 

15 hours ago, Stereophilus said:

“I’m going to play you 2 passages of music, tell me if you hear a difference”.

 

They might be different, they might not.  Hard to have an expectation when you don't know.

15 hours ago, Stereophilus said:

Most studies start with a declared expectation in their abstract.  

 

I would state the objective of the test, without any expectation.  No need to bias the reader.

15 hours ago, Stereophilus said:

It is not routine to indicate to a test subject anything about their performance.

 

Exactly, and certainly they would never be led to think they have failed.  They should not even have the concept on their minds when participating.


Guest Eggcup the Dafter
On 24/04/2021 at 10:48 AM, rocky500 said:

When I was doing all my BT tests for months, I was reading a lot on the subject and found Alan Shaw and his Harbeth forum to be a wealth of information on the topic of DBT.

He has been doing them all his time in speaker building, so I would class him as a bit of an expert on the topic.

It opened my mind to why, when I did them, everything sounded the same. This was mainly when the differences were smaller ones.

Now as I try to find the info online, it seems the forum has changed since and the info is no longer there. I did post on here (SNA) about it and cannot find that either.

So from memory, here are some points I remember that seemed important to Alan. Take it for what it is, as I can no longer find the references. Treat it as hearsay, as it's just my recollection.

It was important that the changeover between listening to each be at most 2 seconds, no longer, but the quicker the better. Milliseconds is best. The closer to 2 seconds, the harder it is to hear differences.

State of mind is important: relaxed and rested. Alan mentioned he would walk away from the tests on many days as he was not in the correct mindset.

Fatigue can come on quickly too, and then it can all blur together again.

He said to concentrate on one aspect, like a voice or an instrument, and to make the changes quickly so as not to listen for too long in each segment. He mentioned he never listened to a whole song but kept it very short between swaps.

It was something to do with the mind being very good at filling in gaps, so hearing smaller differences is quite hard to do, even for him in his tests.

All the reading I did at the time led me to the realization that DBTs were not for me, as I was getting nowhere with my flawed tests. I would get mind fatigue very quickly. It was not like how I would normally sit there and listen to music; with these tests, you have to concentrate right in on a single aspect.

 

I'm sure there was a lot more info from Alan, but that was my main takeaway that I recall at the moment.

 

 

 

 

Your memory's good. I think we argued a lot in that thread and I certainly learnt from what you had to say. The only thing I would add is that the consensus time for swapping is around 10 seconds maximum - I think Alan Shaw had a shorter time but I don't recall 2 seconds myself. I vaguely remember that the tests he did were also single blind rather than double - for his purposes that was fine, as he knew exactly what he was testing anyway and this wasn't a mass study but a design aid.

 

I've been surprised how many equipment designers admit to using sighted testing for the same purposes - but it doesn't seem to trip at least some of them up.

1 minute ago, Eggcup the Dafter said:

The only thing I would add is that the consensus time for swapping is around 10 seconds maximum - I think Alan Shaw had a shorter time but I don't recall 2 seconds myself.

 

If I am listening for something that's hard to pick out,   the best way for me, is to have two recordings cued some seconds apart, and flick from one to the other - immediately rehearing the same music I just listened to a few seconds before.   I think 2 seconds might be a bit short for this, and 10 seconds, too long.  May depend on the person.


Looks like we lost some content with the hacking incident.

 

As posted before.

If you are wanting to build your listening acuity, these tests might help a bit: https://www.audiocheck.net/blindtests_index.php

 

I will be interested to see if I improve. This set I did the other day was done on cheap earphones. These are my results so far. Interested in what you all get and if you improve. Also interested in other similar tests you may know of.

 

Take up the challenge

  • Find the smallest difference in sound levels you can detect.
    The Level Series:  6dB  3dB  1dB  0.5dB  0.2dB  0.1dB

0.5 dB was tough

This was just sad, couldn't even do 10k. I hope it's the phones

  • Find the smallest difference in pitch (frequency) you can hear.
    The Pitch Series:  50c  20c  10c  5c  2c  1c

2c was tough

  • Find the shortest timing difference you can reliably hear.
    The Timing Series:  1ms  2ms  5ms  10ms  20ms  50ms  100ms

10ms was not easy

  • Find the highest dynamic range offered by your listening environment.
    The Dynamic Range Series:  36dB  42dB  48dB  54dB  60dB  66dB  72dB  78dB

could do 54 but not 60

  • Do you have the absolute hearing ability?
    The Perfect Pitch Blind Test:  C Scale  Chromatic

yes but only with a reference

  • Are your ears sensitive to Absolute Phase?
    The Absolute Polarity Blind Test:  Here

This surprised me as I seemed able, but I was never really sure when answering

They say "Can you hear the difference? Of course... you can't"

it seemed pretty easy to me, interested in your experience
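For a feel of how small the steps in the level and pitch series are, the dB and cent figures can be converted to plain ratios. This is a sketch using the standard definitions (20·log10 for amplitude, 1200 cents per octave); it assumes the site's level steps are amplitude differences, which is not confirmed in the thread, and the function names are illustrative.

```python
def db_to_amplitude_ratio(db):
    """Amplitude (voltage) ratio for a level difference given in dB."""
    return 10 ** (db / 20)


def cents_to_frequency_ratio(cents):
    """Frequency ratio for a pitch difference given in cents."""
    return 2 ** (cents / 1200)


# A 0.5 dB step is roughly a 6% change in amplitude.
print(round(db_to_amplitude_ratio(0.5), 4))  # 1.0593

# A 2 cent step on a 1000 Hz tone moves it by a little over 1 Hz.
print(round(1000 * cents_to_frequency_ratio(2), 2))  # 1001.16
```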

On 25/04/2021 at 10:30 AM, aussievintage said:

 

If I am listening for something that's hard to pick out,   the best way for me, is to have two recordings cued some seconds apart, and flick from one to the other - immediately rehearing the same music I just listened to a few seconds before.   I think 2 seconds might be a bit short for this, and 10 seconds, too long.  May depend on the person.

I did find the Forum I was looking for and Alan Shaw is pretty much suggesting instantaneous switch overs is really the best way to try and hear small changes.

 

Quote

"The proper evaluation of all and every audio component (or recording) really should be done under instantaneous conditions or the outcome is, at best, questionable. It's satisfying turning in at the end of the day assured that you have done the best job possible and can truly justify why. I do understand that to consumers disconnected from the mysterious design validation process this seems a cruel, cold, unemotional way of comparing A with B. And maybe it is. I doubt that one audio designer in ten uses such a method because the outcome is rarely flattering to his hard work."

 

 

3 hours ago, rocky500 said:

I did find the Forum I was looking for and Alan Shaw is pretty much suggesting instantaneous switch overs is really the best way to try and hear small changes.

 

Quote

"The proper evaluation of all and every audio component (or recording) really should be done under instantaneous conditions or the outcome is, at best, questionable. It's satisfying turning in at the end of the day assured that you have done the best job possible and can truly justify why. I do understand that to consumers disconnected from the mysterious design validation process this seems a cruel, cold, unemotional way of comparing A with B. And maybe it is. I doubt that one audio designer in ten uses such a method because the outcome is rarely flattering to his hard work."

Personally I think quick AB switching is useful only for some aspects of playback reproduction.  I think limiting a DBT to only this methodology risks neglecting other important facets of music reproduction.

52 minutes ago, Stereophilus said:

Personally I think quick AB switching is useful only for some aspects of playback reproduction.  I think limiting a DBT to only this methodology risks neglecting other important facets of music reproduction.

 

What are those facets that cannot be heard in an AB test then?


 

I think DBTs are a minefield in themselves, and I really don't trust them. Consider these scenarios:

1. A DBT is being conducted between 2 pieces of equipment with the switch being flicked consistently between them. In this example, the listeners are not hearing identical music from each piece of equipment. Rather, they are hearing it sequentially, at different stages. There is no way that this can be definitive, no matter how blind the audience is. It is a nice trick to downplay differences between equipment. Only by comparing the same passages can an accurate comparison be made. It would be preferable to delay one playback stream by a few seconds and switch between them. If that's not possible, then the same passage must be played separately.

2. Dolby Laboratories used DBT to "prove" that their original DD sounded as good as the higher-bitrate DTS. I had DVDs with both DD and DTS soundtracks, and DTS consistently fared better, particularly when there was more complexity and range in the sound (but many passages were indistinguishable). Dolby simply got a bunch of random people together and played selected tracks in a DBT using DD and DTS to get the result they wanted.

3. Conducting a DBT using random subjects is a good trick if you are trying to prove that there are negligible differences between the items under test. Firstly, not everyone has the same hearing acuity. Secondly, the degree of change that people rate as "different" or "better" varies between individuals. I have been present at demos where some heard significant improvements and others thought the differences were negligible. Who is right? Well, they all are if you think about it, and stacking the numbers in a pretend scientific experiment won't change anything for the actual audiophiles who are making the decisions.

 

Edited by TP1
1 hour ago, Stereophilus said:

Personally I think quick AB switching is useful only for some aspects of playback reproduction.  I think limiting a DBT to only this methodology risks neglecting other important facets of music reproduction.

Quick switching is there to give the highest chance of hearing differences


 

1 hour ago, Stereophilus said:

Personally I think quick AB switching is useful only for some aspects of playback reproduction.  I think limiting a DBT to only this methodology risks neglecting other important facets of music reproduction.

 

Quick switching may or may not be important.

 

I find that for steady mono tone tests through headphones, like the ones in the above link, it may be useful for easy comparison, but I can't help thinking it doesn't work as well when comparing the kinds of subtleties in a stereo recording of music that we are thinking about.

In the end it doesn't matter, as the person driving the test should be striving to determine a listening scenario where the difference is evident and reliably picked. It doesn't really matter if it's quick or otherwise, as long as it is blinded and (on a practical level) achievable by at least some people. One person would do for a start.

4 hours ago, aussievintage said:

What are those facets that cannot be heard in an AB test then?

 

3 hours ago, sir sanders zingmore said:

Quick switching is there to give the highest chance of hearing differences


So, personally I find quick AB switching useful for detecting tonal differences, dynamics and detail.  Quick switching definitely has use for those facets of music reproduction, assuming the brief playback music used can highlight all those things.

 

I do find that I need several minutes of continuous music playback before I can properly judge other facets of music reproduction, such as soundstage, imaging, attack/decay and natural rhythm.  Please be aware, this is just my experience and opinion.  Others may be able to judge some of these facets rapidly with quick AB switching, which is fine.  But, if I can’t, I bet quite a few others can’t either.  And if we are designing a DBT to test a group of random subjects, and some or all can’t hear important facets of music reproduction when there is quick AB switching then the DBT has limited value and limited validity.


Strangely enough, I listen to music for the response in me.

 

The way that I check for "better" is:

Am I enjoying the music more?

Do I want to turn it up or put on more? 

Am I listening to more music overall? 

How often do I give up whatever else I'm doing to just listen?

Do I get carried away?

Do I stay up too late for "just a bit more"?

 

This usually takes a month or so. 2 seconds just won't do.

 

This doesn't preclude a DB trial of course, but it would sure be time consuming.

 

YMMV.

7 minutes ago, GregWormald said:

Strangely enough, I listen to music for the response in me.

 

The way that I check for "better" is:

Am I enjoying the music more?

Do I want to turn it up or put on more? 

Am I listening to more music overall? 

How often do I give up whatever else I'm doing to just listen?

Do I get carried away?

Do I stay up too late for "just a bit more"?

 

This usually takes a month or so. 2 seconds just won't do.

 

This doesn't preclude a DB trial of course, but it would sure be time consuming.

 

YMMV.

"Better" is a difficult blind test as everyone's definition of better needs to be the same. "Different" is more straightforward in a blind test. How long do you think it would take to determine if it is different?

Just now, frednork said:

"Better" is a difficult blind test as everyone's definition of better needs to be the same. "Different" is more straightforward in a blind test. How long do you think it would take to determine if it is different?

Some times it's fast—seconds, sometimes slow—weeks, depending on the magnitude of the difference; but I'm only interested in better or worse for me and I've been fooled by short term preferences before.

Link to post
Share on other sites

P.S. Double blind tests can be unreliable.

 

Years ago I was put on a drug by the doctor. It was approved by all—the government, the doctors, the medical journals—effective and very safe. Except one web site that said "beware". I checked all over and decided to believe the exception.

 

Two years later it was banned all over the world for causing heart attacks. Oops! Lots of double blind trials didn't find the problem.

Link to post
Share on other sites

In theory you could take as long as you like between switching. But if you fail to hear a difference with “slow switching” then you can say the test was not done properly (as the research says that auditory memory is poor and so you should do fast switching). If you fail to hear a difference with fast switching then you can claim that’s not your “natural” way of listening. 



Whaddaya do? Calibrate your own auditory memory first? Test both short term and long term switching and see which one you are better at?

 

It’s a double blind bind. 

 

Link to post
Share on other sites
5 minutes ago, GregWormald said:

P.S. Double blind tests can be unreliable.

 

Years ago I was put on a drug by the doctor. It was approved by all—the government, the doctors, the medical journals—effective and very safe. Except one web site that said "beware". I checked all over and decided to believe the exception.

 

Two years later it was banned all over the world for causing heart attacks. Oops! Lots of double blind trials didn't find the problem.

I imagine that’s a sample size issue and has nothing whatsoever to do with unreliability of double blind tests. 
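
To put a rough number on the sample size point, here is a minimal sketch. The event rate and group sizes are made-up figures for illustration only, not data from any real trial, and it only asks whether an event turns up at all, which is a much weaker question than proving an elevated rate.

```python
def chance_of_seeing_event(rate, n_subjects):
    """Probability that at least one of n independent subjects shows an
    event that occurs at the given per-subject rate."""
    return 1.0 - (1.0 - rate) ** n_subjects

# Hypothetical numbers: a side effect affecting 1 person in 10,000.
small_trial = chance_of_seeing_event(1 / 10000, 1000)
large_population = chance_of_seeing_event(1 / 10000, 1_000_000)

print(f"1,000 subjects:     {small_trial:.3f}")
print(f"1,000,000 subjects: {large_population:.3f}")
```

Under those assumed figures a 1,000-person trial would most likely never see the event, while it is all but certain to show up once a million people are exposed. The blinding is not what fails in that case.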

Link to post
Share on other sites
7 hours ago, sir sanders zingmore said:

Quick switching is there to give the highest chance of hearing differences

 

It's not very useful if you don't listen to the identical passage  in both instances.

Link to post
Share on other sites
2 minutes ago, TP1 said:

 

It's not very useful if you don't listen to the identical passage  in both instances.

 

Well, the research indicates that it is the best way to do it. But you are free to start over and listen to the identical passage if that helps you to pick differences; no-one will stop you.

Link to post
Share on other sites
5 hours ago, sir sanders zingmore said:

Well, the research indicates that it is the best way to do it.

 

 

What research? There is nothing scientific about that.  The comparison means that you are comparing sounds that are 100% different each time you flick the switch.  There are some differences that may seem obvious using that method but it is very much inadequate if the object is to get a true 1:1 comparison. 

 

I have seen this technique used to unfairly convince people there are no differences between two items of equipment. I even did it as an experiment with a couple of people while digitising a vinyl record using DSD 128. At the flick of a switch at the preamp, I could alternate between the original record output and the digitised output. Using this method, it was very hard to pick up any differences, and those listening couldn't do so during that session. However, once the two were played back separately, the differences were obvious.

 

Apart from not being able to compare apples with apples, I think the continuation of the melody and beat of the music (which stays constant when the switch is flicked) can cloud other considerations.

Edited by TP1
Link to post
Share on other sites

I feel that this thread, which I thought started out to discuss the best way to do a DBT at home, is now discussing, yet again (unfortunately) whether DBTs are a valid testing method at all.     

 

Link to post
Share on other sites
1 hour ago, aussievintage said:

I feel that this thread, which I thought started out to discuss the best way to do a DBT at home, is now discussing, yet again (unfortunately) whether DBTs are a valid testing method at all.     

 

Perhaps it would help if there was some concrete discussion on how to set up equipment so that the two versions of the sound being compared are:

 

1. matched for playing level (e.g. with the aid of a 1kHz test tone) 

2. easily and quickly able to be switched.

 

For example, in practice how would you go about the above if trying to compare the performance of two different power amplifiers, driving one set of speakers?

 

Or how, in practice, could you set up a comparison of two DACs?
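
As a starting point for the level-matching step, here is a minimal sketch of the arithmetic only. It assumes you have already captured the same 1 kHz tone through each device as lists of float samples; the two tones below are synthetic stand-ins, and the helper names and the 0.1 dB tolerance are my own illustrative choices, not anything prescribed in this thread.

```python
import math

def rms(samples):
    """Root-mean-square of a sequence of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def level_difference_db(reference, other):
    """Level of `other` relative to `reference` in dB (positive = louder)."""
    return 20.0 * math.log10(rms(other) / rms(reference))

def sine(freq_hz, amplitude, sample_rate=48000, seconds=1.0):
    """Generate a sine test tone as a list of floats."""
    n = int(sample_rate * seconds)
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

# Synthetic stand-ins for a 1 kHz tone captured through device A and device B.
tone_a = sine(1000, 0.50)
tone_b = sine(1000, 0.45)

diff_db = level_difference_db(tone_a, tone_b)
gain_for_b = 10 ** (-diff_db / 20.0)  # linear gain that brings B up to A

TOLERANCE_DB = 0.1  # assumed tolerance, pick your own
matched = abs(diff_db) <= TOLERANCE_DB
print(f"B is {diff_db:+.2f} dB relative to A; apply gain x{gain_for_b:.3f}")
```

With real gear the same sum applies to a voltmeter reading at the speaker terminals or DAC outputs: take the ratio of the two readings, convert to dB, and trim the louder device until the difference is inside whatever tolerance you settle on.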

Link to post
Share on other sites
Guest Eggcup the Dafter
11 hours ago, sir sanders zingmore said:

I imagine that’s a sample size issue and has nothing whatsoever to do with unreliability of double blind tests. 

More often, a sample type issue. You'll find that lots of drug tests skip older people, pregnant and menopausal women, and people with various conditions.

 

Rare responses may be discounted during the test phase as well ("not caused by our shiny new product", or "at the same rate as in the general population" when the study sample isn't itself representative). Then, the drug released to the public can actually be different to the tested one (different filler, coating, produced by a different method so leaving traces of other chemicals).

Link to post
Share on other sites
7 minutes ago, MLXXX said:

Perhaps it would help if there was some concrete discussion on how to set up equipment so that the two versions of the sound being compared are:

 

1. matched for playing level (e.g. with the aid of a 1kHz test tone) 

2. easily and quickly able to be switched.

 

For example, in practice how would you go about the above if trying to compare the performance of two different power amplifiers, driving one set of speakers?

 

Or how, in practice, could you set up a comparison of two DACs?

 

 

I started a suggestion earlier, asking about the method of  supplying the test cases as recorded audio files, so participants can play them back in the way they prefer.   

 

  • Starting them in simultaneous tracks in a free program like Audacity enables fast AB switching.
  • Starting them at staggered times allows AB switching to hear the same portion of track for comparison.
  • Audacity can normalise them easily too (but preferably they would be supplied that way).
  • Listening to them one at a time can also be done, of course.
  • Repeat as desired.
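
If the clips are handed out as files, something still has to decide which device hides behind each unknown clip, and it should not be the listener or anyone in the room with them. A minimal sketch of one way to do that follows. The function names, the seed and the 16-trial length are hypothetical choices of mine, and it only produces and scores the answer key, it does not touch any audio.

```python
import random

def make_abx_schedule(n_trials, seed=None):
    """Return a list like ['A', 'B', 'B', ...] saying which device
    the unknown clip X really is in each trial."""
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]

def score(schedule, answers):
    """Count how many of the listener's answers match the hidden key."""
    return sum(1 for truth, guess in zip(schedule, answers) if truth == guess)

# The person preparing the files keeps this key sealed until the end.
schedule = make_abx_schedule(16, seed=1234)

# Example scoring with a listener who answered 'A' every time.
always_a = ["A"] * len(schedule)
print(f"{score(schedule, always_a)} correct out of {len(schedule)}")
```

The listener can then use fast switching, staggered starts or long one-at-a-time sessions as they prefer. The key only gets opened once all the answers are written down.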
Link to post
Share on other sites
2 hours ago, aussievintage said:

I feel that this thread, which I thought started out to discuss the best way to do a DBT at home, is now discussing, yet again (unfortunately) whether DBTs are a valid testing method at all.     

Not doubting the validity of DBT generally, just doubting the validity of rapid switching of different equipment as a method for detecting audible differences.  In my view, rapid switching as a methodology has only limited validity, and only for certain facets of music reproduction.
 

Now, it’s fine to set up your DBT and test with rapid switching, but the conclusions you draw need to be limited to what is audibly different when you use rapid switching.  For instance, let’s say you set up a DBT of DAC A vs DAC B and compare them using rapid switching.  Your study finds that participants cannot pick between them at a rate statistically better than chance.  Your conclusion must be “there is no statistically significant audible difference between DAC A and DAC B within the limitations of rapid switching”.  You cannot state “there is no audible difference between DAC A and DAC B” without the qualification.
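
For anyone wondering what "statistically" means in practice here, this is a minimal sketch of the usual one-sided binomial calculation, assuming each trial is an independent two-way choice so that pure guessing is right half the time. The 16-trial scores are made-up examples, not results from any test.

```python
from math import comb

def p_value_at_least(correct, trials, p_guess=0.5):
    """One-sided probability of scoring `correct` or more out of `trials`
    by guessing alone."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )

# Hypothetical scores from a 16-trial session.
p_12_of_16 = p_value_at_least(12, 16)  # about 0.038
p_9_of_16 = p_value_at_least(9, 16)    # about 0.40

print(f"12/16: p = {p_12_of_16:.3f}")
print(f" 9/16: p = {p_9_of_16:.3f}")
```

A score like 9 out of 16 is entirely consistent with guessing, but it does not prove the two DACs sound identical. It only says that this test, run this way, did not show a difference, which is the qualification argued for above.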

Link to post
Share on other sites