Illusions of Sound Perception

Stenzel · on Sept 16, 2019

There is no illusion of pitch, and it is a common misconception that the fundamental frequency must be present in a tone. Pitch is the perceived periodicity of a tone, which is roughly the greatest common divisor of the harmonic frequencies. If perceived pitch without a fundamental were an auditory illusion, common pitch detection techniques should fail when the fundamental is not present, but they work quite well in its absence. So either there is no illusion of pitch or algorithms have illusions too.
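To make the claim concrete, here is a minimal sketch of one common technique, autocorrelation pitch detection, run on a tone whose fundamental is absent. The sample rate, the 200 Hz fundamental, the choice of harmonics 2 to 5, and the search range are all illustrative assumptions, not values from the discussion.

```python
import math
from functools import reduce

def autocorr_pitch(signal, sample_rate, f_min, f_max):
    """Return the frequency (Hz) of the lag with the largest autocorrelation."""
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    window = len(signal) - lag_max  # same number of products for every lag
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        score = sum(signal[i] * signal[i + lag] for i in range(window))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

SAMPLE_RATE = 8000         # Hz (assumed)
FUNDAMENTAL = 200          # Hz (assumed); NOT present in the signal
HARMONIC_NUMBERS = [2, 3, 4, 5]

# 0.1 s of a tone containing only 400, 600, 800 and 1000 Hz.
signal = [
    sum(math.sin(2 * math.pi * h * FUNDAMENTAL * n / SAMPLE_RATE)
        for h in HARMONIC_NUMBERS)
    for n in range(800)
]

# Search 120..500 Hz so that only one full period fits in the lag range.
estimate = autocorr_pitch(signal, SAMPLE_RATE, f_min=120, f_max=500)

# The "greatest common divisor of the harmonics" view gives the same answer.
gcd_hz = reduce(math.gcd, [h * FUNDAMENTAL for h in HARMONIC_NUMBERS])

print(estimate, gcd_hz)
```

The detector never sees energy at 200 Hz, yet the strongest autocorrelation lag corresponds to the 200 Hz period, because that is the shortest shift at which all four partials line up again.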

DoctorOetker · on Sept 16, 2019

the answer is: algorithms have illusions too.

consider 2 ideal harmonic notes with a frequency ratio of 3:2, say 3kHz and 2kHz ... The brain / algorithm must decide whether to interpret the collection of frequency peaks at m * 3kHz and n * 2kHz as (occasionally overlapping) harmonics of 2 notes at 2kHz and 3kHz, or as harmonics of a single note at 1kHz (as you say, the GCD of the frequencies).

There is inherent ambiguity between interpreting the sound as 2 notes, each with its own timbre, and interpreting it as 1 note with another timbre...
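A small sketch of that ambiguity, assuming ideal integer harmonics and an arbitrary 12kHz cutoff (both are illustrative choices): every partial of the 2kHz and 3kHz notes is also a harmonic of 1kHz, so the pair looks like a single 1kHz note whose timbre happens to lack a few harmonics.

```python
def harmonics(f0_hz, limit_hz):
    """Ideal harmonic frequencies of f0_hz up to and including limit_hz."""
    return set(range(f0_hz, limit_hz + 1, f0_hz))

LIMIT_HZ = 12000  # arbitrary cutoff, chosen only for illustration

two_notes = harmonics(2000, LIMIT_HZ) | harmonics(3000, LIMIT_HZ)
one_note = harmonics(1000, LIMIT_HZ)

# Every peak of the two-note chord is explained by the single 1 kHz note.
explained = two_notes <= one_note

# Harmonics the 1 kHz note would need to have at zero amplitude.
silent = sorted(one_note - two_notes)

print(explained)  # True
print(silent)     # [1000, 5000, 7000, 11000]
```

The silent harmonics are exactly the multiples of 1kHz that are divisible by neither 2 nor 3, so the "1 note" reading only requires a timbre in which those modes carry no energy.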

One could physically construct 3 bowed strings with modekilling on the 1kHz string, such that these could make perceptually identical sounds whether the 2kHz and 3kHz strings are played simultaneously vs the 1kHz string.

at that point one can not discern from the sound alone, in an ABX test, which is the case: neither a human brain nor any algorithm can. The doubt forces a guess (deterministic or not).

The sound is a projection of properties occurring in reality, and loses information.

Stenzel · on Sept 16, 2019

True, but ambiguity does not imply that one possible interpretation must necessarily be an illusion.

AstralStorm · on Sept 16, 2019

Tones and harmonics get clustered into pitches, e.g. mistuned harmonics as seen in bass guitar or piano still get decoded into pitches via some sort of best match if the mistuning does not exceed certain percentage. And it works even if some harmonics disappear and reappear.

The pitch is higher level than purely perceptual.

DoctorOetker · on Sept 16, 2019

this is correct, and the reason we are tolerant is because of dispersion: even though the different harmonics are present on the same string of the same length, the resonant frequencies don't need to be integer multiples of the fundamental, since waves of different frequency have different propagation speeds on the string.

in the case of bowed strings mode-locking ensures the phases of all the harmonics are reset each cycle (the bow sticks and slips), so that bowed instruments can be played harmonically to parts per billion.

since a lot of sounds are plucked, we must be tolerant of frequency-dependent propagation speeds in regular strings / media
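For a rough sense of how far dispersion pushes partials away from integer multiples, here is a sketch using a commonly quoted stiff-string approximation, f_n = n * f0 * sqrt(1 + B * n^2). That model, the 110 Hz nominal fundamental, and the inharmonicity coefficient B = 4e-4 are assumptions brought in for illustration; none of them come from the thread.

```python
import math

def stiff_string_partials(f0_hz, inharmonicity, count):
    """Partial frequencies under the assumed model f_n = n*f0*sqrt(1 + B*n^2)."""
    return [n * f0_hz * math.sqrt(1 + inharmonicity * n * n)
            for n in range(1, count + 1)]

def cents(ratio):
    """Express a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

F0_HZ = 110.0   # nominal fundamental of the ideal string (assumed)
B = 4e-4        # inharmonicity coefficient (assumed, illustrative)

partials = stiff_string_partials(F0_HZ, B, 8)
deviations = [cents(f / (n * F0_HZ)) for n, f in enumerate(partials, start=1)]

for n, (f, dev) in enumerate(zip(partials, deviations), start=1):
    print(f"partial {n}: {f:8.2f} Hz, {dev:5.1f} cents sharp of {n} x f0")
```

Under this model every partial is sharp of its ideal position and the deviation grows with the partial number, which is the kind of mistuning a pitch estimator has to tolerate for plucked or struck strings.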

DoctorOetker · on Sept 16, 2019

In the case of either the 2 strings being bowed vs the 1 string, there is an actual underlying reality, that can not be deduced from the limited information available in the sound, so any guess risks being an illusion (with probability 50%).

Assuming we agree that "illusion" merely means mismatch between interpretation and reality.

jacquesm · on Sept 16, 2019

Mixing two frequencies leads to 'sum' and 'difference' frequencies.

https://en.wikipedia.org/wiki/Frequency_mixer

It doesn't quite work that way with audio but the effects are close enough.

DoctorOetker · on Sept 16, 2019

yes that is a second type of ambiguity, and it does occur in audio as well:

a higher-frequency sinusoid that is sinusoidally amplitude-modulated at a lower frequency can be indistinguishable from 2 constant-amplitude sinusoids at the sum and difference of the frequencies.

see an article by Plomp and Levelt for the determination of the bandwidth of the auditory frequency bins (or filter bank)
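The equivalence above is just the product-to-sum identity cos(a) * cos(b) = 0.5 * cos(a - b) + 0.5 * cos(a + b). A quick numerical check, assuming the modulation is a pure product (no unmodulated carrier left over) and using illustrative frequencies of 1000 Hz and 100 Hz:

```python
import math

CARRIER_HZ = 1000.0    # higher frequency (assumed)
MODULATOR_HZ = 100.0   # lower, modulating frequency (assumed)
SAMPLE_RATE = 48000    # Hz (assumed)

def modulated(t):
    """Carrier whose amplitude follows a sinusoid at the modulator frequency."""
    return (math.cos(2 * math.pi * MODULATOR_HZ * t)
            * math.cos(2 * math.pi * CARRIER_HZ * t))

def two_tones(t):
    """Two constant-amplitude sinusoids at the difference and sum frequencies."""
    return (0.5 * math.cos(2 * math.pi * (CARRIER_HZ - MODULATOR_HZ) * t)
            + 0.5 * math.cos(2 * math.pi * (CARRIER_HZ + MODULATOR_HZ) * t))

# Compare the two descriptions sample by sample over 0.1 s.
max_error = max(
    abs(modulated(n / SAMPLE_RATE) - two_tones(n / SAMPLE_RATE))
    for n in range(SAMPLE_RATE // 10)
)
print(max_error)  # only floating-point rounding error remains
```

Both functions describe the same waveform, so nothing in the signal itself says whether it is "one beating tone" or "two steady tones" at 900 Hz and 1100 Hz.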

munificent · on Sept 16, 2019

I think the psychoacoustic term for this is "combination tones":

https://en.wikipedia.org/wiki/Combination_tone

whiddershins · on Sept 16, 2019

I think you’re kinda right ... but maybe a better way to say it is something like:

Pitch can’t be termed an illusion, because it is a perception, not a fact of physics. So it is either always an illusion or never an illusion.

Frequency is a fact in the realm of physics. Pitch is a label our mind assigns to things.

funciton · on Sept 16, 2019

The missing fundamental illusion persists when the harmonics are split across different ears, and whether it is perceived or not is highly subjective.

That's why it's widely accepted to be an illusion that arises in the brain's auditory center.

dr_dshiv · on Sept 16, 2019

The Shepard tone illusion, of an ever rising pitch, is used in the movie Dunkirk

https://www.businessinsider.com/dunkirk-music-christopher-no...

rrss · on Sept 16, 2019

And the dark knight films. Apparently Nolan is fond of it.

dawnerd · on Sept 16, 2019

And rightly so. It's incredibly effective at building suspense. Hans Zimmer's incredible scoring also helps.

sp332 · on Sept 16, 2019

Here is a video I enjoyed that explains the basics of how our sense of sound works. It makes it easier to understand why some of the illusions happen. https://vimeo.com/147902575

holy_city · on Sept 16, 2019

Typo in the video (I think, not an anatomist), it's "basilar" membrane, not "vasilar." Awesome video though, I wish my speech processing professor had used that instead of teaching hearing like a filterbank, even if that is how we needed to understand it.

Another weird thing about hearing: the hairs that vibrate aren't just tuned to particular frequencies, they actually vibrate over a range, and the response isn't symmetric (although iirc, part of that is from the fact the hairs are mechanically coupled). That's why low frequency noise masks high frequency noise more than vice versa, which is exploited in lossy codecs (if there's low frequency energy, you don't need high frequency energy that it masks).

sp332 · on Sept 16, 2019

That correction is in the video description, so yeah.

tony · on Sept 16, 2019

For more: https://en.wikipedia.org/wiki/Auditory_scene_analysis

This book Auditory Scene Analysis: The Perceptual Organization of Sound by Albert S. Bregman has more.

More foundational info on how we "fill in" information we see / hear https://en.wikipedia.org/wiki/Gestalt_psychology#Properties

carapace · on Sept 16, 2019