It’s a Saturday night and you are going out. You can hear the bass beat a block before you get to the party. As you walk into the house, you are surrounded by people talking, the clinking of glasses and the ever-present music. Waving at you, the host appears and asks “What would you like to drink?”
It’s a common enough opening, and the interesting part is not the drink of choice, but how you were able to understand the question at all.
Every day, on buses and trains, in crowded restaurants and open-plan offices, most people easily understand other people talking. It hardly seems exciting, and yet it has fascinated scientists for over half a century, ever since Cherry first described “the cocktail party problem” in 1953. The mechanisms underlying this talent are still not fully understood.
Why is it such a mystery?
The ear hears by collecting sound waves and passing them along the ear canal, through a series of small bones, to the inner ear, where the sound waves vibrate the basilar membrane (a tiny organ curled inside the spiral cochlea). The basilar membrane tapers along its length, and different parts of it vibrate depending on the frequency of the sound waves. The narrow end vibrates in response to high frequencies (high-pitched sounds), and the wide end to low frequencies (low-pitched sounds). Hair cells pick up the vibrations and pass the signals along to the brain.
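That place-to-frequency map can even be written down. Here is a minimal sketch, assuming Greenwood’s classic empirical fit for the human cochlea (the formula and its constants are standard values from the literature, not something given in this article):

```python
import numpy as np

def greenwood_frequency(x):
    """Approximate best frequency (Hz) at fractional position x along the
    human basilar membrane, from the apex (wide, low-frequency end, x = 0)
    to the base (narrow, high-frequency end, x = 1).

    Constants are Greenwood's (1990) published fits for the human cochlea.
    """
    A, a, k = 165.4, 2.1, 0.88
    return A * (10 ** (a * x) - k)

for x in (0.0, 0.5, 1.0):
    print(f"x = {x:.1f} -> {greenwood_frequency(x):8.0f} Hz")
# x = 0.0 ->       20 Hz   (wide end: low pitches)
# x = 0.5 ->     1710 Hz
# x = 1.0 ->    20677 Hz   (narrow end: high pitches)
```

The mapping is roughly logarithmic, which is why the membrane can cover the whole audible range, from about 20 Hz to about 20 kHz, along a single short strip.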
Since sounds are made up of many frequencies (the average human voice spans a range from the buzz of a mosquito to the highest note on a piccolo), all the sounds in a room sum to create a single pattern of vibration. Your brain therefore knows only which frequencies are present in a noisy room, not which frequencies belong to the person you are trying to hear.
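To see why the mixture itself is ambiguous, here is a minimal sketch in Python (my illustration, with two “talkers” simplified to pure tones at frequencies I chose arbitrarily). The spectrum of the mixture reveals which frequencies are present, but nothing in it labels which talker produced which peak:

```python
import numpy as np

fs = 16_000                             # sample rate in Hz
t = np.arange(fs) / fs                  # one second of time samples

# Two hypothetical "talkers", simplified to pure tones
voice_a = np.sin(2 * np.pi * 220 * t)   # 220 Hz tone
voice_b = np.sin(2 * np.pi * 300 * t)   # 300 Hz tone

mixture = voice_a + voice_b             # what actually reaches the ear

# The spectrum of the mixture shows the frequencies present...
spectrum = np.abs(np.fft.rfft(mixture))
freqs = np.fft.rfftfreq(len(mixture), d=1 / fs)
print(freqs[spectrum > spectrum.max() / 2])   # ~[220. 300.]

# ...but the mixture carries no label saying which peak came from whom:
# swapping the sources produces an identical signal.
assert np.allclose(voice_a + voice_b, voice_b + voice_a)
```

Untangling that sum back into its sources is exactly the problem your brain solves at the party.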
So how can we isolate just one person?
The tempo of their speech is different from that of the music; the conversation flows unlike the sporadic clink of glasses; they may have a lovely accent or a deep voice. Similarly, hand gestures, facial expressions and lip movements all add to the experience. Each of these speech cues can help you work out which sounds belong to the person you are listening to and which are unwanted intrusions.
Expectation is also a factor. When someone is speaking about the latest football results, you can safely ignore any words and phrases relating to politics.
Having two ears also helps. Speech coming from a person on your left will reach your left ear first and be louder in that ear than in the right. The brain can then use those differences in timing and loudness to separate the frequencies belonging to that person from those belonging to someone standing on your right.
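A rough sketch of the timing cue, under simplified assumptions of mine (a noise-like source, a single interaural delay of about half a millisecond, no echoes or head effects): cross-correlating the two ear signals recovers the delay, which is the kind of cue a brain could map onto a direction.

```python
import numpy as np

fs = 44_100                                  # sample rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(fs // 10)       # 100 ms of noise-like "speech"

# Assume the talker is to the left: the sound reaches the left ear
# ~0.5 ms before the right ear (a plausible interaural time difference).
delay_samples = int(0.0005 * fs)             # ~22 samples
left = np.concatenate([source, np.zeros(delay_samples)])
right = np.concatenate([np.zeros(delay_samples), source])

# Cross-correlate the two ear signals; the offset of the peak is the delay.
corr = np.correlate(left, right, mode="full")
lag = np.argmax(corr) - (len(right) - 1)     # signed lag in samples
print(lag / fs * 1000, "ms")                 # ~ -0.5 ms: the left ear leads
```

The loudness cue could be compared in the same spirit, for instance by contrasting the RMS level of each ear’s signal.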
But all this is done without thought. Can you consciously help?
How often have you missed a sentence because you “weren’t paying attention”? Concentrating on a person makes it easier to follow them. But when you concentrate on understanding a person, are you attending to their accent, their location, the cadence of their speech, or a mix of everything? Experiments are currently trying to work out which factors you can attend to and how that attention makes understanding easier. For example, do you increase your brain’s responses to speech when you are paying attention, or do you simply put more resources into interpreting the responses you subconsciously receive?
Who cares HOW you understand, since it’s so easy?
If you don’t already, you will. As you age, you will slowly lose the ability to listen to a friend in a noisy environment. This is the most common hearing-related complaint from older people and from hearing-aid users.
Until we understand how a person with normal hearing solves the cocktail party problem, we cannot work out what goes wrong as people age, nor can we design better hearing aids or assist those with hearing impairments.
So next time you are offered a drink at a noisy party, spare a thought for your auditory system. Chances are its miracles are being wasted on acquiring lukewarm chardonnay.
This is the topic of my thesis, so to date I have well over 200 references. They would not all fit here, so I shall limit the list to the original paper that started the scientific debate:
Cherry EC (1953) Some experiments on the recognition of speech, with one and with two ears. J. Acoust. Soc. Am. 25: 975–979.