So why does reverberation affect speech intelligibility?

Many factors influence speech intelligibility

We've all experienced the difficulty of understanding someone speaking in a reverberant environment: stairwells, squash courts, gymnasiums, arenas and so on. We thought it might be useful to show you what this loss of intelligibility looks like. The waveform above is the phrase "Many factors influence speech intelligibility." You can see each distinct syllable of the words. The sharp spikes are the consonants, which carry much of the information a listener needs to tell one word from another.

Now let's take a look at that waveform dropped into a room with a fairly short 0.8 second reverberation time.

0.8 second reverb time

You can see the original speech syllables in red with the reverberant sound field in green. Note how the reverberant sound is stretching out between the syllables to fill in the gaps with noise. In this room, the sharp spikes of the consonants are still distinct and have not been masked by the reverberant sound field.
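To put a rough number on this, here is a minimal Python sketch. It assumes an idealized room in which the reverberant level falls at a steady rate, reaching 60 dB down after one reverberation time (the usual RT60 definition); real decays are less tidy. The function name tail_level_db and the 0.2 second pause between syllables are our own illustrative assumptions, not measurements taken from the recording.

```python
def tail_level_db(elapsed_s, rt60_s):
    """Level of a decaying reverberant tail relative to its starting level.

    Assumes an ideal straight-line decay of 60 dB over rt60_s seconds.
    """
    return -60.0 * elapsed_s / rt60_s


# Hypothetical pause between two syllables (an assumption, not measured).
gap_s = 0.2

# The 0.8 second room described above.
level = tail_level_db(gap_s, rt60_s=0.8)
print(f"Tail after {gap_s} s in a 0.8 s room: {level:.1f} dB")
```

In this idealized model the tail has dropped by about 15 dB by the time the next syllable starts, which leaves the consonant spikes room to stand clear. A longer reverberation time gives a smaller drop over the same pause.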

Have a listen: WAV File (158kB) / MP3 File (31kB)

Let's try a room with a longer reverberation time. Here's a 1.3 second room with the same speech sample.

1.3 second reverb time

What you'll notice is that the reverberant sound is now stretching out between the syllables and actually starting to mask some of the sharp spikes of the consonants. That means some of the syllables are being buried or masked by the reverberant "noise". Depending on how far each new syllable is submerged in the reverberant noise, a listener will have varying degrees of difficulty understanding those words. This is a bit like trying to listen to one person while a group of other people talk around you: it gets harder to pick out the sounds you want to hear from all the other conversations. The only difference here is that with the reverberant sound field it is the same conversation repeated hundreds of times, each with a small time offset.
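The "same conversation repeated hundreds of times" idea can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how the waveforms in this article were produced: the function add_reverb, the 200 copies, the random delays and the 1000 Hz sample rate are all hypothetical choices made to keep the example small. Each copy is attenuated so that a copy delayed by the full reverberation time is 60 dB quieter than the original.

```python
import random


def add_reverb(dry, sample_rate, rt60_s, n_copies=200, seed=0):
    """Add n_copies delayed, attenuated copies of dry to itself (toy reverb)."""
    rng = random.Random(seed)
    tail = round(rt60_s * sample_rate)       # tail length in samples
    wet = list(dry) + [0.0] * tail
    for _ in range(n_copies):
        delay = rng.randint(1, tail)          # time offset in samples
        gain = 10.0 ** (-3.0 * delay / tail)  # -60 dB when delay equals RT60
        gain *= rng.choice((-1.0, 1.0))       # random polarity, like a reflection
        for i, x in enumerate(dry):
            wet[i + delay] += gain * x
    return wet


sample_rate = 1000                 # low rate keeps the pure-Python loop fast
dry = [0.0] * 500
dry[0] = 1.0                       # first "consonant" click
dry[250] = 1.0                     # second click, 0.25 s later

wet = add_reverb(dry, sample_rate, rt60_s=1.3)

# Energy in the pause between the two clicks, before and after the room.
gap_energy_dry = sum(x * x for x in dry[1:250])
gap_energy_wet = sum(x * x for x in wet[1:250])
print(gap_energy_dry, gap_energy_wet)
```

The dry signal is silent between the two clicks, while the processed signal is not: the delayed copies of the first click have filled in the gap that the second click has to compete with.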

Have a listen: WAV File (180kB) / MP3 File (35kB)

How bad can it get? Let's try a room with a 2 second reverb time.

2 second reverb time

Note that many new syllables are completely buried in the reverberant field. This would be a challenging room in which to understand what is being said. It is easy to see what has happened to speech intelligibility here: the distinct syllables and speech sounds are swamped by the persistent sound field slowly decaying behind them. This also makes it easier to understand why you need to be closer to the sound source in a reverberant field with a long decay time. The closer you are to a sound source, the louder the original sounds will be, and the better their chance of sticking up above the background noise of the reverberant field.
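The effect of distance can be estimated with a common textbook approximation. The sketch below assumes a diffuse reverberant field whose level is roughly the same everywhere in the room, direct sound that falls off by about 6 dB per doubling of distance, and the Sabine-based rule of thumb for critical distance (the distance at which direct and reverberant levels are equal) in metric units. The 1000 cubic metre room volume and the directivity factor of 2 are hypothetical values chosen for illustration; none of these figures come from the rooms used for the recordings.

```python
import math


def critical_distance_m(volume_m3, rt60_s, directivity_q=1.0):
    """Approximate distance where direct and reverberant levels are equal.

    Rule of thumb for a diffuse field: 0.057 * sqrt(Q * V / RT60), in metres.
    """
    return 0.057 * math.sqrt(directivity_q * volume_m3 / rt60_s)


def direct_to_reverberant_db(distance_m, critical_m):
    """Direct level relative to the reverberant level at a given distance."""
    return -20.0 * math.log10(distance_m / critical_m)


volume_m3 = 1000.0   # hypothetical room volume
talker_q = 2.0       # hypothetical directivity factor for the source

for rt60_s in (0.8, 1.3, 2.0):
    d_c = critical_distance_m(volume_m3, rt60_s, talker_q)
    ratio = direct_to_reverberant_db(4.0, d_c)
    print(f"RT60 {rt60_s} s: critical distance {d_c:.2f} m, "
          f"direct vs reverberant at 4 m: {ratio:.1f} dB")
```

Under these assumptions the critical distance shrinks as the reverberation time grows, so a listener at a fixed distance sits deeper in the reverberant field in the longer-decay rooms and has to move closer to get the same advantage for the direct sound.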

Have a listen: WAV File (251kB) / MP3 File (48kB)

