Forschung Aktuell
Science Service of the TU Berlin
Nr. 1, September 2000


Voice research
The smile that you can hear

Listeners can recognise the emotional state of a speaker. But which acoustic characteristics of speech express fear, happiness or sadness? Scientists at the TU Berlin have now analysed these traits and incorporated them into a computer-generated artificial voice.

When the German SPD was selecting a new leadership in 1995, Walter Sendlmaier could hear that Rudolf Scharping had no chance against Oskar Lafontaine. Something else also caught his ear: the voice of Gerhard Schröder, to whom he gave the greatest chance of becoming the next chancellor. But the scientist wasn't studying the speeches out of political interest; he was concerned with analysing the politicians' voices and the effects they had. The communications researcher at the Technical University (TU) Berlin investigates the way in which basic emotions such as happiness or fear are expressed, and the features of voice and expression that allow listeners to recognise these emotional states. It is by no means obvious which acoustic characteristics allow us to identify the annoyance of Lafontaine, the resignation of Scharping, or the belligerence of Schröder. The Berlin Institute of Communications, Media and Musical Science is one of the few scientific institutions studying emotion in vocal expression.

The experiments involved actors in a sound studio repeating sentences with neutral content in a bored, sad, happy, disgusted, or frightened way. If at least 80% of a group of listeners assigned the intended emotion to a sentence, that sentence was analysed syllable by syllable. "We looked at features such as pitch, volume, fundamental frequency of the voice, and speed of talking, and also the accuracy of articulation, which has received very little attention in the past," commented Prof. Sendlmaier, whose research work is being funded by the DFG (Deutsche Forschungsgemeinschaft).
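The 80% listener-agreement criterion described above can be sketched as a simple filter over listener judgements. This is only an illustration of the selection rule; the function name and data layout are hypothetical, not taken from the project:

```python
from collections import Counter

def passes_agreement(judgements, intended, threshold=0.8):
    """Return True if at least `threshold` of listeners assigned the
    intended emotion to an utterance (hypothetical helper)."""
    if not judgements:
        return False
    counts = Counter(judgements)
    return counts[intended] / len(judgements) >= threshold

# Example: 9 of 10 listeners heard "anger" in an angry utterance,
# so the utterance would be admitted for syllable-by-syllable analysis.
labels = ["anger"] * 9 + ["sadness"]
print(passes_agreement(labels, "anger"))  # True (0.9 >= 0.8)
```

Only utterances passing this filter would then be measured for pitch, volume, fundamental frequency, speaking rate, and articulation accuracy.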

The scientists came up with some surprising results: angry or happy people speak very quickly, and it would be natural to suppose that this involves leaving out or blurring syllables in words. But this is not at all the case where anger is concerned: "We not only speak quicker but also more clearly. In this state we emphasise a great many syllables, which leads to improved articulation", explains the scientist.

In cases of boredom, sadness or fear it is possible to identify the opposite effect. "Although we speak more slowly in such cases, the syllables are articulated less clearly, because we generate the sounds with the lower jaw open at a smaller angle." Our body tenses up, and we can hardly open our mouth for fear. In contrast, when we are happy or angry we swing our arms, our thorax is pushed forward, and we open our mouth wide. As a result the words are articulated much more clearly.

There are also noticeable differences in pitch. In a state of fear the voice can rise by up to an octave, and we speak in a falsetto; at the same time the speech melody becomes more monotonous. The explanation for this is to be found in the activity of the muscles in the larynx, which control the vibrations of the vocal cords. In the state of anger they contract much more abruptly than with other emotions, and more energy goes into the higher overtones. These changes in the tone colour are clearly noticeable to the listener. In the case of sadness, however, the movements of the vocal cords are much gentler, and they often barely touch. The air escaping between them produces vortices, giving the voice a distinctive aspirated sound.
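Since an octave corresponds to a doubling of the fundamental frequency, the pitch rise described above can be illustrated numerically. The frequencies below are illustrative values, not measurements from the study:

```python
from math import log2

def octaves_between(f_low_hz, f_high_hz):
    """Number of octaves between two fundamental frequencies:
    one octave corresponds to a doubling of frequency."""
    return log2(f_high_hz / f_low_hz)

# Illustrative example: a calm voice around 110 Hz rising to 220 Hz
# under fear would span exactly one octave.
print(octaves_between(110.0, 220.0))  # 1.0
```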

In order to test the results, the TU scientists programmed a computer-generated voice with these characteristics. If the test listeners are able to identify the correct emotional state for the artificial voice, "then we have probably found the right indicators", says the acoustic researcher. A sign of the potential of "smiles that you can hear" is the number of enquiries received, particularly from companies in the fields of automatic speech recognition and speech synthesis.
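The evaluation logic of this test can be sketched as a recognition rate compared against chance level. The helper and the response data below are hypothetical, purely to make the reasoning concrete:

```python
def recognition_rate(responses, intended):
    """Fraction of listeners who identified the intended emotion
    in a synthesized utterance (hypothetical evaluation helper)."""
    hits = sum(1 for r in responses if r == intended)
    return hits / len(responses)

# With five emotion categories, chance level is 1/5 = 0.2; a rate well
# above that suggests the acoustic cues really do carry the emotion.
responses = ["fear", "fear", "sadness", "fear", "fear"]
rate = recognition_rate(responses, "fear")
print(rate, rate > 0.2)  # 0.8 True
```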

Database

Contact: Prof. Walter Sendlmaier, TU Berlin, Institute of Communications, Media and Musical Science
Special field: Verbal communication and phonetics
Research project: Phonetic reduction and elaboration during emotional expression (DFG Project Se 462/3-1)
Address: Einsteinufer 17, 10587 Berlin, Tel: +49 30 314-24503, E-mail: sendl@kgw.tu-berlin.de
