Prosody and emotion: Towards the development of an emotional agent. Emotional evaluation of news reports: production and perception experiments

This is a Master's thesis from the University of Gothenburg / Department of Philosophy, Linguistics and Theory of Science

Author: Liina Tumma; [2022-01-20]

Keywords: emotion; speech; prosody;

Abstract: There is a recognised need for more research on emotion recognition from speech, and a clear, well-defined methodology in this area is still lacking. Most studies in the field of emotional speech recognition and classification focus on acted speech as the data source; consequently, methods that capture more natural speech are left aside. This study presents a novel perspective on corpus collection and emotion classification techniques. Emotion as an evaluation device is also examined. The aim of the study is to investigate whether happy, neutral and sad emotions can be evoked by news reports, and to analyse the acoustic predictors that play a crucial role in the prediction of these emotions. The thesis is based on three experiments: i. corpus collection by eliciting sad, happy and neutral emotional speech through news, and subsequent statistical analysis of this data (mixed-effects models); ii. automatic classification of these emotions by training Decision Tree (C5.0) classification models; iii. a perception experiment to verify the findings from the previous experiments. Speech data obtained from 20 native speakers of Swedish is analysed. The participants were asked to summarise and give their personal opinion on 36 news reports about happy and sad events, and to read aloud 12 short neutral Wikipedia descriptions. To investigate emotion as an evaluation device, sad news reports are categorised following the Brandt Line division between Global North (developed countries) and Global South (developing countries). Results indicate that news reports are suitable stimuli for evoking emotional responses in Swedish speakers. The Decision Tree (DT) classifier reached an average accuracy of 70.88% (tested on validation data from 10-fold cross-validation). Final velocity, relative location of the F0 peak, time of the F0 peak and mean intensity are crucial attributes for the classifier.
The perception experiment also showed that Swedish speakers are capable of identifying and classifying these emotions, although the machine learning classifier outperforms human evaluation. The findings do not show any clear difference between Global South and Global North news reports, and therefore no evidence of emotion acting as an evaluation device is found for this division. The findings can contribute to a better understanding of evaluation as a speech device, and the study also explores other possibilities for corpus collection and classification methods, such as using news reports as emotion stimuli and a Decision Tree algorithm for classification. The research results represent a further step towards developing an emotional agent.
