Spoken emotion recognition through optimum-path forest classification using glottal features

Alexander I. Iliev, Michael S. Scordilis, João P. Papa, Alexandre X. Falcão

Research output: Contribution to journal › Article › peer-review

72 Scopus citations


A new method for the recognition of spoken emotions is presented, based on features of the glottal airflow signal. Its effectiveness is tested on the new optimum-path forest (OPF) classifier as well as on six previously established classification methods: the Gaussian mixture model (GMM), support vector machine (SVM), artificial neural network with multilayer perceptron (ANN-MLP), k-nearest-neighbor rule (k-NN), Bayesian classifier (BC), and the C4.5 decision tree. The speech database used in this work was collected in an anechoic environment with ten speakers (5 M and 5 F), each speaking ten sentences in four different emotions: Happy, Angry, Sad, and Neutral. The glottal waveform was extracted from fluent speech via inverse filtering. The investigated features included the glottal symmetry and MFCC vectors of various lengths, both for the glottal signal and for the corresponding speech signal. Experimental results indicate that the best performance is obtained for the glottal-only features, with SVM and OPF generally providing the highest recognition rates, while performance was relatively inferior for GMM and for the combination of glottal and speech features. For this text-dependent, multi-speaker task, the top-performing classifiers achieved perfect recognition rates for the case of 6th-order glottal MFCCs.
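The abstract states only that the glottal waveform was obtained "via inverse filtering" and does not give the procedure. As a rough illustration of the general idea, the following is a minimal sketch that assumes a plain linear-prediction (LPC) inverse filter estimated with the autocorrelation method. It is not the authors' implementation. All names (`synth_voiced`, `levinson_durbin`, `inverse_filter`, `TRUE_A`) and the toy second-order vocal-tract model are hypothetical, and the residual it produces only approximates the excitation rather than the true glottal airflow.

```python
# Sketch only: LPC-based inverse filtering on a synthetic "voiced" signal.
# Assumption: an all-pole vocal-tract model; the paper's actual method is not shown here.

def synth_voiced(n_samples, period, a):
    """Drive the all-pole filter 1/A(z) with an impulse train (toy glottal pulses)."""
    x = []
    for n in range(n_samples):
        y = 1.0 if n % period == 0 else 0.0
        for j in range(1, len(a)):
            if n - j >= 0:
                y -= a[j] * x[n - j]
        x.append(y)
    return x

def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the whole signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the Yule-Walker equations; returns A(z) coefficients with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k                  # updated prediction-error energy
    return a

def inverse_filter(x, a):
    """Apply the FIR filter A(z) to x to remove the estimated vocal-tract resonances."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

# Toy example: a stable second-order resonance excited every 80 samples.
TRUE_A = [1.0, -1.3, 0.8]                   # hypothetical vocal-tract polynomial
speech = synth_voiced(800, 80, TRUE_A)
a_est = levinson_durbin(autocorrelation(speech, 2), 2)
residual = inverse_filter(speech, a_est)    # approximates the impulse-train excitation
```

On this synthetic signal the estimated coefficients should land close to `TRUE_A`, and the residual should be close to the original pulse train. Real speech would additionally call for framing, windowing, pre-emphasis, and a more careful glottal model, none of which are attempted here.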

Original language: English (US)
Pages (from-to): 445-460
Number of pages: 16
Journal: Computer Speech and Language
Issue number: 3
State: Published - Jul 2010
Externally published: Yes


Keywords

  • Emotion recognition
  • Glottal analysis
  • Optimum-path forest
  • Speech analysis

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Theoretical Computer Science

