Emotion recognition in speech using inter-sentence glottal statistics

Alexander I. Iliev, Michael S. Scordilis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

6 Citations (Scopus)

Abstract

This study deals with the recognition of three emotional states in speech: happiness, anger, and sadness. The corpus included speech from six subjects (3M and 3F) speaking ten sentences. Glottal inverse filtering was first performed on the spoken utterances, and parameters for computing the Glottal Symmetry were then extracted to form a final feature matrix. A combined set containing all emotions across the different subjects was formed and used to train a Gaussian Mixture Model (GMM) classifier. The classifier was trained on 80% of the combined utterances for each emotion and tested on the remaining 20%. The results suggest that glottal information can be used to determine the emotion conveyed in speech; recognition performance varied between 48.96% and 82.29%.
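
The abstract outlines a complete pipeline: glottal inverse filtering, per-cycle symmetry measurements pooled into utterance-level statistics, and a GMM classifier trained on an 80/20 split. The paper itself includes no code, so the following Python sketch is only an illustration of that pipeline under stated assumptions, not the authors' implementation: the LPC-based inverse filter, the definition of symmetry as the opening-to-closing phase duration ratio of a glottal cycle, the choice of summary statistics, and the GMM settings (four diagonal-covariance components) are all assumptions, using numpy, scipy, and scikit-learn.

# Hypothetical sketch of the pipeline described in the abstract; NOT the
# authors' code. Assumes: glottal flow is approximated by LPC inverse
# filtering plus integration, "glottal symmetry" is the opening/closing
# phase duration ratio per cycle, and each emotion gets its own GMM with a
# maximum-likelihood decision rule.
import numpy as np
from scipy.signal import lfilter
from sklearn.mixture import GaussianMixture

def lpc(frame, order=12):
    """LPC coefficients via the autocorrelation method (Levinson-Durbin)."""
    r = np.array([np.dot(frame[:len(frame) - i], frame[i:])
                  for i in range(order + 1)])
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0] + 1e-12      # epsilon guards silent frames
    for i in range(1, order + 1):
        k = -(r[i] + np.dot(a[1:i], r[i - 1:0:-1])) / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def glottal_flow(frame, order=12):
    """Approximate the glottal flow: inverse-filter out the vocal tract,
    then integrate the residual (the glottal flow derivative)."""
    a = lpc(frame * np.hamming(len(frame)), order)
    residual = lfilter(a, [1.0], frame)   # A(z) applied as an FIR filter
    return np.cumsum(residual)

def glottal_symmetry(cycle):
    """Opening/closing duration ratio of one glottal cycle (assumed definition)."""
    peak = int(np.argmax(cycle))
    return peak / max(len(cycle) - peak, 1)

def utterance_features(cycles):
    """Inter-sentence glottal statistics: summary statistics of cycle-level
    symmetry over one utterance (the exact feature set is an assumption)."""
    s = np.array([glottal_symmetry(c) for c in cycles])
    return np.array([s.mean(), s.std(), s.min(), s.max()])

def train_and_test(features_by_emotion, n_components=4, seed=0):
    """Fit one GMM per emotion on 80% of its feature vectors and hold out
    the remaining 20%, mirroring the split described in the abstract."""
    rng = np.random.default_rng(seed)
    models, held_out = {}, {}
    for emotion, feats in features_by_emotion.items():
        feats = np.asarray(feats)
        idx = rng.permutation(len(feats))
        cut = int(0.8 * len(feats))
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag", random_state=seed)
        models[emotion] = gmm.fit(feats[idx[:cut]])
        held_out[emotion] = feats[idx[cut:]]
    return models, held_out

def classify(models, x):
    """Assign x to the emotion whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda e: models[e].score(x.reshape(1, -1)))

Usage would follow the paper's protocol: segment each utterance into glottal cycles (pitch-synchronous segmentation is assumed to be available), map each utterance through utterance_features, group the resulting vectors by emotion label, call train_and_test, and score every held-out vector with classify. Per-emotion accuracy computed this way is the analogue of the 48.96%-82.29% range reported above; the paper does not specify the number of mixture components or the covariance type, so those values are placeholders.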

Original language: English
Title of host publication: Proceedings of IWSSIP 2008 - 15th International Conference on Systems, Signals and Image Processing
Pages: 465-468
Number of pages: 4
DOIs: https://doi.org/10.1109/IWSSIP.2008.4604467
ISBN: 9788022728560
State: Published - Oct 6 2008
Event: 15th International Conference on Systems, Signals and Image Processing, IWSSIP 2008 - Bratislava, Slovakia
Duration: Jun 25 2008 - Jun 28 2008

Other

Other: 15th International Conference on Systems, Signals and Image Processing, IWSSIP 2008
Country: Slovakia
City: Bratislava
Period: 6/25/08 - 6/28/08

Keywords

  • Emotion recognition
  • Glottal symmetry
  • Glottal waveform
  • GMM
  • Pattern classification
  • Speech

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Signal Processing
  • Control and Systems Engineering

Cite this

Iliev, A. I., & Scordilis, M. S. (2008). Emotion recognition in speech using inter-sentence glottal statistics. In Proceedings of IWSSIP 2008 - 15th International Conference on Systems, Signals and Image Processing (pp. 465-468). Article 4604467. https://doi.org/10.1109/IWSSIP.2008.4604467
