Using wavelets and Gaussian Mixture Models for audio classification

Ching Hua Chuan, Susan Vasana, Asai Asaithambi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

In this paper, we present an audio classification system that uses wavelets to extract low-level acoustic features. We perform multiple-level decomposition with the Discrete Wavelet Transform to extract acoustic features at different scales and times from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models, fitted with the Expectation-Maximization algorithm, are then used to build models for the sound classes. Three types of audio classification tasks are designed to evaluate the system: speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. Experimental results from 5-fold cross-validation show the promising capability of wavelets for speech and music analysis.
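
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which wavelet family, how many decomposition levels, or which summary statistics the authors used, so the Haar wavelet, the default of three levels, and the per-subband mean absolute value and standard deviation below are all assumptions chosen for illustration. The function names `haar_step` and `wavelet_features` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    starts = range(0, len(signal) - 1, 2)  # a trailing odd sample is dropped
    approx = [(signal[i] + signal[i + 1]) * s for i in starts]
    detail = [(signal[i] - signal[i + 1]) * s for i in starts]
    return approx, detail


def wavelet_features(signal, levels=3):
    """Decompose `signal` over `levels` scales and summarise each subband.

    Assumes len(signal) >= 2 ** levels so that no subband is empty.
    Returns [mean |c|, std c] for each detail band (finest scale first),
    followed by the same two statistics for the final approximation band.
    The choice of statistics is an assumption, not taken from the paper.
    """
    approx = list(signal)
    bands = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)

    features = []
    for band in bands:
        n = len(band)
        mean_abs = sum(abs(c) for c in band) / n
        mean = sum(band) / n
        std = math.sqrt(sum((c - mean) ** 2 for c in band) / n)
        features.extend([mean_abs, std])
    return features


# Toy input: a 64-sample sine frame standing in for one audio frame.
frame = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
vec = wavelet_features(frame, levels=3)
```

Each frame yields a fixed-length vector of 2 × (levels + 1) values regardless of frame length, which is one way to obtain a compact representation. For the modelling step, one Gaussian Mixture Model per sound class could then be fitted to such vectors, with a test vector assigned to the class whose model gives the highest likelihood. Libraries such as PyWavelets (`pywt.wavedec`) and scikit-learn (`sklearn.mixture.GaussianMixture`) provide the decomposition and the EM fitting, though the abstract does not say which tools the authors used.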

Original language: English (US)
Title of host publication: Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012
Pages: 421-426
Number of pages: 6
DOIs: https://doi.org/10.1109/ISM.2012.86
State: Published - Dec 1 2012
Externally published: Yes
Event: 14th IEEE International Symposium on Multimedia, ISM 2012 - Irvine, CA, United States
Duration: Dec 10 2012 - Dec 12 2012

Publication series

Name: Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012

Other

Other: 14th IEEE International Symposium on Multimedia, ISM 2012
Country: United States
City: Irvine, CA
Period: 12/10/12 - 12/12/12

Keywords

  • Audio classification
  • Gaussian Mixture Models
  • Wavelets

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Chuan, C. H., Vasana, S., & Asaithambi, A. (2012). Using wavelets and Gaussian Mixture Models for audio classification. In Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012 (pp. 421-426). [6424700] (Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012). https://doi.org/10.1109/ISM.2012.86