Using wavelets and Gaussian Mixture Models for audio classification

Ching-Hua Chuan, Susan Vasana, Asai Asaithambi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

In this paper, we present an audio classification system that uses wavelets to extract low-level acoustic features. We perform multi-level decomposition with the Discrete Wavelet Transform to extract acoustic features at different scales and times from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models, trained with the Expectation-Maximization algorithm, are then used to build models for the sound classes. Three audio classification tasks are designed to evaluate the system: speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. Evaluated with 5-fold cross-validation, the experimental results show that wavelets are promising for speech and music analysis.
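The feature-extraction step described in the abstract can be illustrated with a minimal sketch. The abstract does not name the wavelet or the statistics used to build the compact vector, so the Haar wavelet and the per-subband summary (mean absolute value and standard deviation) below are assumptions chosen for simplicity, and the function names are hypothetical rather than taken from the paper.

```python
import math

def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients.

    A trailing unpaired sample (odd-length input) is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_decompose(signal, levels):
    """Multi-level decomposition: repeatedly split the approximation band.

    Returns [detail_1, ..., detail_L, approx_L], finest scale first.
    """
    bands = []
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_dwt_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands

def subband_features(signal, levels=3):
    """Compact vector: (mean |c|, std of c) per subband. Assumed summary, not the paper's."""
    features = []
    for band in wavelet_decompose(signal, levels):
        n = len(band)  # assumes a non-empty input signal
        mean_abs = sum(abs(c) for c in band) / n
        mean = sum(band) / n
        var = sum((c - mean) ** 2 for c in band) / n
        features.extend([mean_abs, math.sqrt(var)])
    return features

# A synthetic 64-sample frame stands in for a real audio frame.
frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
vec = subband_features(frame, levels=3)
print(len(vec))  # 3 detail bands + 1 approximation band, 2 values each -> 8
```

In the system the abstract describes, vectors like `vec` would be the input to one Gaussian Mixture Model per sound class, each fit with Expectation-Maximization, and a recording would be assigned to the class whose model scores it highest. That stage is not shown here.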

Original language: English (US)
Title of host publication: Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012
Pages: 421-426
Number of pages: 6
DOIs: https://doi.org/10.1109/ISM.2012.86
State: Published - Dec 1 2012
Externally published: Yes
Event: 14th IEEE International Symposium on Multimedia, ISM 2012 - Irvine, CA, United States
Duration: Dec 10 2012 to Dec 12 2012

Fingerprint

Acoustics
Audio recordings
Discrete wavelet transforms
Acoustic waves
Decomposition

Keywords

  • Audio classification
  • Gaussian Mixture Models
  • Wavelets

ASJC Scopus subject areas

  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Software

Cite this

Chuan, C-H., Vasana, S., & Asaithambi, A. (2012). Using wavelets and Gaussian Mixture Models for audio classification. In Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012 (pp. 421-426). [6424700] https://doi.org/10.1109/ISM.2012.86

@inproceedings{eccfa07b9b8b4b06bb5002289b916865,
title = "Using wavelets and Gaussian Mixture Models for audio classification",
keywords = "Audio classification, Gaussian Mixture Models, Wavelets",
author = "Ching-Hua Chuan and Susan Vasana and Asai Asaithambi",
year = "2012",
month = "12",
day = "1",
doi = "10.1109/ISM.2012.86",
language = "English (US)",
isbn = "9780769548753",
pages = "421--426",
booktitle = "Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012",

}

TY - GEN

T1 - Using wavelets and Gaussian Mixture Models for audio classification

AU - Chuan, Ching-Hua

AU - Vasana, Susan

AU - Asaithambi, Asai

PY - 2012/12/1

Y1 - 2012/12/1

AB - In this paper, we present an audio classification system using wavelets for extracting low-level acoustic features. We perform multiple-level decomposition using Discrete Wavelet Transform to extract acoustic features at different scales and time from audio recordings. The extracted features are then translated into a compact vector representation. Gaussian Mixture Models with Expectation Maximization algorithm are then used to build models for sound classes. Specifically, three types of audio classification tasks are designed to evaluate the system, including speech/music classification, male/female speech classification, and music genre (classical, pop, jazz, and electronic) classification. By evaluating the system through 5-fold cross validation, the experimental result shows the promising capability of wavelets for speech and music analyses.

KW - Audio classification

KW - Gaussian Mixture Models

KW - Wavelets

UR - http://www.scopus.com/inward/record.url?scp=84874231378&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84874231378&partnerID=8YFLogxK

U2 - 10.1109/ISM.2012.86

DO - 10.1109/ISM.2012.86

M3 - Conference contribution

SN - 9780769548753

SP - 421

EP - 426

BT - Proceedings - 2012 IEEE International Symposium on Multimedia, ISM 2012

ER -