Content-based music similarity search and emotion detection

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

56 Citations (Scopus)

Abstract

This paper investigates the use of acoustic features for music information retrieval. Two specific problems are studied: similarity search (finding music sound files similar to a given music sound file) and emotion detection (detecting emotion in music sounds). The Daubechies Wavelet Coefficient Histograms (proposed by Li, Ogihara, and Li), which consist of moments of the coefficients obtained by applying the Db8 wavelet filter, are combined with the timbral features extracted using the MARSYAS system of Tzanetakis and Cook to generate compact music features. For similarity search, the distance between two sound files is defined as the Euclidean distance between their normalized representations; based on this distance measure, the sound files closest to an input sound file are retrieved. Experiments on jazz vocal and classical sound files achieve a very high level of accuracy. Emotion detection is cast as a multiclass classification problem, decomposed into multiple binary classification problems, and solved with Support Vector Machines trained on the extracted features. Our experiments on emotion detection achieved reasonably accurate performance and provided some insights for future work.
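
As a concrete illustration of the pipeline described in the abstract, the Python fragment below computes wavelet-moment features in the spirit of the Daubechies Wavelet Coefficient Histograms, ranks sound files by Euclidean distance over normalized feature vectors, and decomposes emotion detection into binary SVMs via a one-vs-rest wrapper. This is a minimal sketch, not the authors' implementation: the function names (dwch_like_features, nearest_files), the choice of per-subband statistics, the PyWavelets/scikit-learn stack, and the toy data and emotion labels are all assumptions made here for illustration; the MARSYAS timbral features used in the paper are omitted.

import numpy as np
import pywt                                # PyWavelets, for the Db8 decomposition
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier


def dwch_like_features(signal, levels=7):
    # Db8 wavelet decomposition; keep a few statistics per subband as a
    # stand-in for the histogram moments used by the DWCH features.
    coeffs = pywt.wavedec(signal, "db8", level=levels)
    feats = []
    for band in coeffs:
        feats += [band.mean(), band.std(),
                  np.mean(np.abs(band)), np.mean(band ** 2)]
    return np.asarray(feats)


def nearest_files(query, library, k=5):
    # Indices of the k library vectors closest to the query under
    # Euclidean distance; vectors are assumed to be normalized already.
    dists = np.linalg.norm(library - query, axis=1)
    return np.argsort(dists)[:k]


# Toy usage on random "audio" just to show the shapes involved.
rng = np.random.default_rng(0)
clips = [rng.standard_normal(2 ** 14) for _ in range(20)]   # fake sound files
X = np.stack([dwch_like_features(c) for c in clips])

scaler = StandardScaler().fit(X)           # normalization before the distance
Xn = scaler.transform(X)
print("closest to clip 0:", nearest_files(Xn[0], Xn, k=3))

# Emotion detection: the multiclass problem is decomposed into binary
# SVMs (one per class) and trained on the same feature vectors.
y = rng.integers(0, 4, size=len(clips))    # four hypothetical emotion classes
clf = OneVsRestClassifier(SVC(kernel="rbf")).fit(Xn, y)
print("predicted emotion for clip 0:", clf.predict(Xn[:1])[0])

Any additional descriptors, such as the MARSYAS timbral features, would simply be concatenated onto each file's feature vector before normalization.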

Original language: English (US)
Title of host publication: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 5
State: Published - 2004
Externally published: Yes
Event: Proceedings - IEEE International Conference on Acoustics, Speech, and Signal Processing - Montreal, Que., Canada
Duration: May 17, 2004 - May 21, 2004


Fingerprint

emotions, music, acoustics, information retrieval, coefficients, histograms, moments, filters, support vector machines

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

Li, T., & Ogihara, M. (2004). Content-based music similarity search and emotion detection. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (Vol. 5).
