Polyphonic audio key finding using the spiral array CEG algorithm

Ching-Hua Chuan, Elaine Chew

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

30 Citations (Scopus)

Abstract

Key finding is an integral step in content-based music indexing and retrieval. In this paper, we present an O(n) real-time algorithm for determining key from polyphonic audio. We use the standard Fast Fourier Transform with a local maximum detection scheme to extract pitches and pitch strengths from polyphonic audio. Next, we use Chew's Spiral Array Center of Effect Generator (CEG) algorithm to determine the key from pitch strength information. We test the proposed system using Mozart's symphonies. The test data is audio generated from MIDI sources. The algorithm achieves a maximum correct key recognition rate of 96% within the first fifteen seconds, and exceeds 90% within the first three seconds. Starting from the extracted pitch strength information, we compare the CEG algorithm's performance to the classic Krumhansl-Schmuckler (K-S) probe tone profile method and Temperley's modified version of the K-S method. Correct key recognition rates for the K-S and modified K-S methods remain under 50% in the first three seconds, with maximum values of 80% and 87% respectively within the first fifteen seconds for the same test set. The CEG method consistently scores higher throughout the fifteen-second selections.
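To make the K-S baseline mentioned in the abstract concrete, here is a minimal sketch of probe tone profile matching. It is not the authors' implementation: it assumes the extracted pitch strengths have already been folded into 12 pitch-class bins (index 0 = C), uses the commonly quoted Krumhansl-Kessler profile values, and the names `pearson` and `estimate_key` are hypothetical. The Spiral Array CEG method and Temperley's modified profiles are not shown.

```python
from math import sqrt

# Krumhansl-Kessler probe tone profiles as commonly quoted; index 0 is the tonic.
MAJOR_PROFILE = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR_PROFILE = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
PITCH_NAMES = ["C", "C#", "D", "D#", "E", "F",
               "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def estimate_key(strengths):
    """Return the best-correlated key name for 12 pitch-class strengths (index 0 = C).

    Assumption: the input is one accumulated strength per pitch class; how the
    paper aggregates FFT peaks into such values is not specified in the abstract.
    """
    strengths = list(strengths)
    if len(strengths) != 12:
        raise ValueError("expected 12 pitch-class strengths")
    best_score, best_key = None, None
    for tonic in range(12):
        # Rotate so that the candidate tonic sits at index 0, like the profiles.
        rotated = strengths[tonic:] + strengths[:tonic]
        for mode, profile in (("major", MAJOR_PROFILE), ("minor", MINOR_PROFILE)):
            score = pearson(rotated, profile)
            if best_score is None or score > best_score:
                best_score, best_key = score, f"{PITCH_NAMES[tonic]} {mode}"
    return best_key
```

Calling `estimate_key` on a 12-element list of accumulated pitch-class strengths returns a label such as `"G major"`. Correlation is used rather than a raw dot product so that the overall loudness of the excerpt does not affect the ranking of candidate keys.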

Original language: English (US)
Title of host publication: IEEE International Conference on Multimedia and Expo, ICME 2005
Pages: 21-24
Number of pages: 4
Volume: 2005
DOI: https://doi.org/10.1109/ICME.2005.1521350
State: Published - Dec 1 2005
Externally published: Yes
Event: IEEE International Conference on Multimedia and Expo, ICME 2005 - Amsterdam, Netherlands
Duration: Jul 6 2005 to Jul 8 2005

Other

Other: IEEE International Conference on Multimedia and Expo, ICME 2005
Country: Netherlands
City: Amsterdam
Period: 7/6/05 to 7/8/05

Fingerprint

Fast Fourier transforms

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Chuan, C-H., & Chew, E. (2005). Polyphonic audio key finding using the spiral array CEG algorithm. In IEEE International Conference on Multimedia and Expo, ICME 2005 (Vol. 2005, pp. 21-24). [1521350] https://doi.org/10.1109/ICME.2005.1521350

@inproceedings{b55e158b20cd42fab309f59a5e6b5a86,
title = "Polyphonic audio key finding using the spiral array CEG algorithm",
author = "Ching-Hua Chuan and Elaine Chew",
year = "2005",
month = "12",
day = "1",
doi = "10.1109/ICME.2005.1521350",
language = "English (US)",
isbn = "0780393325",
volume = "2005",
pages = "21--24",
booktitle = "IEEE International Conference on Multimedia and Expo, ICME 2005",
}

TY - GEN

T1 - Polyphonic audio key finding using the spiral array CEG algorithm

AU - Chuan, Ching-Hua

AU - Chew, Elaine

PY - 2005/12/1

Y1 - 2005/12/1

UR - http://www.scopus.com/inward/record.url?scp=33750539514&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33750539514&partnerID=8YFLogxK

U2 - 10.1109/ICME.2005.1521350

DO - 10.1109/ICME.2005.1521350

M3 - Conference contribution

SN - 0780393325

SN - 9780780393325

VL - 2005

SP - 21

EP - 24

BT - IEEE International Conference on Multimedia and Expo, ICME 2005

ER -