Analysis, enhancement and evaluation of five pitch determination techniques

Peter Veprek, Michael S. Scordilis

Research output: Contribution to journal (Article)

49 Scopus citations

Abstract

Classifying speech into voiced and unvoiced (or silent) portions is important in many speech processing applications. In addition, segmenting voiced speech into individual pitch epochs is necessary in several high-quality speech synthesis and coding techniques. This paper introduces criteria for measuring the performance of automatic procedures that perform this task against manually segmented and labeled data. First, five basic pitch determination algorithms (PDAs), namely SIFT, comb filter energy maximization, spectrum decimation/accumulation, optimal temporal similarity, and dyadic wavelet transform, are evaluated and their performance is analyzed. A set of enhancements is then developed and applied to the basic algorithms, which yields superior performance by virtually eliminating multiple and sub-multiple pitch assignment errors and reducing all other errors. Evaluation shows that the enhancements improved the performance of all five PDAs, with the improvement ranging from 3.5% for the comb filter energy maximization method to 8.3% for the dyadic wavelet transform method.

Original language: English (US)
Pages (from-to): 249-270
Number of pages: 22
Journal: Speech Communication
Volume: 37
Issue number: 3-4
State: Published - Jul 1 2002

Keywords

  • Pitch determination
  • Speech analysis
  • Speech segmentation

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Experimental and Cognitive Psychology
  • Linguistics and Language
