All-pole modeling of speech based on the minimum variance distortionless response spectrum

Manohar Murthi, Bhaskar D. Rao

Research output: Contribution to journal › Article

82 Citations (Scopus)

Abstract

In this paper, we present all-pole models based upon the minimum variance distortionless response (MVDR) spectrum for spectral modeling of speech. The MVDR method, which is popular in array processing, provides all-pole spectra that are robust for modeling both voiced and unvoiced speech. Although linear prediction (LP) is a popular method for obtaining all-pole model parameters, LP spectral envelopes overestimate and overemphasize the spectral powers of medium- and high-pitch voiced speech, thereby featuring unwanted sharp contours, and do not improve in spectral envelope modeling performance as the filter order is increased. In contrast, the MVDR all-pole spectrum, which can be easily obtained from the LP coefficients, features improved spectral envelope modeling as the filter order is increased. In particular, the high-order MVDR spectrum models voiced speech spectra very well, especially at the perceptually important harmonics, and features a smoothly contoured envelope. Furthermore, the MVDR spectrum can be based upon either conventional time-domain correlation estimates or spectral samples; modeling the latter is a common task in frequency-domain speech coding. Moreover, an MVDR spectrum of sufficient order provides an all-pole envelope that models a set of spectral samples exactly. Finally, the MVDR all-pole spectrum is also suitable for modeling unvoiced speech spectra.
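
The abstract states that the MVDR all-pole spectrum can be obtained from the LP coefficients. The following is a minimal sketch of one way to do that, not necessarily the paper's exact procedure. It assumes the widely used closed-form relation in which the order-M MVDR denominator coefficients are weighted correlations of the LP coefficients, scaled by the prediction error power, with the LP error filter written as A(z) = 1 + a_1 z^-1 + ... + a_M z^-M. All function names and the toy autocorrelation sequence are hypothetical, chosen for illustration only. The direct definition 1 / (v^H R^-1 v) is included as a cross-check.

```python
import numpy as np


def levinson_durbin(r, order):
    """Solve the LP normal equations for autocorrelation r[0..order].

    Returns (a, err): error-filter coefficients with a[0] == 1 and the
    final prediction error power.
    """
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        k = -np.dot(a[:m], r[m:0:-1]) / err
        # Order update: a_new[i] = a[i] + k * a[m - i].
        a[:m + 1] = a[:m + 1] + k * a[:m + 1][::-1]
        err *= 1.0 - k * k
    return a, err


def mvdr_coefficients(a, err):
    """Assumed relation: mu[k] = (1/err) * sum_i (M+1-k-2i) a[i] a[i+k]."""
    M = len(a) - 1
    mu = np.empty(M + 1)
    for k in range(M + 1):
        i = np.arange(M - k + 1)
        mu[k] = np.sum((M + 1 - k - 2 * i) * a[i] * a[i + k]) / err
    return mu


def mvdr_spectrum(mu, omega):
    """Evaluate 1 / (mu[0] + 2 * sum_k mu[k] cos(k * omega))."""
    k = np.arange(1, len(mu))
    denom = mu[0] + 2.0 * np.cos(np.outer(omega, k)) @ mu[1:]
    return 1.0 / denom


def mvdr_direct(r, order, omega):
    """Reference: 1 / (v^H R^-1 v) with R the (order+1) Toeplitz matrix."""
    idx = np.arange(order + 1)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    R_inv = np.linalg.inv(R)
    V = np.exp(1j * np.outer(idx, omega))  # one steering vector per column
    quad = np.einsum("in,ij,jn->n", V.conj(), R_inv, V).real
    return 1.0 / quad


if __name__ == "__main__":
    # Toy "voiced" autocorrelation: two harmonics plus a white-noise floor.
    order = 10
    lags = np.arange(order + 1)
    r = np.cos(0.2 * np.pi * lags) + 0.5 * np.cos(0.4 * np.pi * lags)
    r[0] += 0.1
    a, err = levinson_durbin(r, order)
    mu = mvdr_coefficients(a, err)
    omega = np.linspace(0.0, np.pi, 8)
    print(mvdr_spectrum(mu, omega))
    print(mvdr_direct(r, order, omega))
```

The fast route avoids inverting the correlation matrix and reuses quantities that an LP analysis already produces, which is the practical sense in which the MVDR envelope comes "easily" from the LP coefficients.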

Original language: English
Pages (from-to): 221-239
Number of pages: 19
Journal: IEEE Transactions on Speech and Audio Processing
Volume: 8
Issue number: 3
DOI: 10.1109/89.841206
State: Published - Dec 1 2000
Externally published: Yes

Fingerprint

Poles
envelopes
linear prediction
Speech coding
Array processing
filters
coding
harmonics
coefficients
estimates

Keywords

  • All-pole modeling
  • Capon spectrum
  • Linear prediction
  • Minimum variance distortionless response
  • Spectral envelope
  • Speech spectral estimation
  • Speech spectral modeling
  • Speech spectrum
  • Voiced speech modeling

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Acoustics and Ultrasonics

Cite this

All-pole modeling of speech based on the minimum variance distortionless response spectrum. / Murthi, Manohar; Rao, Bhaskar D.

In: IEEE Transactions on Speech and Audio Processing, Vol. 8, No. 3, 01.12.2000, p. 221-239.

@article{cd976f3de7a3490fa0929e9129502f9b,
title = "All-pole modeling of speech based on the minimum variance distortionless response spectrum",
abstract = "In this paper, we present all-pole models based upon the minimum variance distortionless response (MVDR) spectrum for spectral modeling of speech. The MVDR method, which is popular in array processing, provides all-pole spectra that are robust for modeling both voiced and unvoiced speech. Although linear prediction (LP) is a popular method for obtaining all-pole model parameters, LP spectral envelopes overestimate and overemphasize the medium and high pitch voiced speech spectral powers, thereby featuring unwanted sharp contours, and do not improve in spectral envelope modeling performance as the filter order is increased. In contrast, the MVDR all-pole spectrum which can be easily obtained from the LP coefficients, features improved spectral envelope modeling as the filter order is increased. In particular, the high order MVDR spectrum models voiced speech spectra very well, particularly at the perceptually important harmonics, and features a smooth contoured envelope. Furthermore, the MVDR spectrum can be based upon either conventional time domain correlation estimates or upon spectral samples, a task that is common in frequency domain speech coding. In particular, the MVDR spectrum of sufficient order provides an all-pole envelope that models a set of spectral samples exactly. In addition, the MVDR all-pole spectrum is also suitable for modeling unvoiced speech spectra.",
keywords = "All-pole modeling, Capon spectrum, Linear prediction, Minimum variance distortionless response, Spectral envelope, Speech spectral estimation, Speech spectral modeling, Speech spectrum, Voiced speech modeling",
author = "Murthi, Manohar and Rao, {Bhaskar D.}",
year = "2000",
month = "12",
day = "1",
doi = "10.1109/89.841206",
language = "English",
volume = "8",
pages = "221--239",
journal = "IEEE Transactions on Speech and Audio Processing",
issn = "1558-7916",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3",

}

TY - JOUR
T1 - All-pole modeling of speech based on the minimum variance distortionless response spectrum
AU - Murthi, Manohar
AU - Rao, Bhaskar D.
PY - 2000/12/1
Y1 - 2000/12/1
N2 - In this paper, we present all-pole models based upon the minimum variance distortionless response (MVDR) spectrum for spectral modeling of speech. The MVDR method, which is popular in array processing, provides all-pole spectra that are robust for modeling both voiced and unvoiced speech. Although linear prediction (LP) is a popular method for obtaining all-pole model parameters, LP spectral envelopes overestimate and overemphasize the medium and high pitch voiced speech spectral powers, thereby featuring unwanted sharp contours, and do not improve in spectral envelope modeling performance as the filter order is increased. In contrast, the MVDR all-pole spectrum which can be easily obtained from the LP coefficients, features improved spectral envelope modeling as the filter order is increased. In particular, the high order MVDR spectrum models voiced speech spectra very well, particularly at the perceptually important harmonics, and features a smooth contoured envelope. Furthermore, the MVDR spectrum can be based upon either conventional time domain correlation estimates or upon spectral samples, a task that is common in frequency domain speech coding. In particular, the MVDR spectrum of sufficient order provides an all-pole envelope that models a set of spectral samples exactly. In addition, the MVDR all-pole spectrum is also suitable for modeling unvoiced speech spectra.

KW - All-pole modeling
KW - Capon spectrum
KW - Linear prediction
KW - Minimum variance distortionless response
KW - Spectral envelope
KW - Speech spectral estimation
KW - Speech spectral modeling
KW - Speech spectrum
KW - Voiced speech modeling
UR - http://www.scopus.com/inward/record.url?scp=0000473547&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0000473547&partnerID=8YFLogxK
U2 - 10.1109/89.841206
DO - 10.1109/89.841206
M3 - Article
AN - SCOPUS:0000473547
VL - 8
SP - 221
EP - 239
JO - IEEE Transactions on Speech and Audio Processing
JF - IEEE Transactions on Speech and Audio Processing
SN - 1558-7916
IS - 3
ER -