Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer

Michael S. Scordilis, John N. Gowdy

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Summary form only given. The problems associated with automatic speech synthesis are related, to a large extent, to the methods of controlling the mathematical models of the human vocal tract and its properties as they change with time during discourse. In formant synthesis, which is the most effective synthesis method, rules are applied to relate the incoming phonemic information to values of the synthesizer control vectors. Such rules are usually developed through the analysis of a representative set of utterances and adjusted with listening tests. The tedious nature of the parameter extraction process and the lack of unambiguous relationships of acoustic events with spectral information have hindered the effective control of the models. In the present work, artificial neural networks were employed to assist with the latter concern. For this purpose, 56 common words comprising larynx-produced phonemes were analyzed and used to train a network cluster. The system was able to produce intelligible speech for certain phonemic combinations.

Original language: English
Title of host publication: Proceedings. IJCNN - International Joint Conference on Neural Networks
Editors: Anon
Place of Publication: Piscataway, NJ, United States
Publisher: Publ by IEEE
ISBN (Print): 0780301641
State: Published - Jan 1 1992
Externally published: Yes
Event: International Joint Conference on Neural Networks - IJCNN-91-Seattle - Seattle, WA, USA
Duration: Jul 8 1991 to Jul 12 1991

Other

Other: International Joint Conference on Neural Networks - IJCNN-91-Seattle
City: Seattle, WA, USA
Period: 7/8/91 to 7/12/91

Fingerprint

Speech synthesis
Neural networks
Parameter extraction
Acoustics
Mathematical models

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Scordilis, M. S., & Gowdy, J. N. (1992). Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer. In Anon (Ed.), Proceedings. IJCNN - International Joint Conference on Neural Networks. Piscataway, NJ, United States: Publ by IEEE.

Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer. / Scordilis, Michael S; Gowdy, John N.

Proceedings. IJCNN - International Joint Conference on Neural Networks. ed. / Anon. Piscataway, NJ, United States : Publ by IEEE, 1992.

Scordilis, MS & Gowdy, JN 1992, Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer. in Anon (ed.), Proceedings. IJCNN - International Joint Conference on Neural Networks. Publ by IEEE, Piscataway, NJ, United States, International Joint Conference on Neural Networks - IJCNN-91-Seattle, Seattle, WA, USA, 7/8/91.

Scordilis MS, Gowdy JN. Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer. In Anon, editor, Proceedings. IJCNN - International Joint Conference on Neural Networks. Piscataway, NJ, United States: Publ by IEEE. 1992.

Scordilis, Michael S ; Gowdy, John N. / Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer. Proceedings. IJCNN - International Joint Conference on Neural Networks. editor / Anon. Piscataway, NJ, United States : Publ by IEEE, 1992.

@inproceedings{7db649a19e884b26926f4b8e8203ecda,
title = "Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer",
abstract = "Summary form only given. The problems associated with automatic speech synthesis are related, to a large extent, to the methods of controlling the mathematical models of the human vocal tract and its properties as they change with time during discourse. In formant synthesis, which is the most effective synthesis method, rules are applied to relate the incoming phonemic information to values of the synthesizer control vectors. Such rules are usually developed through the analysis of a representative set of utterances and adjusted with listening tests. The tedious nature of the parameter extraction process and the lack of unambiguous relationships of acoustic events with spectral information have hindered the effective control of the models. In the present work, artificial neural networks were employed to assist with the latter concern. For this purpose, 56 common words comprising larynx-produced phonemes were analyzed and used to train a network cluster. The system was able to produce intelligible speech for certain phonemic combinations.",
author = "Scordilis, {Michael S} and Gowdy, {John N.}",
year = "1992",
month = "1",
day = "1",
language = "English",
isbn = "0780301641",
editor = "Anon",
booktitle = "Proceedings. IJCNN - International Joint Conference on Neural Networks",
publisher = "Publ by IEEE",
address = "Piscataway, NJ, United States",
}

TY - GEN

T1 - Speech synthesis of phonemic triplets through a neural network-controlled formant synthesizer

AU - Scordilis, Michael S

AU - Gowdy, John N.

PY - 1992/1/1

Y1 - 1992/1/1

AB - Summary form only given. The problems associated with automatic speech synthesis are related, to a large extent, to the methods of controlling the mathematical models of the human vocal tract and its properties as they change with time during discourse. In formant synthesis, which is the most effective synthesis method, rules are applied to relate the incoming phonemic information to values of the synthesizer control vectors. Such rules are usually developed through the analysis of a representative set of utterances and adjusted with listening tests. The tedious nature of the parameter extraction process and the lack of unambiguous relationships of acoustic events with spectral information have hindered the effective control of the models. In the present work, artificial neural networks were employed to assist with the latter concern. For this purpose, 56 common words comprising larynx-produced phonemes were analyzed and used to train a network cluster. The system was able to produce intelligible speech for certain phonemic combinations.

UR - http://www.scopus.com/inward/record.url?scp=0026711854&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0026711854&partnerID=8YFLogxK

M3 - Conference contribution

SN - 0780301641

BT - Proceedings. IJCNN - International Joint Conference on Neural Networks

A2 - Anon

PB - Publ by IEEE

CY - Piscataway, NJ, United States

ER -