A frequency domain approach to ARX-LF voiced speech parameterization and synthesis

Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle

Research output: Contribution to journal › Conference article › peer-review

Abstract

The ARX-LF model interprets voiced speech as an LF derivative glottal pulse exciting an all-pole vocal tract filter, with an additional exogenous residual signal. It fully parameterizes the voice and has been shown to be useful for voice modification. Because time domain methods for determining the ARX-LF parameters from speech are very sensitive to the time placement of the analysis frame and are not robust to phase distortion introduced by, for example, recording equipment, a magnitude-only spectral approach to ARX-LF parameterization was recently developed. This paper describes extensions to this frequency domain approach that yield continuous, robust ARX-LF parameters for voiced speech segments. A listening test with 50 participants compared synthetic speech produced by this method with speech produced by a time domain ARX-LF parameterization approach, under real and simulated recording conditions; the frequency domain approach was generally preferred.
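
For orientation, the source-filter structure described in the abstract can be written in the usual autoregressive-with-exogenous-input (ARX) form. This is a sketch only: the symbols $s(n)$, $u(n)$, $e(n)$, $a_k$, $b_0$, the filter order $p$, and the sign convention are assumptions chosen for illustration and are not taken from the paper itself.

```latex
% Time domain: speech sample s(n) as an all-pole (AR) recursion driven by
% the LF derivative glottal pulse u(n) plus an exogenous residual e(n).
s(n) = -\sum_{k=1}^{p} a_k \, s(n-k) + b_0 \, u(n) + e(n)

% Equivalent z-domain form, with A(z) the all-pole vocal tract polynomial.
S(z) = \frac{b_0 \, U(z) + E(z)}{A(z)},
\qquad
A(z) = 1 + \sum_{k=1}^{p} a_k \, z^{-k}
```

Under these assumptions, a magnitude-only spectral approach would work with $|S(e^{j\omega})|$ rather than with the waveform itself, which is consistent with the abstract's point about insensitivity to frame placement and phase distortion.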

Original language: English
Pages (from-to): 57-60
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Publication status: Published - 2011
Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: 27 Aug 2011 - 31 Aug 2011

Keywords

  • ARX-LF model
  • Speech synthesis
  • Voice coding
