Learning HMM State Sequences from Phonemes for Speech Synthesis

BIAGETTI, Giorgio; CRIPPA, Paolo; FALASCHETTI, Laura; ORCIONI, Simone; TURCHETTI, Claudio

Repository landing page

oai:iris.univpm.it:11566/238344

Authors: Giorgio Biagetti, Paolo Crippa, Laura Falaschetti, Simone Orcioni, Claudio Turchetti
Publication date: 1 January 2016
Publisher: Elsevier

Abstract

This paper presents a technique for learning hidden Markov model (HMM) state sequences from phonemes which, combined with the modified discrete cosine transform (MDCT), is useful for speech synthesis. Mel-cepstral spectral parameters, which conventional methods adopt as features for HMM acoustic modeling, do not ensure direct reconstruction of speech waveforms. In contrast to these approaches, we use an analysis/synthesis technique based on the MDCT that guarantees perfect reconstruction of the signal frame feature vectors and allows for a 50% overlap between frames without increasing the data rate. Experimental results show that the spectrograms obtained with the proposed technique closely match the original spectrograms; the quality of the synthesized speech is evaluated using the well-known Itakura-Saito measure.
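
The perfect-reconstruction property mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the sine window, the zero padding at the signal edges, and the function names (mdct_basis, sine_window, mdct_analysis, mdct_synthesis) are assumptions chosen for illustration. Each frame spans 2N samples and overlaps its neighbour by 50%, yet yields only N coefficients, so the data rate does not increase.

```python
import numpy as np

def mdct_basis(N):
    # Cosine basis of the MDCT: N coefficients from a frame of 2N samples.
    n = np.arange(2 * N)
    k = np.arange(N)
    return np.cos(np.pi / N * np.outer(k + 0.5, n + 0.5 + N / 2))  # shape (N, 2N)

def sine_window(N):
    # Sine window (an assumption); it satisfies w[n]^2 + w[n+N]^2 = 1.
    n = np.arange(2 * N)
    return np.sin(np.pi * (n + 0.5) / (2 * N))

def mdct_analysis(x, N):
    # x must have a length that is a multiple of N (N even).
    # Zero-pad N samples on each side so every sample is covered by two frames.
    C, w = mdct_basis(N), sine_window(N)
    padded = np.concatenate([np.zeros(N), x, np.zeros(N)])
    num_frames = len(padded) // N - 1
    frames = np.stack([padded[i * N : i * N + 2 * N] for i in range(num_frames)])
    return (frames * w) @ C.T  # shape (num_frames, N)

def mdct_synthesis(X, N):
    # Inverse MDCT per frame, windowed again, then overlap-added.
    # The aliasing terms of adjacent frames cancel in the overlap.
    C, w = mdct_basis(N), sine_window(N)
    frames = (2.0 / N) * (X @ C) * w
    out = np.zeros((X.shape[0] + 1) * N)
    for i, frame in enumerate(frames):
        out[i * N : i * N + 2 * N] += frame
    return out[N:-N]  # drop the padding added during analysis

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N = 16
    x = rng.standard_normal(10 * N)
    X = mdct_analysis(x, N)
    print(X.shape, np.allclose(mdct_synthesis(X, N), x))
```

In this sketch a signal of 10N samples produces 11 frames of N coefficients each; the one extra frame comes from the edge padding, not from the overlap.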


Last time updated on 11/04/2020

This paper was published in IRIS Università Politecnica delle Marche.
