
Pre-training Long Short-term Memory neural networks for efficient regression in artificial speech postfiltering

dc.creator: Coto Jiménez, Marvin
dc.date.accessioned: 2022-03-25T20:04:54Z
dc.date.available: 2022-03-25T20:04:54Z
dc.date.issued: 2018
dc.description.abstract: Several attempts to enhance statistical parametric speech synthesis have contemplated deep-learning-based postfilters, which learn to map the synthetic speech parameters to the natural ones, reducing the gap between them. In this paper, we introduce a new pre-training approach for neural networks, applied in LSTM-based postfilters for speech synthesis, with the objective of enhancing the quality of the synthesized speech in a more efficient manner. Our approach begins with an autoregressive training of one LSTM network, whose weights are then used as an initialization for postfilters based on a denoising autoencoder architecture. We show the advantages of this initialization on a set of multi-stream postfilters, which encompass a collection of denoising autoencoders for the set of MFCC and fundamental frequency parameters of the artificial voice. Results show that the initialization succeeds in lowering the training time of the LSTM networks and, in most cases, achieves better enhancement of the statistical parametric speech than the common randomly initialized networks.
dc.description.procedence: UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ingeniería Eléctrica
dc.identifier.citation: https://ieeexplore.ieee.org/document/8464204
dc.identifier.doi: https://doi.org/10.1109/IWOBI.2018.8464204
dc.identifier.isbn: 978-1-5386-7506-9
dc.identifier.uri: https://hdl.handle.net/10669/86291
dc.language.iso: eng
dc.source: IEEE International Work Conference on Bioinspired Intelligence (IWOBI). San Carlos, Costa Rica. July 18-20, 2018
dc.subject: Deep learning
dc.subject: Denoising autoencoders
dc.subject: Long short-term memory (LSTM)
dc.subject: Machine learning
dc.subject: Signal processing
dc.subject: Speech synthesis
dc.title: Pre-training Long Short-term Memory neural networks for efficient regression in artificial speech postfiltering
dc.type: Conference paper
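
The abstract describes a two-stage training procedure: an LSTM is first trained autoregressively on natural speech parameters, and its weights then initialize a denoising-autoencoder-style postfilter that maps synthetic parameters to natural ones. Below is a minimal, hypothetical PyTorch sketch of that idea; the record does not include code, so the module names, layer sizes, optimizer, and training details are illustrative assumptions rather than the authors' configuration.

# Illustrative sketch (not the authors' code): pre-train an LSTM
# autoregressively on natural speech parameters, then reuse its weights to
# initialize an LSTM postfilter trained as a denoising mapping from
# synthetic to natural parameters. Dimensions and hyperparameters are assumed.
import torch
import torch.nn as nn

FEAT_DIM = 40   # assumed size of one parameter stream (e.g., MFCCs)
HIDDEN = 256    # assumed LSTM hidden size


class LSTMRegressor(nn.Module):
    """Single-layer LSTM followed by a linear output layer."""

    def __init__(self, feat_dim=FEAT_DIM, hidden=HIDDEN):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, x):
        h, _ = self.lstm(x)
        return self.out(h)


def pretrain_autoregressive(model, natural, epochs=10, lr=1e-3):
    """Stage 1: train the LSTM to predict frame t+1 from frames up to t,
    using only natural speech parameters (autoregressive pre-training)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        pred = model(natural[:, :-1, :])          # input: frames 0..T-2
        loss = loss_fn(pred, natural[:, 1:, :])   # target: frames 1..T-1
        loss.backward()
        opt.step()
    return model


def train_postfilter(pretrained, synthetic, natural, epochs=10, lr=1e-3):
    """Stage 2: initialize the postfilter with the pre-trained weights, then
    train it to map synthetic ("noisy") parameters to natural ones."""
    postfilter = LSTMRegressor()
    postfilter.load_state_dict(pretrained.state_dict())  # weight transfer
    opt = torch.optim.Adam(postfilter.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(postfilter(synthetic), natural)
        loss.backward()
        opt.step()
    return postfilter


if __name__ == "__main__":
    # Toy tensors standing in for aligned synthetic/natural parameter
    # sequences (batch of 8 utterances, 100 frames each).
    natural = torch.randn(8, 100, FEAT_DIM)
    synthetic = natural + 0.1 * torch.randn_like(natural)

    base = pretrain_autoregressive(LSTMRegressor(), natural)
    postfilter = train_postfilter(base, synthetic, natural)

In the multi-stream setting described in the abstract, one such postfilter would be trained per parameter stream (the MFCC streams and the fundamental frequency), each pre-trained and fine-tuned in the same way.
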

Files

Original bundle

Name: PreTraining.pdf
Size: 730.73 KB
Format: Adobe Portable Document Format
License bundle

Name: license.txt
Size: 3.5 KB
Description: Item-specific license agreed upon to submission