
Improving post-filtering of artificial speech using pre-trained LSTM neural networks

dc.creator: Coto Jiménez, Marvin
dc.date.accessioned: 2022-03-24T16:39:07Z
dc.date.available: 2022-03-24T16:39:07Z
dc.date.issued: 2019
dc.description.abstract: Several researchers have explored deep learning-based post-filters to improve the quality of statistical parametric speech synthesis. These post-filters map synthetic speech to natural speech, treating the different parameters separately and trying to reduce the gap between them. Long Short-Term Memory (LSTM) neural networks have been applied successfully for this purpose, but many aspects of the results and of the process itself can still be improved. In this paper, we introduce a new pre-training approach for the LSTM, with the objective of enhancing the quality of the synthesized speech, particularly in the spectrum, in a more efficient manner. Our approach begins with an auto-associative training of one LSTM network, which is then used as an initialization for the post-filters. We show the advantages of this initialization for enhancing the Mel-frequency cepstral parameters of synthetic speech. Results show that, in most cases, this initialization enhances the statistical parametric speech spectrum better than the common random initialization of the networks. [es_ES]
dc.description.procedence: UCR::Vicerrectoría de Docencia::Ingeniería::Facultad de Ingeniería::Escuela de Ingeniería Eléctrica [es_ES]
dc.description.sponsorship: Universidad de Costa Rica/[322-B9-105]/UCR/Costa Rica [es_ES]
dc.identifier.citation: https://www.mdpi.com/2313-7673/4/2/39 [es_ES]
dc.identifier.codproyecto: 322-B9-105
dc.identifier.doi: 10.3390/biomimetics4020039
dc.identifier.issn: 2313-7673
dc.identifier.uri: https://hdl.handle.net/10669/86280
dc.language.iso: eng [es_ES]
dc.source: Biomimetics, vol. 4(2), pp. 1-17. [es_ES]
dc.subject: Deep learning [es_ES]
dc.subject: Long short-term memory (LSTM) [es_ES]
dc.subject: Machine learning [es_ES]
dc.subject: Post-filtering [es_ES]
dc.subject: Signal processing [es_ES]
dc.subject: Speech synthesis [es_ES]
dc.title: Improving post-filtering of artificial speech using pre-trained LSTM neural networks [es_ES]
dc.type: original article [es_ES]

Files

Original bundle

Name: biomimetics-04-00039-v2.pdf
Size: 427.86 KB
Format: Adobe Portable Document Format
Description: Main article

License bundle

Name: license.txt
Size: 3.5 KB
Format: Item-specific license agreed upon to submission
Description: