Robustness of LSTM neural networks for the enhancement of spectral parameters in noisy speech signals

Date

2019

Type

conference paper

Authors

Coto Jiménez, Marvin

Abstract

In this paper, we carry out a comparative performance analysis of Long Short-Term Memory (LSTM) neural networks for the task of noise reduction. Recent work in this area has shown the advantages of this kind of network for the enhancement of noisy speech, particularly when training is performed for specific Signal-to-Noise Ratio (SNR) levels. For application in real-life environments, it is important to test the robustness of the approach without a priori knowledge of the SNR level, as is the case for classical signal processing-based algorithms. In our experiments, we conduct the training stage with single and multiple noise conditions and compare the results with the SNR-specific training presented previously in the literature. For the first time, the results give a measure of the independence from the training conditions for the task of noise suppression in speech signals, and show remarkable robustness of the LSTM across different SNR levels.
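
The single- and multiple-condition training described in the abstract depends on mixing clean speech with noise at a controlled SNR. The following is a minimal sketch of that mixing step only, not the authors' pipeline: the function names, the synthetic tone standing in for speech, the Gaussian noise, and the list of SNR levels are all hypothetical choices made for illustration.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested SNR in dB.

    Returns the noisy mixture and the scaled noise.
    """
    # SNR_dB = 10 * log10(P_clean / P_noise), so the target noise power
    # is P_clean / 10^(SNR_dB / 10).
    target_noise_power = power(clean) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [c + n for c, n in zip(clean, scaled)]
    return noisy, scaled


def measured_snr_db(clean, scaled_noise):
    """SNR in dB of a clean signal against the noise added to it."""
    return 10.0 * math.log10(power(clean) / power(scaled_noise))


# Assumption: a 220 Hz tone at 16 kHz stands in for a clean utterance,
# and white Gaussian noise stands in for the corrupting noise.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

# Assumption: illustrative SNR levels, not the ones used in the paper.
snr_levels_db = [-10, -5, 0, 5, 10]

# Single-condition set: one SNR level. Multi-condition set: all levels.
single_condition = {0: mix_at_snr(clean, noise, 0)[0]}
multi_condition = {snr: mix_at_snr(clean, noise, snr)[0] for snr in snr_levels_db}
```

Under these assumptions, a model trained on `single_condition` sees one SNR level only, while one trained on `multi_condition` sees every listed level, which is the kind of contrast the abstract describes.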

Description

Part of the Lecture Notes in Computer Science book series (LNCS, volume 11289).

Keywords

Deep learning, Long short-term memory (LSTM), Mel-Frequency Cepstrum Coefficients (MFCC), Neural networks, Speech enhancement