Institutional Research Archive of the Università degli Studi di Palermo

In this work, we investigate the problem of speaker-independent acoustic-to-articulatory inversion (AAI) in noisy conditions within the deep neural network (DNN) framework. We claim that DNN vector-to-vector regression for speech enhancement (DNN-SE) can play a key role in AAI when used in a front-end stage to enhance speech features before AAI back-end processing. Our claim contrasts with recent literature reporting a drop in AAI accuracy on MMSE-enhanced data and thereby sheds some light on the opportunities offered by DNN-SE in robust speech applications. We have also tested single- and multi-task training strategies for the DNN-SE block and experimentally found the latter to be beneficial to AAI. Moreover, DNN-SE coupled with an AAI deep system tested on enhanced speech can outperform a multi-condition AAI deep system tested on noisy speech. We assess our approach on the Haskins corpus using Pearson's correlation coefficient (PCC). A 15% relative PCC improvement is observed over a multi-condition AAI system at 0 dB signal-to-noise ratio (SNR). Our approach also compares favorably against a conventional DSP approach, namely MMSE with IMCRA, in the front-end stage.
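The evaluation metric named in the abstract can be illustrated with a minimal sketch. It assumes that predicted and measured articulatory trajectories are stored as NumPy arrays of shape (frames, channels), that PCC is computed per channel, and that a relative improvement is the percentage gain over a baseline score. The function names `pearson_cc` and `relative_improvement` are hypothetical, the data are synthetic, and this is not the authors' evaluation code.

```python
import numpy as np


def pearson_cc(pred, ref):
    """Per-channel Pearson correlation between two (frames, channels) arrays."""
    pred = np.asarray(pred, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Remove the per-channel mean over time before correlating.
    pred_c = pred - pred.mean(axis=0)
    ref_c = ref - ref.mean(axis=0)
    num = (pred_c * ref_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (ref_c ** 2).sum(axis=0))
    return num / den


def relative_improvement(baseline, proposed):
    """Relative gain of `proposed` over `baseline`, in percent."""
    return 100.0 * (proposed - baseline) / baseline


# Synthetic stand-ins for measured and predicted articulator trajectories
# (assumption: 200 frames, 3 articulatory channels; not data from the paper).
rng = np.random.default_rng(0)
measured = rng.standard_normal((200, 3))
predicted = measured + 0.5 * rng.standard_normal((200, 3))

per_channel = pearson_cc(predicted, measured)
print("PCC per channel:", per_channel)
print("Mean PCC:", per_channel.mean())
```

Averaging the per-channel values gives one score per system, and `relative_improvement` then compares two such scores, for example a multi-condition baseline and an enhanced-speech system.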

Shahrebabaki, A.S., Siniscalchi, S.M., Salvi, G., Svendsen, T. (2021). A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion. In 53rd IEEE International Symposium on Circuits and Systems (pp. 1-5). IEEE [10.1109/ISCAS51556.2021.9401290].

A DNN Based Speech Enhancement Approach to Noise Robust Acoustic-to-Articulatory Inversion

Shahrebabaki, Abdolreza Sabzi; Siniscalchi, Sabato Marco; Salvi, Giampiero; Svendsen, Torbjorn

2021-01-01

Date: 2021
ISBN of the monograph: 978-1-7281-9201-7
DOI of the contribution: https://dx.doi.org/10.1109/ISCAS51556.2021.9401290
Alternative URL to the publisher's: https://ieeexplore.ieee.org/abstract/document/9401290
Appears in types: 2.07 Conference proceedings contribution published in a volume

Files in this product:

File	Size	Format
shahrebabaki2021.pdf (archive managers only; type: publisher's version)	2.18 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/636670
