Institutional Research Archive of the Università degli Studi di Palermo

Wu B., Yang M., Li K., Huang Z., Siniscalchi S.M., Wang T., et al. (2017). A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation. EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2017(1) [10.1186/s13634-017-0516-6].

A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation

Wu B.; Yang M.; Li K.; Huang Z.; Siniscalchi S. M. (Writing – Original Draft Preparation); Wang T.; Lee C.-H.

2017-01-01

Abstract

A reverberation-time-aware, deep-neural-network (DNN)-based multi-channel speech dereverberation framework is proposed to handle a wide range of reverberation times (RT60s). Designing a robust system involves three key steps. First, to accomplish simultaneous speech dereverberation and beamforming, we propose a framework, DNNSpatial, that selectively concatenates log-power spectral (LPS) input features of reverberant speech from multiple microphones in an array and maps them to the expected output LPS features of anechoic reference speech using a single DNN. Next, the temporal auto-correlation function of the received signals at different RT60s is investigated to show that RT60-dependent temporal-spatial contexts in feature selection are needed in the DNNSpatial training stage to optimize system performance in diverse reverberant environments. Finally, the RT60 is estimated to select the proper temporal and spatial contexts before the LPS features are fed to the trained DNNs for speech dereverberation. The experimental evidence gathered in this study indicates that the proposed framework outperforms both the state-of-the-art signal processing dereverberation algorithm, weighted prediction error (WPE), and conventional DNNSpatial systems that do not take the reverberation time into account, even under extremely weak and severe reverberant conditions. The proposed technique generalizes well to unseen room sizes, array geometries, and loudspeaker positions, and is robust to reverberation time estimation error.
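The input construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the naive DFT without windowing, the edge-frame handling, and above all the RT60 thresholds and context sizes in `select_context` are hypothetical placeholders chosen for illustration. Only the general idea comes from the abstract, namely concatenating LPS frames across time and microphones, with the amount of context chosen from an RT60 estimate.

```python
import cmath
import math


def log_power_spectrum(frame):
    """LPS of one real-valued frame via a naive DFT (non-negative frequencies only)."""
    n = len(frame)
    lps = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        lps.append(math.log(abs(coeff) ** 2 + 1e-12))  # small floor avoids log(0)
    return lps


def select_context(rt60_seconds):
    """Map an RT60 estimate to (temporal half-width, number of channels).

    The thresholds and values below are invented for this sketch;
    they are not taken from the paper.
    """
    if rt60_seconds < 0.3:
        return 1, 4
    if rt60_seconds < 0.8:
        return 3, 4
    return 5, 2


def build_input(lps, t, half_width, num_channels):
    """Concatenate LPS frames t-half_width..t+half_width from the first num_channels mics.

    lps[c][f] is the list of LPS bins for channel c, frame f.
    Frames beyond the signal edges are replaced by the nearest valid frame.
    """
    num_frames = len(lps[0])
    vector = []
    for channel in lps[:num_channels]:
        for offset in range(-half_width, half_width + 1):
            idx = min(max(t + offset, 0), num_frames - 1)
            vector.extend(channel[idx])
    return vector


# Toy four-microphone signals: delayed sinusoids standing in for reverberant speech.
frame_len, hop, num_samples = 8, 4, 32
signals = [[math.sin(0.3 * (i + 2 * c)) for i in range(num_samples)]
           for c in range(4)]
lps = [[log_power_spectrum(sig[s:s + frame_len])
        for s in range(0, num_samples - frame_len + 1, hop)]
       for sig in signals]

half_width, num_channels = select_context(0.6)  # 0.6 s is an assumed RT60 estimate
features = build_input(lps, t=3, half_width=half_width, num_channels=num_channels)
print(len(features))
```

In this sketch the resulting vector would be the input to a regression DNN whose target is the LPS of the anechoic reference frame; the network itself is omitted.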


Date	2017
Journal title	EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING
Contribution DOI	https://dx.doi.org/10.1186/s13634-017-0516-6
Publisher URL (open access where possible)	https://asp-eurasipjournals.springeropen.com/articles/10.1186/s13634-017-0516-6
Appears in types:	1.01 Journal article

Files in this item:

File	Size	Format
s13634-017-0516-6.pdf (open access; type: publisher's version)	2.01 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/649497
