Institutional Research Archive of the Università degli Studi di Palermo

Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine

Tsao, Yu; Wang, Hsin-Min; Wang, Jia-Ching; Siniscalchi, Sabato Marco (Supervision); Liao, Wen-Hung

2019-01-01

Abstract

Recently, the hierarchical extreme learning machine (HELM) model has been used for speech enhancement (SE) and has demonstrated promising performance, especially when the amount of training data is limited and the system does not support heavy computation. Building on the success of audio-only HELM-based systems, termed AHELM, we propose a novel audio-visual HELM-based SE system, termed AVHELM, that integrates audio and visual information to address the problem of unseen nonstationary noise at low SNR levels and attain improved SE performance. The experimental results demonstrate that AVHELM yields satisfactory enhancement performance with a limited amount of training data and outperforms AHELM in terms of three standardized objective measures under matched and mismatched testing conditions, confirming the effectiveness of incorporating visual information into the HELM-based SE system.

Date	2019
ISBN of the monograph	978-9-0827-9703-9
DOI of the contribution	https://dx.doi.org/10.23919/EUSIPCO.2019.8903105
Citation	Hussain, T., Tsao, Y., Wang, H., Wang, J., Siniscalchi, S.M., Liao, W. (2019). Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine. In European Signal Processing Conference (pp. 1-5). IEEE [10.23919/EUSIPCO.2019.8903105].
Appears in types	2.07 Contribution in conference proceedings published in a volume

Files in this record:

File	Access	Type	Size	Format
Audio-Visual_Speech_Enhancement_using_Hierarchical_Extreme_Learning_Machine.pdf	Archive managers only	Publisher's version	1.74 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/636655
