This study addresses the speech enhancement (SE) task within the causal inference paradigm by modeling the presence of noise as an intervention. Based on the potential outcome framework, the proposed causal inference-based speech enhancement (CISE) uses a noise detector to separate clean and noisy frames in an intervened (noisy) speech signal and assigns the two sets of frames to two mask-based enhancement modules (EMs) to perform noise-conditional SE. Specifically, during training the presence of noise guides EM selection, and the noise detector selects the EM for each frame according to its prediction of whether noise is present. Moreover, we derive an SE-specific average treatment effect to quantify the causal effect. Experimental evidence demonstrates that CISE outperforms a non-causal mask-based SE approach in the studied settings and has better performance and efficiency than more complex SE models. Our implementation is available on GitHub.
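The frame-level routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function `route_and_enhance`, the 0.5 threshold, the list-based frame representation, and the three toy callables are all hypothetical stand-ins for the noise detector and the two mask-based EMs, which in the paper are neural networks.

```python
from typing import Callable, List, Sequence

Frame = List[float]  # one magnitude-spectrum frame (hypothetical representation)


def route_and_enhance(
    frames: Sequence[Frame],
    noise_prob: Callable[[Frame], float],
    mask_noisy: Callable[[Frame], Frame],
    mask_clean: Callable[[Frame], Frame],
    threshold: float = 0.5,  # assumed decision threshold, not from the paper
) -> List[Frame]:
    """Apply one of two mask estimators to each frame, chosen by a noise detector."""
    enhanced = []
    for frame in frames:
        # The detector's per-frame prediction selects the enhancement module.
        estimator = mask_noisy if noise_prob(frame) >= threshold else mask_clean
        mask = estimator(frame)
        # Mask-based enhancement: scale each frequency bin by its mask value.
        enhanced.append([m * x for m, x in zip(mask, frame)])
    return enhanced


# Toy stand-ins (assumptions for illustration only).
def toy_detector(frame: Frame) -> float:
    # Treat high-energy frames as noisy.
    return 1.0 if sum(x * x for x in frame) > 1.0 else 0.0


def toy_mask_noisy(frame: Frame) -> Frame:
    return [0.5] * len(frame)  # attenuate every bin


def toy_mask_clean(frame: Frame) -> Frame:
    return [1.0] * len(frame)  # pass the frame through unchanged


frames = [[0.1, 0.2], [2.0, 1.0]]
out = route_and_enhance(frames, toy_detector, toy_mask_noisy, toy_mask_clean)
print(out)  # [[0.1, 0.2], [1.0, 0.5]]
```

In this sketch the first frame is judged clean and passes through unchanged, while the second is judged noisy and is attenuated by its mask. Replacing `noise_prob` with a ground-truth noise label would correspond to using the presence of noise as guidance for EM selection.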

Hsieh T.-A., Yang C.-H.H., Chen P.-Y., Siniscalchi S.M., Tsao Y. (2023). Inference and Denoise: Causal Inference-Based Neural Speech Enhancement. In IEEE International Workshop on Machine Learning for Signal Processing, MLSP. IEEE Computer Society [10.1109/MLSP55844.2023.10285967].

Inference and Denoise: Causal Inference-Based Neural Speech Enhancement

Siniscalchi S. M.
Co-last author
Supervision
2023-01-01

Files in this item:

2211.01189v1.pdf
Access: open access
Description: pre-print
Type: Pre-print
Size: 833.48 kB
Format: Adobe PDF

Inference_and_Denoise_Causal_Inference-Based_Neural_Speech_Enhancement.pdf
Access: archive managers only
Type: Publisher's version
Size: 441.2 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/637521