Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids

Siniscalchi S. M.;
2025-01-01

Abstract

Given the critical role of non-intrusive speech intelligibility assessment in hearing aids (HA), this paper enhances its performance by introducing Feature Importance across Domains (FiDo). We estimate feature importance on spectral and time-domain acoustic features as well as latent representations of Whisper. Importance weights are calculated per frame, and based on these weights, features are projected into new spaces, allowing the model to focus on important areas early. Next, feature concatenation is performed to combine the features before the assessment module processes them. Experimental results show that when FiDo is incorporated into the improved multi-branched speech intelligibility model MBI-Net+, RMSE can be reduced by 7.62% (from 26.10 to 24.11). MBI-Net+ with FiDo also achieves a relative RMSE reduction of 3.98% compared to the best system in the 2023 Clarity Prediction Challenge. These results validate FiDo's effectiveness in enhancing neural speech assessment in HA.
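The relative reductions quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `relative_reduction` and the back-computed challenge RMSE are assumptions introduced here, not values or code from the paper.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction from baseline to improved, in percent."""
    return (baseline - improved) / baseline * 100.0


# RMSE values reported in the abstract for MBI-Net+ without and with FiDo.
rmse_mbi_net_plus = 26.10
rmse_with_fido = 24.11

reduction = relative_reduction(rmse_mbi_net_plus, rmse_with_fido)
print(f"{reduction:.2f}%")  # 7.62%, matching the abstract

# Assumption: the abstract does not state the RMSE of the best 2023 Clarity
# Prediction Challenge system. Inverting the quoted 3.98% relative reduction
# gives an approximate implied value (roughly 25.1), limited by the rounding
# of the quoted percentage.
implied_best_cpc2023 = rmse_with_fido / (1 - 3.98 / 100)
print(f"{implied_best_cpc2023:.2f}")
```

Because the 3.98% figure is itself rounded, the implied challenge RMSE should be read as an estimate, not as a reported result.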
2025
Zezario, R.E., Siniscalchi, S.M., Chen, F., Wang, H.-M., Tsao, Y. (2025). Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 5473-5477). International Speech Communication Association [10.21437/Interspeech.2025-1756].
Files in this item:

zezario25_interspeech.pdf

Archive managers only

Description: The full text of the article is available at the following link: https://www.isca-archive.org/interspeech_2025/zezario25_interspeech.html#
Type: Publisher's version
Size: 867.4 kB
Format: Adobe PDF

2507.23223v1.pdf

Open access

Type: Post-print
Size: 833.72 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/694128