Given the critical role of non-intrusive speech intelligibility assessment in hearing aids (HA), this paper enhances its performance by introducing Feature Importance across Domains (FiDo). We estimate feature importance on spectral and time-domain acoustic features as well as latent representations of Whisper. Importance weights are calculated per frame, and based on these weights, features are projected into new spaces, allowing the model to focus on important areas early. Next, feature concatenation is performed to combine the features before the assessment module processes them. Experimental results show that when FiDo is incorporated into the improved multi-branched speech intelligibility model MBI-Net+, RMSE can be reduced by 7.62% (from 26.10 to 24.11). MBI-Net+ with FiDo also achieves a relative RMSE reduction of 3.98% compared to the best system in the 2023 Clarity Prediction Challenge. These results validate FiDo's effectiveness in enhancing neural speech assessment in HA.
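The weighting, projection and concatenation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The sigmoid-gated linear scoring, the random parameters, the feature dimensions and every function name are assumptions made for the example. It also assumes the three feature streams are already aligned to the same number of frames. The last function only checks the relative RMSE reduction quoted above.

```python
import numpy as np


def importance_weights(x, w, b):
    """Per-frame importance in [0, 1]; a sigmoid over a linear score (assumed form)."""
    scores = x @ w + b                      # shape: (frames,)
    weights = 1.0 / (1.0 + np.exp(-scores))
    return weights[:, None]                 # shape: (frames, 1) for broadcasting


def weight_and_project(x, w, b, proj):
    """Scale each frame by its importance, then project into a new space."""
    return (importance_weights(x, w, b) * x) @ proj


def relative_reduction(baseline, improved):
    """Relative reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


rng = np.random.default_rng(0)
frames, out_dim = 50, 32

# Hypothetical feature dimensions for the three domains named in the abstract.
dims = {"spectral": 257, "time_domain": 64, "whisper_latent": 512}

projected = []
for name, dim in dims.items():
    x = rng.standard_normal((frames, dim))          # stand-in features
    w = 0.1 * rng.standard_normal(dim)              # stand-in scoring weights
    proj = 0.1 * rng.standard_normal((dim, out_dim))
    projected.append(weight_and_project(x, w, 0.0, proj))

# Feature concatenation before a downstream assessment module (not shown here).
fused = np.concatenate(projected, axis=-1)          # shape: (50, 96)

print(round(relative_reduction(26.10, 24.11), 2))   # about 7.62
```

In a trained system the scoring weights and projections would be learned jointly with the assessment model rather than drawn at random; the sketch only shows the data flow and the resulting shapes.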
Zezario, R.E., Siniscalchi, S.M., Chen, F., Wang, H.-M., Tsao, Y. (2025). Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 5473-5477). International Speech Communication Association [10.21437/Interspeech.2025-1756].
Feature Importance across Domains for Improving Non-Intrusive Speech Intelligibility Prediction in Hearing Aids
Siniscalchi S. M.
2025-01-01
| File | Access | Type | Size | Format |
|---|---|---|---|---|
| zezario25_interspeech.pdf | Archive managers only | Publisher's version | 867.4 kB | Adobe PDF |
| 2507.23223v1.pdf | Open access | Post-print | 833.72 kB | Adobe PDF |

Description (zezario25_interspeech.pdf): The full text of the article is available at the following link: https://www.isca-archive.org/interspeech_2025/zezario25_interspeech.html#
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


