Deep learning-based approaches have demonstrated promising performance on speech enhancement (SE) tasks. However, these approaches generally require large quantities of training data and substantial computational resources for model training. As an alternative, a hierarchical extreme learning machine (HELM) model has previously been reported to perform SE with satisfactory results given a limited amount of training data. In this study, we investigate the application of the HELM model to improving the quality and intelligibility of bone-conducted speech. Our experimental results show that, when only a limited quantity of training data is available, the proposed HELM-based bone-conducted SE framework effectively enhances the original bone-conducted speech and outperforms a deep denoising autoencoder-based bone-conducted SE system in speech quality and intelligibility, while also improving recognition accuracy.
Hussain, T., Tsao, Y., Siniscalchi, S.M., Wang, J., Wang, H., Liao, W. (2021). Bone-Conducted Speech Enhancement Using Hierarchical Extreme Learning Machine. In Increasing Naturalness and Flexibility in Spoken Dialogue Interaction (pp. 153-162). Springer Science and Business Media Deutschland GmbH [10.1007/978-981-15-9323-9_14].
Bone-Conducted Speech Enhancement Using Hierarchical Extreme Learning Machine
Siniscalchi, Sabato Marco
2021-01-01


