Huang, Z., Siniscalchi, S.M., Chen, I.F., Li, L., Wu, J., Lee, C.-H. (2015). Maximum a posteriori adaptation of network parameters in deep models. In INTERSPEECH 2015 (pp. 1076-1080). International Speech Communication Association (ISCA).
Maximum a posteriori adaptation of network parameters in deep models
SINISCALCHI, SABATO MARCO;
2015-01-01
Abstract
We present a Bayesian approach to adapting the parameters of a well-trained context-dependent, deep-neural-network, hidden Markov model (CD-DNN-HMM) to improve automatic speech recognition performance. Given an abundance of DNN parameters but only a limited amount of adaptation data, the effectiveness of the adapted DNN model can often be compromised. We formulate maximum a posteriori (MAP) adaptation of the parameters of a specially designed CD-DNN-HMM with an augmented linear hidden network connected to the output tied states, or senones, and compare it to the previously proposed feature-space MAP linear regression. Experimental evidence on the 20,000-word open-vocabulary Wall Street Journal task demonstrates the feasibility of the proposed framework. In supervised adaptation, the proposed MAP adaptation approach provides more than 10% relative error reduction and consistently outperforms conventional transformation-based methods. Furthermore, we present an initial attempt to generate hierarchical priors to improve adaptation efficiency and effectiveness with limited adaptation data by exploiting similarities among senones.
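As a rough illustration of the adaptation scheme summarized above, the sketch below re-estimates a single augmented linear layer feeding the senone outputs under a Gaussian prior centred at the speaker-independent weights, so that MAP adaptation reduces to cross-entropy training plus a squared penalty toward the prior mean. All names, dimensions, and the PyTorch setup are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: 440-dimensional input features, 2000 senone targets.
feat_dim, num_senones = 440, 2000

# Adaptation layer to be re-estimated; the prior mean is a frozen copy of the
# well-trained, speaker-independent weights.
adapt_layer = torch.nn.Linear(feat_dim, num_senones)
prior_mean = adapt_layer.weight.detach().clone()
prior_precision = 1.0  # larger values keep adapted weights closer to the prior

optimizer = torch.optim.SGD(adapt_layer.parameters(), lr=0.01)

# Dummy adaptation batch standing in for frame-level features and senone labels.
features = torch.randn(32, feat_dim)
targets = torch.randint(0, num_senones, (32,))

for _ in range(10):
    optimizer.zero_grad()
    logits = adapt_layer(features)
    # Negative log-likelihood of the adaptation data given the weights.
    nll = F.cross_entropy(logits, targets)
    # Negative log of a Gaussian prior centred at the speaker-independent weights.
    neg_log_prior = 0.5 * prior_precision * (adapt_layer.weight - prior_mean).pow(2).sum()
    loss = nll + neg_log_prior  # negative log-posterior, up to a constant
    loss.backward()
    optimizer.step()
```

With little adaptation data, a larger prior precision keeps the adapted layer near the speaker-independent model; a senone-dependent precision would be one way to mimic the hierarchical-prior idea mentioned in the abstract, though that is only a guess at the paper's formulation.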
File | Description | Type | Size | Format
---|---|---|---|---
IS150347.PDF (restricted access; request a copy) | Full text available at: https://www.isca-archive.org/interspeech_2015/huang15b_interspeech.html | Publisher's version | 601.2 kB | Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.