Institutional Research Archive of the Università degli Studi di Palermo

Towards a direct Bayesian adaptation framework for deep models

Huang, Z.; Siniscalchi, Sabato Marco; Chen, I. F.; Lee, C. H.

2017-01-19

Abstract

We attempt to formulate Bayesian speaker adaptation for deep models and explore two solutions. In the first, “indirect” approach, Bayesian adaptation is applied to context-dependent Gaussian-mixture-model-based hidden Markov models (CD-GMM-HMMs) with bottleneck (BN) features derived from deep neural networks (DNNs). The second approach formulates Bayesian adaptation directly for CD-DNN-HMMs by casting the adaptation step into a generative framework, from which maximum-likelihood (ML) and maximum a posteriori (MAP) adaptation schemes are derived. Experiments on the Wall Street Journal task demonstrate that both MAP and structural MAP (SMAP) adaptation are effective even with discriminative BN features. Furthermore, SMAP attains a word error reduction (WERR) of 7.3% even when 80 hours of data and 284 different speakers are available at training time. We have also observed a notable performance improvement with the indirect approach, which supports the plausibility of the proposed solution in this novel direction.
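
As a rough illustration of what MAP adaptation means in the GMM-HMM setting mentioned in the abstract, the sketch below applies the textbook MAP update of a single Gaussian mean, which interpolates between a speaker-independent prior mean and the adaptation data. It is not taken from the paper and does not reproduce its SMAP or direct DNN formulation. The function name `map_adapt_mean`, the prior weight `tau`, and the toy numbers are assumptions made for this example only.

```python
def map_adapt_mean(prior_mean, frames, posteriors, tau=10.0):
    """Textbook MAP update of one Gaussian mean (illustrative only).

    prior_mean : speaker-independent mean, a list of floats
    frames     : adaptation feature vectors, a list of lists of floats
    posteriors : occupation probability of this Gaussian for each frame
    tau        : assumed prior weight; larger values keep the estimate
                 closer to the prior mean (tau + occupancy must be > 0)
    """
    dim = len(prior_mean)
    occupancy = sum(posteriors)
    weighted_sum = [0.0] * dim
    for frame, gamma in zip(frames, posteriors):
        for d in range(dim):
            weighted_sum[d] += gamma * frame[d]
    # Interpolate between the prior mean and the data statistics.
    return [
        (tau * prior_mean[d] + weighted_sum[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Toy usage: two adaptation frames, each fully assigned to this Gaussian.
adapted = map_adapt_mean([0.0, 0.0], [[2.0, 4.0], [4.0, 8.0]], [1.0, 1.0], tau=2.0)
```

With no adaptation data the estimate stays at the prior mean, and with `tau=0` it reduces to the maximum-likelihood mean of the adaptation frames, which is the basic trade-off that MAP adaptation controls.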


Date	19 January 2017
ISBN of the monograph	978-988-14768-2-1
DOI of the contribution	https://dx.doi.org/10.1109/APSIPA.2016.7820894
Publisher URL (open access where possible)	https://ieeexplore.ieee.org/document/7820894
Citation	Huang, Z., Siniscalchi, S.M., Chen, I.F., Lee, C.H. (2017). Towards a direct Bayesian adaptation framework for deep models. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE [10.1109/APSIPA.2016.7820894].
Appears in types	2.07 Conference proceedings contribution published in a volume

Files in this item:

File	Size	Format
Towards_a_direct_Bayesian_adaptation_framework_for_deep_models.pdf (Type: publisher's version; access: archive administrators only)	290.5 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/649574
