Institutional research archive of the Università degli Studi di Palermo

Hidden Markov models (HMMs) are powerful generative models for sequential data that have been used in automatic speech recognition for more than two decades. Despite their popularity, HMMs make inaccurate assumptions about speech signals, thereby limiting the achievable performance of the conventional speech recognizer. Penalized logistic regression (PLR) is a well-founded discriminative classifier with long roots in the history of statistics. Its classification performance is often compared with that of the popular support vector machine (SVM). However, for speech classification, only limited success with PLR has been reported, partially due to the difficulty with sequential data. In this paper, we present an elegant way of incorporating HMMs in the PLR framework. This leads to a powerful discriminative classifier that naturally handles sequential data. In this approach, speech classification is done using affine combinations of HMM log-likelihoods. We believe that such combinations of HMMs lead to a more accurate classifier than the conventional HMM-based classifier. Unlike similar approaches, we jointly estimate the HMM parameters and the PLR parameters using a single training criterion. The extension to continuous speech recognition is done via rescoring of N-best lists or lattices.
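The central idea of the abstract, classification from affine combinations of HMM log-likelihoods, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the per-class HMM log-likelihoods are already computed and passed in as plain numbers, and it uses a plain L2 penalty as a stand-in, since the abstract does not specify the penalty term. The names `plr_posteriors`, `penalized_nll`, `W`, `b`, and `lam` are hypothetical. The joint re-estimation of the HMM parameters described in the abstract is not shown.

```python
import math

def plr_posteriors(loglik, W, b):
    """Class posteriors from affine combinations of HMM log-likelihoods.

    loglik: list of M log-likelihoods, one per HMM (assumed precomputed).
    W: K x M weight matrix, b: K biases (hypothetical PLR parameters).
    """
    # One affine combination of the log-likelihoods per class.
    scores = [sum(w * l for w, l in zip(row, loglik)) + bk
              for row, bk in zip(W, b)]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def penalized_nll(data, W, b, lam):
    """Negative log-likelihood plus an L2 penalty on W (assumed penalty)."""
    nll = -sum(math.log(plr_posteriors(x, W, b)[y]) for x, y in data)
    penalty = 0.5 * lam * sum(w * w for row in W for w in row)
    return nll + penalty

# Toy example: log-likelihoods of one utterance under two class HMMs.
loglik = [-120.0, -125.0]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights
b = [0.0, 0.0]                # zero biases
post = plr_posteriors(loglik, W, b)
loss = penalized_nll([(loglik, 0)], W, b, lam=0.1)
```

With identity weights and zero biases, the most probable class is the one whose HMM gives the highest log-likelihood, which matches the conventional HMM-based decision under equal priors. Training `W` and `b` lets the classifier depart from that baseline.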

Birkenes O., Matsui T., Tanabe K., SINISCALCHI, S.M., Myrvoll T. A., Johnsen M. H. (2010). Penalized logistic regression with HMM log-likelihood regressors for speech recognition. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 18(6), 1440-1454 [10.1109/TASL.2009.2035151].

Penalized logistic regression with HMM log-likelihood regressors for speech recognition

Birkenes O.; Matsui T.; Tanabe K.; SINISCALCHI, SABATO MARCO; Myrvoll T. A.; Johnsen M. H.

2010-01-01



Date	2010
Journal title	IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING
DOI	https://dx.doi.org/10.1109/TASL.2009.2035151
Citation	Birkenes O., Matsui T., Tanabe K., SINISCALCHI, S.M., Myrvoll T. A., Johnsen M. H. (2010). Penalized logistic regression with HMM log-likelihood regressors for speech recognition. IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 18(6), 1440-1454 [10.1109/TASL.2009.2035151].
Appears in types	1.01 Journal article

Files in this item:

File	Access	Type	Size	Format
T-ASLP2010_.pdf	Archive managers only	Publisher's version	639.68 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/649531
