Institutional research archive of the Università degli Studi di Palermo

In this paper we extend attribute-based lattice rescoring to spontaneous speech recognition. This technique is based on two key features: (i) an attribute-based frontend, which consists of a bank of speech attribute detectors followed by an evidence merger that generates confidence scores (e.g., sub-word posterior probabilities), and (ii) a rescoring module that integrates the information generated by the frontend into an existing ASR engine through lattice rescoring. The speech attributes used in this work are phonetic features, such as frication and palatalization. Experimental results on the Switchboard part of the NIST 2000 Hub5 data set demonstrate that the proposed approach outperforms LVCSR systems based on Gaussian mixture model/hidden Markov model (GMM/HMM) that do not use attribute-related information. Furthermore, a small yet promising improvement is also observed when rescoring word lattices generated by a state-of-the-art ASR system using deep neural networks. Different frontend configurations are investigated and tested. © 2014 IEEE.

Chen I.-F., Siniscalchi S.M., Lee C.-H. (2014). Attribute based lattice rescoring in spontaneous speech recognition. In ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings (pp. 3325-3329). Institute of Electrical and Electronics Engineers Inc. [10.1109/ICASSP.2014.6854216].

Attribute based lattice rescoring in spontaneous speech recognition

Chen I.-F.; Siniscalchi S.M.; Lee C.-H.

2014-01-01

Date: 2014
Scientific disciplinary sector of the contribution: IINF-05/A - Information Processing Systems
ISBN of the monograph: 9781479928927
DOI of the contribution: https://dx.doi.org/10.1109/ICASSP.2014.6854216
Appears in the types: 2.07 Conference proceedings contribution published in a volume

Files in this product:

File	Type	Access	Size	Format
Attribute_based_lattice_rescoring_in_spontaneous_speech_recognition.pdf	Publisher's version	Archive managers only	88.69 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/673804
