Institutional Research Archive of the University of Palermo

A study on cross-language knowledge integration in Mandarin LVCSR

Chen Yu Chiang; Sabato Marco Siniscalchi; Yih Ru Wang; Sin Horng Chen; Chin Hui Lee

2012-01-01

Abstract

We present a cross-language knowledge integration framework to improve performance in large-vocabulary continuous speech recognition (LVCSR). Two types of knowledge sources are incorporated: manner-of-articulation attributes and prosodic structure. For manner of articulation, cross-lingual attribute detectors trained on an American English corpus (WSJ0) are used to verify and rescore hypothesized Mandarin syllables in word lattices obtained with state-of-the-art systems. For prosodic structure, models trained on a Mandarin corpus (TCC300) with an unsupervised joint prosody labeling and modeling technique are used in lattice rescoring. Experimental results on Mandarin syllable, character, and word recognition with the TCC300 corpus show that the proposed approach significantly outperforms a baseline system that uses no articulatory or prosodic information. The results also demonstrate the potential of using cross-lingual attribute detectors as a language-universal front end for automatic speech recognition.

Date: 2012
ISBN of the monograph: 978-1-4673-2507-3
DOI of the contribution: https://dx.doi.org/10.1109/ISCSLP.2012.6423528
Alternative URL (other than the publisher's): http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6423528
Citation: Chen Yu Chiang, Siniscalchi, S.M., Yih Ru Wang, Sin Horng Chen, Chin Hui Lee (2012). A study on cross-language knowledge integration in Mandarin LVCSR. In International Symposium on Chinese Spoken Language Processing (pp. 315-319). The Chinese University of Hong Kong, IEEE Signal Processing Society [10.1109/ISCSLP.2012.6423528].
Appears in types: 2.07 Conference proceedings contribution published in a volume

Files in this item:

File	Access	Type	Size	Format
06423528.pdf	Archive managers only	Publisher's version	341.09 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/663743
