Institutional research archive of the Università degli Studi di Palermo

Continuous phone recognition without target language training data

D.-C. LYU; S. M. SINISCALCHI; T.-Y. KIM; C.-H. LEE

2008-01-01

Abstract

Designing an automatic speech recognition system with little or no language-specific training data is a challenging research topic, because abundant speech training data cannot easily be collected for every language of interest. Following our previously studied detection-based paradigm, we use a set of 21 acoustic-phonetic attributes shared by five languages to perform Japanese phone recognition without any Japanese speech training data. In this paper, we address the key issue of designing attribute-to-phone mapping models with two techniques: (1) a phone-based background model for each speech attribute detector, to improve attribute detection; and (2) a data-driven clustering algorithm that groups the attribute-to-phone mapping rules of known languages in order to predict such rules for target phones in an unseen language. We report experimental results on continuous Japanese phone recognition with the OGI Multilingual Speech Corpus and show that the proposed approach decreases the false rejection rate of attribute detection and improves phone recognition accuracy.

	Date
	
				2008
			
	Scientific disciplinary sector of the contribution
	
				Sector IINF-05/A - Sistemi di elaborazione delle informazioni (Information Processing Systems)
			
	ISBN of the monograph
	
				978-1-61567-378-0
			
	DOI of the contribution
	
				https://dx.doi.org/10.21437/Interspeech.2008-666
			
	Publisher URL (open access where possible)
	
				https://www.isca-archive.org/interspeech_2008/lyu08b_interspeech.html
			
	Citation
	
				D.-C. LYU, S. M. SINISCALCHI, T.-Y. KIM AND C.-H. LEE (2008). Continuous phone recognition without target language training data. In INTERSPEECH 2008 (pp. 2687-2690). ISCA-INST SPEECH COMMUNICATION ASSOC, C/O EMMANUELLE FOXONET, 4 RUE DES FAUVETTES, LIEU DIT LOUS TOURILS, [10.21437/Interspeech.2008-666].
			
	Appears in the categories:
	
				2.07 Conference proceedings contribution published in a volume

Files in this item:

File	Size	Format
INTERSPEECH_2008_Lyu.pdf (access restricted to archive managers). Type: publisher's version. Description: the full text of the article is available at https://www.isca-archive.org/interspeech_2008/lyu08b_interspeech.html	161.76 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/663740
