Institutional Research Archive of the Università degli Studi di Palermo

Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition

S. M. Siniscalchi; J. Reed; T. Svendsen; C.-H. Lee

2010-01-01

Abstract

This paper expands a previously proposed universal acoustic characterization approach to spoken language identification (LID) by studying different ways of modeling attributes to improve language recognition. The motivation is to describe any spoken language with a common set of fundamental units. Thus, a spoken utterance is first tokenized into a sequence of universal attributes. Then a vector space modeling approach delivers the final LID decision. Context-dependent attribute models are now used to better capture spectral and temporal characteristics. Also, an approach to expand the set of attributes to increase the acoustic resolution is studied. Our experiments show that the tokenization accuracy positively affects LID results by producing a 2.8% absolute improvement over our previous 30-second NIST 2003 performance. This result also compares favorably with the best results on the same task known to the authors when the tokenizers are trained on language-dependent OGI-TS data.

Date: 2010
Scientific-disciplinary sector of the contribution: Sector IINF-05/A - Information processing systems
ISBN of the monograph: 978-1-60423-449-7
Citation: S. M. Siniscalchi, J. Reed, T. Svendsen, and C.-H. Lee (2010). Exploiting context-dependency and acoustic resolution of universal speech attribute models in spoken language recognition. In INTERSPEECH (pp. 2718-2721). ISCA (International Speech Communication Association).
Appears in types: 2.07 Contribution to conference proceedings published in a volume

Files in this item:

File	Size	Format
INTERSPEECH_2010.pdf (archive managers only)	472.79 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/670065
