Institutional Research Archive of the Università degli Studi di Palermo

In this paper, we investigate a DNN tone-based extended recognition network (ERN) approach to Mandarin tone recognition and tone mispronunciation detection. Given a toneless syllable sequence, a tone-based ERN is constructed by assigning five different tones to each toneless syllable, yielding a fully expanded tonal syllable network. Viterbi decoding is then carried out on the tone-based ERN to find the best tone sequence. For the tone recognition task, different acoustic units and DNN configurations are compared. The experimental results show that tonal phones and a longer DNN input window achieve better recognition performance. Moreover, we apply a confidence score extracted from the tone-based ERN to verify whether L2 learners' tones are correctly pronounced. Compared with the conventional tone-based GOP (Goodness of Pronunciation) system, the proposed framework achieves a 10.98% relative reduction in equal error rate.
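As a rough illustration of the expansion and decoding steps described in the abstract, the sketch below builds a five-tone network over a toneless syllable sequence and runs Viterbi decoding over it. It is not the authors' implementation: the function names, the toy per-position scores, and the uniform tone-transition prior are all assumptions made for this example, whereas the paper obtains its scores from DNN acoustic models.

```python
import math

TONES = (1, 2, 3, 4, 5)  # four lexical tones plus the neutral tone


def build_tone_ern(syllables):
    """Expand each toneless syllable into its five tonal variants."""
    return [[(syl, tone) for tone in TONES] for syl in syllables]


def viterbi_tones(ern, emission_logp, transition_logp):
    """Return the best-scoring tone sequence through the expanded network.

    emission_logp(position, tone) and transition_logp(prev_tone, tone) are
    hypothetical callbacks standing in for acoustic scores and a tone prior.
    """
    # best[t][tone] = (score of the best path ending in `tone` at t, backpointer)
    best = [{tone: (emission_logp(0, tone), None) for _, tone in ern[0]}]
    for t in range(1, len(ern)):
        column = {}
        for _, tone in ern[t]:
            # Pick the predecessor tone that maximises the accumulated score.
            prev, score = max(
                ((p, s + transition_logp(p, tone))
                 for p, (s, _) in best[t - 1].items()),
                key=lambda item: item[1],
            )
            column[tone] = (score + emission_logp(t, tone), prev)
        best.append(column)

    # Trace back from the best final tone to recover the full sequence.
    tone = max(best[-1], key=lambda k: best[-1][k][0])
    path = [tone]
    for t in range(len(ern) - 1, 0, -1):
        tone = best[t][tone][1]
        path.append(tone)
    path.reverse()
    return path


# Toy usage: invented posterior-like scores for two syllables (assumption).
toy_scores = [
    {1: 0.05, 2: 0.10, 3: 0.70, 4: 0.10, 5: 0.05},
    {1: 0.10, 2: 0.05, 3: 0.60, 4: 0.20, 5: 0.05},
]
ern = build_tone_ern(["ni", "hao"])
best_tones = viterbi_tones(
    ern,
    emission_logp=lambda t, tone: math.log(toy_scores[t][tone]),
    transition_logp=lambda prev, tone: math.log(1 / len(TONES)),
)
print(best_tones)
```

With a uniform transition prior the decoder simply picks the highest-scoring tone at each position; a non-uniform prior would let neighbouring tones influence each other.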

Li, W., SINISCALCHI, S.M., Chen, N.F., Lee, C.H. (2017). Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations. In 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA). IEEE [10.1109/APSIPA.2016.7820701].

Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations

Siniscalchi, Sabato Marco (co-first author; Writing – Original Draft Preparation); Chen, N. F.; Lee, C. H.

2017-01-01



Date: 2017
ISBN of the monograph: 978-988-14768-2-1
DOI of the contribution: https://dx.doi.org/10.1109/APSIPA.2016.7820701
Alternative URL to the publisher's: http://ieeexplore.ieee.org/document/7820701/
Appears in types: 2.07 Conference proceedings contribution published in a volume

Files in this item:

File	Size	Format
07820701.pdf (access restricted to archive managers; type: publisher's version)	215.49 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/649516
