Institutional research archive of the Università degli Studi di Palermo

Model adaptation is a key technique that enables a modern automatic speech recognition (ASR) system to adjust its parameters, using a small amount of enrolment data, to the nuances in the speech spectrum caused by microphone mismatch between the training and test data. In this brief, we investigate four different adaptation schemes for connectionist (also known as hybrid) ASR systems that learn microphone-specific hidden unit contributions, given some adaptation material. This solution is made possible by adopting one of the following schemes: 1) the use of Hermite activation functions; 2) the introduction of bias and slope parameters in the sigmoid activation functions; 3) the injection of an amplitude parameter specific to each sigmoid unit; or 4) the combination of 2) and 3). Such a simple yet effective solution allows the adapted model to be stored in a small amount of storage space, a highly desirable property of adaptation algorithms for deep neural networks that are suitable for large-scale online deployment. Experimental results indicate that the investigated approaches reduce word error rates on the standard Spoke 6 task of the Wall Street Journal corpus compared with unadapted ASR systems. Moreover, the proposed adaptation schemes all perform better than simple multicondition training and compare favorably against conventional linear regression-based approaches while using up to 15 orders of magnitude fewer parameters. The proposed adaptation strategies are also effective when only a single adaptation sentence is available.
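Schemes 2) to 4) above can be illustrated with a minimal Python sketch. It assumes one plausible parameterization, amplitude * sigmoid(slope * x + bias), which this record does not confirm as the exact form used in the paper. The function names, the layer size of 2048, and the comparison against a full affine transform are hypothetical and chosen only to show why per-unit parameters are cheap to store.

```python
import math


def adaptive_sigmoid(x, amplitude=1.0, slope=1.0, bias=0.0):
    """Sigmoid with per-unit amplitude, slope, and bias (assumed form).

    With the default values this reduces to the standard sigmoid.
    """
    z = slope * x + bias
    # Numerically stable evaluation that avoids overflow in math.exp.
    if z >= 0:
        s = 1.0 / (1.0 + math.exp(-z))
    else:
        e = math.exp(z)
        s = e / (1.0 + e)
    return amplitude * s


def adapted_hidden_layer(pre_activations, amplitudes, slopes, biases):
    """Apply one (amplitude, slope, bias) triple to each hidden unit."""
    return [
        adaptive_sigmoid(x, a, s, b)
        for x, a, s, b in zip(pre_activations, amplitudes, slopes, biases)
    ]


def per_unit_parameter_count(num_units, params_per_unit=3):
    """Parameters stored per microphone when only activations are adapted."""
    return num_units * params_per_unit


def affine_transform_parameter_count(dim):
    """Parameters of a full affine transform (dim x dim matrix plus bias)."""
    return dim * dim + dim


if __name__ == "__main__":
    units = 2048  # hypothetical hidden-layer width, not taken from the paper
    print(per_unit_parameter_count(units))
    print(affine_transform_parameter_count(units))
```

In this sketch only the three vectors passed to adapted_hidden_layer would need to be stored for each microphone, while the network weights stay shared across all conditions.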

S. M. SINISCALCHI, V. M. Salerno (2017). Adaptation to New Microphones Using Artificial Neural Networks With Trainable Activation Functions. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 28(8), 1959-1965 [10.1109/TNNLS.2016.2550532].

Adaptation to New Microphones Using Artificial Neural Networks With Trainable Activation Functions

S. M. SINISCALCHI (first author; Investigation)

2017-08-01



Date	Aug 2017
Journal title	IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
DOI	https://dx.doi.org/10.1109/TNNLS.2016.2550532
Appears in types	1.01 Journal article

Files in this item:

File	Access	Type	Size	Format
07452664.pdf	Archive managers only	Publisher's version	532.14 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/649520
