Institutional Research Archive of the Università degli Studi di Palermo

Cruciata, G., Cruciata, L., Lo Presti, L., van Gemert, J., La Cascia, M. (2024). Learn & drop: fast learning of CNNs based on layer dropping. NEURAL COMPUTING & APPLICATIONS, 36, 10839-10851 [10.1007/s00521-024-09592-3].

Learn & drop: fast learning of CNNs based on layer dropping

Cruciata, Giorgio; Cruciata, Luca; Lo Presti, Liliana; van Gemert, Jan; La Cascia, Marco

2024-01-01

Abstract

This paper proposes a new method to improve the training efficiency of deep convolutional neural networks (CNNs). During training, the method computes scores that measure how much each layer's parameters change and whether the layer will continue learning. Based on these scores, the network is scaled down so that fewer parameters remain to be learned, which speeds up training. Unlike state-of-the-art methods, which either compress the network for use at inference time or limit the number of operations performed during back-propagation, the proposed method focuses on reducing the number of operations performed in forward propagation during training. The training strategy has been validated on two widely used architecture families, VGG and ResNet. Experiments on MNIST, CIFAR-10 and Imagenette show that the proposed method more than halves the training time of the models without significantly impacting accuracy. The FLOPs reduction in forward propagation during training ranges from 17.83% for VGG-11 to 83.74% for ResNet-152. The impact on accuracy depends on the depth of the model: the decrease is between 0.26% and 2.38% for VGGs and between 0.4% and 3.2% for ResNets. These results demonstrate the effectiveness of the proposed technique in speeding up the learning of CNNs. The technique will be especially useful in applications that require fine-tuning or online training of convolutional models, for instance because data arrive sequentially.
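The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea, under stated assumptions: each layer is scored by the relative L2 change of its parameters between two checkpoints, and layers whose score falls below a threshold are flagged as candidates for dropping. The function names, the score definition, and the threshold are hypothetical illustrations, not the authors' method.

```python
import math


def layer_change_scores(prev_params, curr_params, eps=1e-12):
    """Relative L2 change of each layer's parameters between two checkpoints.

    Assumption: this score is an illustrative stand-in; the paper's actual
    score is not specified in the abstract.
    Both arguments map a layer name to a flat list of parameter values.
    """
    scores = {}
    for name, prev in prev_params.items():
        curr = curr_params[name]
        diff = math.sqrt(sum((c - p) ** 2 for p, c in zip(prev, curr)))
        norm = math.sqrt(sum(p ** 2 for p in prev))
        scores[name] = diff / (norm + eps)  # eps avoids division by zero
    return scores


def drop_candidates(scores, threshold):
    """Names of layers whose parameters changed less than `threshold`.

    Assumption: a fixed threshold decides that a layer has stopped learning.
    """
    return [name for name, score in scores.items() if score < threshold]


# Toy checkpoints: "conv1" did not move, "conv2" changed substantially.
prev = {"conv1": [1.0, 0.0], "conv2": [3.0, 4.0]}
curr = {"conv1": [1.0, 0.0], "conv2": [3.0, 9.0]}

scores = layer_change_scores(prev, curr)
candidates = drop_candidates(scores, threshold=0.1)
print(scores, candidates)
```

In this toy case only "conv1" is flagged, because its parameters are identical in both checkpoints. How flagged layers are actually removed from the forward pass is not described in the abstract and is left out of the sketch.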

Date: 2024
Journal title: NEURAL COMPUTING & APPLICATIONS
DOI: https://dx.doi.org/10.1007/s00521-024-09592-3
Publisher URL (open access where possible): https://link.springer.com/article/10.1007/s00521-024-09592-3
Appears in types: 1.01 Journal article

Files in this item:

File	Size	Format
s00521-024-09592-3.pdf (open access; type: publisher's version)	1.7 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/631413
