Institutional Research Archive of the Università degli Studi di Palermo

Adiban M., Stefanov K., Siniscalchi S.M., Salvi G. (2022). Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation. In BMVC 2022 - 33rd British Machine Vision Conference Proceedings. British Machine Vision Association, BMVA.

Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation

Stefanov K.; Siniscalchi S. M. (Supervision); Salvi G.

2022-01-01

Abstract

We propose a multi-layer variational autoencoder method, which we call HR-VQVAE, that learns hierarchical discrete representations of the data. Using a novel objective function, each layer in HR-VQVAE learns a discrete representation of the residual from previous layers through a vector quantized encoder. Furthermore, the representations at each layer are hierarchically linked to those at previous layers. We evaluate our method on the tasks of image reconstruction and generation. Experimental results demonstrate that the discrete representations learned by HR-VQVAE enable the decoder to reconstruct high-quality images with less distortion than the baseline methods, namely VQVAE and VQVAE-2. HR-VQVAE can also generate high-quality and diverse images, outperforming state-of-the-art generative models and providing further verification of the efficiency of the learned representations. The hierarchical nature of HR-VQVAE i) reduces the decoding search time, making the method particularly suitable for high-load tasks, and ii) allows the codebook size to be increased without incurring the codebook collapse problem.
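
The idea that each layer quantizes the residual left by the previous layers can be illustrated with a small, generic residual vector quantization sketch. This is not the authors' implementation: the codebooks are random rather than learned, there is no encoder, decoder, or training objective, and the hierarchical linking between layers is omitted. The names nearest, residual_quantize, and make_codebooks are hypothetical helpers introduced only for this illustration.

```python
import random


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec (squared Euclidean distance)."""
    def sqdist(code):
        return sum((a - b) ** 2 for a, b in zip(code, vec))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))


def residual_quantize(vec, codebooks):
    """Quantize vec layer by layer; each layer encodes what earlier layers missed."""
    residual = list(vec)
    recon = [0.0] * len(vec)
    indices = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        code = codebook[idx]
        indices.append(idx)
        # The reconstruction is the running sum of the selected codewords.
        recon = [r + c for r, c in zip(recon, code)]
        # The next layer only sees the part that is still unexplained.
        residual = [r - c for r, c in zip(residual, code)]
    return indices, recon


def make_codebooks(dim, size, layers, seed=0):
    """Build random stand-in codebooks (assumption: real codebooks would be learned)."""
    rng = random.Random(seed)
    books = []
    for layer in range(layers):
        scale = 0.5 ** layer  # deeper layers model smaller residuals
        book = [[rng.uniform(-scale, scale) for _ in range(dim)]
                for _ in range(size - 1)]
        book.append([0.0] * dim)  # zero codeword: a layer can choose to add nothing
        books.append(book)
    return books


if __name__ == "__main__":
    books = make_codebooks(dim=4, size=8, layers=3)
    indices, recon = residual_quantize([0.3, -0.7, 0.1, 0.5], books)
    print(indices, recon)
```

Because every toy codebook contains the zero vector, adding a layer can never increase the squared reconstruction error in this sketch, which mirrors the intuition that deeper layers refine the output of shallower ones.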

Date: 2022

Citation: Adiban M., Stefanov K., Siniscalchi S.M., Salvi G. (2022). Hierarchical Residual Learning Based Vector Quantized Variational Autoencoder for Image Reconstruction and Generation. In BMVC 2022 - 33rd British Machine Vision Conference Proceedings. British Machine Vision Association, BMVA.

Appears in types: 2.07 Conference proceedings contribution published in a volume

Files in this record:

File	Access	Type	Size	Format
Hierarchical+Residual+Learning+Based+Vector+Quantized+Variational+Autoencoder+for+Image+Reconstruction+and+Generation-1-compresso.pdf	Open access	Pre-print	363.39 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/637533
