Single-object visual tracking aims at locating a target in each video frame by predicting the bounding box of the object. Recent approaches have adopted iterative procedures to gradually refine the bounding box and locate the target in the image. In such approaches, the deep model takes as input the image patch corresponding to the currently estimated target bounding box, and provides as output the probability associated with each of the possible bounding box refinements, generally defined as a discrete set of linear transformations of the bounding box center and size. At each iteration, only one transformation is applied, and supervised training of the model may introduce an inherent ambiguity by giving importance priority to some transformations over the others. This paper proposes a novel formulation of the problem of selecting the bounding box refinement. It introduces the concept of non-conflicting transformations and allows applying multiple refinements to the target bounding box at each iteration without introducing ambiguities during learning of the model parameters. Empirical results demonstrate that the proposed approach improves the iterative single refinement in terms of accuracy and precision of the tracking results.

Giorgio Cruciata, L.L.P. (2022). Iterative Multiple Bounding-Box Refinements for Visual Tracking. JOURNAL OF IMAGING, 8(3) [10.3390/jimaging8030061].

Iterative Multiple Bounding-Box Refinements for Visual Tracking

Giorgio Cruciata
Primo
;
Liliana Lo Presti
Secondo
;
Marco La Cascia
Ultimo
2022-03-01

Abstract

Single-object visual tracking aims at locating a target in each video frame by predicting the bounding box of the object. Recent approaches have adopted iterative procedures to gradually refine the bounding box and locate the target in the image. In such approaches, the deep model takes as input the image patch corresponding to the currently estimated target bounding box, and provides as output the probability associated with each of the possible bounding box refinements, generally defined as a discrete set of linear transformations of the bounding box center and size. At each iteration, only one transformation is applied, and supervised training of the model may introduce an inherent ambiguity by giving importance priority to some transformations over the others. This paper proposes a novel formulation of the problem of selecting the bounding box refinement. It introduces the concept of non-conflicting transformations and allows applying multiple refinements to the target bounding box at each iteration without introducing ambiguities during learning of the model parameters. Empirical results demonstrate that the proposed approach improves the iterative single refinement in terms of accuracy and precision of the tracking results.
mar-2022
Settore ING-INF/05 - Sistemi Di Elaborazione Delle Informazioni
Giorgio Cruciata, L.L.P. (2022). Iterative Multiple Bounding-Box Refinements for Visual Tracking. JOURNAL OF IMAGING, 8(3) [10.3390/jimaging8030061].
File in questo prodotto:
File Dimensione Formato  
jimaging-08-00061.pdf

accesso aperto

Descrizione: Articolo
Tipologia: Versione Editoriale
Dimensione 1.2 MB
Formato Adobe PDF
1.2 MB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/10447/539376
Citazioni
  • ???jsp.display-item.citation.pmc??? 0
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 1
social impact