Ardizzone E., La Cascia M., Mazzola G. (2015). Keyword Based Keyframe Extraction in Online Video Collections. In Proceedings of the International Conference on Pattern Recognition Applications and Methods (Volume 2), pp. 170-177. SCITEPRESS. DOI: 10.5220/0005190001700177.
Keyword Based Keyframe Extraction in Online Video Collections
ARDIZZONE, Edoardo; LA CASCIA, Marco; MAZZOLA, Giuseppe
2015-01-01
Abstract
Keyframe extraction methods aim to find the most significant frames in a video sequence, according to specific criteria. In this paper we propose a new method to search a video database for frames related to a given keyword, and to extract the best ones according to a proposed quality factor. We first exploit a speech-to-text algorithm to extract automatic captions from all the videos in a domain-specific database. Then we select only those sequences (clips) whose captions include the given keyword, thus discarding a large amount of information that is useless for our purposes. Each retrieved clip is then divided into shots using a video segmentation method based on SURF keypoints and descriptors. The caption sentence is projected onto the segmented clip, and we select the shot that contains the input keyword. The selected shot is further inspected to find good-quality, stable parts, and the frame that maximizes a quality metric is selected as the best and most significant frame. We compare the proposed algorithm with another keyframe extraction method based on local features, in terms of Significance and Quality.
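A minimal sketch of how the segmentation and frame-scoring steps might look in practice is given below, assuming OpenCV with the contrib modules (for SURF). Shot boundaries are declared where the fraction of SURF descriptor matches between consecutive frames drops below a threshold, and a sharpness score (variance of the Laplacian) stands in for the paper's quality factor; the match-ratio threshold, the Lowe ratio test, and the sharpness score are illustrative assumptions, not the authors' exact criteria, which are detailed in the paper itself.

```python
# Illustrative sketch of the pipeline's segmentation and frame-scoring steps.
# Assumptions (not taken from the paper): OpenCV contrib SURF, a Lowe ratio test,
# a fixed match-ratio threshold, and Laplacian variance as a stand-in quality score.
import cv2


def detect_shot_boundaries(video_path, match_ratio_threshold=0.25, hessian=400):
    """Return frame indices where a new shot is assumed to start."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian)  # needs opencv-contrib-python
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    cap = cv2.VideoCapture(video_path)

    boundaries, prev_desc, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, desc = surf.detectAndCompute(gray, None)

        if prev_desc is not None and desc is not None:
            # Match descriptors of consecutive frames; keep only confident matches.
            pairs = matcher.knnMatch(desc, prev_desc, k=2)
            good = [p for p in pairs if len(p) == 2 and p[0].distance < 0.75 * p[1].distance]
            ratio = len(good) / max(len(desc), 1)
            # Few surviving matches -> abrupt visual change -> assumed shot boundary.
            if ratio < match_ratio_threshold:
                boundaries.append(idx)
        prev_desc = desc
        idx += 1

    cap.release()
    return boundaries


def frame_quality(gray_frame):
    """Hypothetical quality score: variance of the Laplacian (higher = sharper)."""
    return cv2.Laplacian(gray_frame, cv2.CV_64F).var()
```

Within the shot whose caption span contains the keyword, the frame with the highest quality score would then be returned as the keyframe.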
| File | Description | Type | Size | Format | Access |
|---|---|---|---|---|---|
| 51900.pdf | Main article | Publisher's version | 731.26 kB | Adobe PDF | Archive managers only (request a copy) |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.