Self-supervised object detection and retrieval using unlabeled videos

Elad Amrani, Rami Ben-Ari, Inbar Shapira, Tal Hakim, Alex Bronstein

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Learning an object detection or retrieval system requires a large data set with manual annotations. Such data are expensive and time-consuming to create and therefore difficult to obtain on a large scale. In this work, we propose using the natural correlation in narrations and the visual presence of objects in video to learn an object detector and retriever without any manual labeling involved. We pose the problem as weakly supervised learning with noisy labels, and propose a novel object detection and retrieval paradigm under these constraints. We handle the background rejection by using contrastive samples and confront the high level of label noise with a new clustering score. Our evaluation is based on a set of ten objects with manual ground truth annotation in almost 5000 frames extracted from instructional videos from the web. We demonstrate superior results compared to state-of-the-art weakly- supervised approaches and report a strongly-labeled upper bound as well. While the focus of the paper is object detection and retrieval, the proposed methodology can be applied to a broader range of noisy weakly-supervised problems.

Original languageAmerican English
Title of host publicationProceedings - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020
PublisherIEEE Computer Society
Pages4100-4108
Number of pages9
ISBN (Electronic)9781728193601
DOIs
StatePublished - Jun 2020
Event2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020 - Virtual, Online, United States
Duration: 14 Jun 202019 Jun 2020

Publication series

NameIEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
Volume2020-June

Conference

Conference2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2020
Country/TerritoryUnited States
CityVirtual, Online
Period14/06/2019/06/20

All Science Journal Classification (ASJC) codes

  • Computer Vision and Pattern Recognition
  • Electrical and Electronic Engineering

Cite this