Robot Instance Segmentation with Few Annotations for Grasping

Moshe Kimhi, David Vainshtein, Chaim Baskin, Dotan Di Castro

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The ability of robots to manipulate objects relies heavily on their aptitude for visual perception. In domains characterized by cluttered scenes and high object variability such as traffic, navigation and object grasping, most methods call for vast labeled datasets, laboriously hand-annotated, with the aim of training capable models. Once deployed, the challenge of generalizing to unfamiliar objects implies that the model must evolve alongside its domain. To address this, we propose a novel framework that combines Semi-Supervised Learning (SSL) with Learning Through Interaction (LTI), allowing a model to learn by observing scene alterations and leverage visual consistency despite temporal gaps without requiring curated data of interaction sequences. As a result, our approach exploits partially annotated data through self-supervision and incorporates temporal context using pseudo-sequences generated from unlabeled still images. We validate our method on two common benchmarks, ARMBench mix-object-tote and OCID, where it achieves state-of-the-art performance. Notably, on ARMBench, we attain an AP50 of 86.37, almost a 20% improvement over existing work, and obtain remarkable results in scenarios with extremely low annotation, achieving an AP50 score of 84.89 with just 1% of annotated data, compared to the previous state of the art of 82, which targeted the fully annotated dataset.

Original language: English
Title of host publication: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025
Pages: 7939-7949
Number of pages: 11
ISBN (Electronic): 9798331510831
State: Published - 1 Jan 2025
Event: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Tucson, United States
Duration: 28 Feb 2025 to 4 Mar 2025

Publication series

Name: Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025

Conference

Conference: 2025 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025
Country/Territory: United States
City: Tucson
Period: 28/02/25 to 4/03/25

Keywords

  • computer vision
  • efficient learning
  • semi supervised learning

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
  • Human-Computer Interaction
  • Modelling and Simulation
  • Radiology, Nuclear Medicine and Imaging
