"Clustering by Composition" - Unsupervised Discovery of Image Categories

Alon Faktor, Michal Irani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We define a "good image cluster" as one in which images can be easily composed (like a puzzle) using pieces from each other, while being difficult to compose from images outside the cluster. The larger and more statistically significant the pieces are, the stronger the affinity between the images. This gives rise to unsupervised discovery of very challenging image categories. We further show how multiple images can be composed from each other simultaneously and efficiently using a collaborative randomized search algorithm. This collaborative process exploits the "wisdom of crowds of images" to obtain a sparse yet meaningful set of image affinities, in time that is almost linear in the size of the image collection. "Clustering-by-Composition" yields state-of-the-art results on current benchmark data sets. It further yields promising results on new challenging data sets, such as data sets with very few images (where a 'cluster model' cannot be 'learned' by current methods), and a subset of the PASCAL VOC data set (with huge variability in scale and appearance).
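The affinity idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes each image is already reduced to a set of hypothetical "pieces" (tuples of descriptor tokens), scores a shared piece by its size times the negative log of how often it occurs in the collection (a stand-in for statistical significance), compares all image pairs exhaustively rather than using the collaborative randomized search, and groups images by connected components over strong affinities as a simple substitute for a real clustering step. All function names and the toy data are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def piece_significance(piece, piece_counts, n_images):
    """Toy score: larger and rarer pieces count more (assumed scoring rule)."""
    fraction = piece_counts[piece] / n_images
    return len(piece) * -math.log(fraction)


def composition_affinities(images):
    """Return a sparse dict {(a, b): score} over image pairs that share pieces.

    `images` maps an image name to a set of pieces; each piece is a tuple of
    hypothetical descriptor tokens.
    """
    n_images = len(images)
    piece_counts = Counter(p for pieces in images.values() for p in pieces)
    affinities = {}
    # Exhaustive pairwise comparison: fine for a toy, quadratic in general.
    for a, b in combinations(sorted(images), 2):
        shared = images[a] & images[b]
        score = sum(piece_significance(p, piece_counts, n_images) for p in shared)
        if score > 0:
            affinities[(a, b)] = score
    return affinities


def group_by_affinity(images, affinities, threshold):
    """Connected components over pairs whose affinity meets the threshold."""
    parent = {name: name for name in images}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in affinities.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for name in images:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(sorted(members) for members in clusters.values())


# Toy collection: large object pieces are shared within a category,
# small background pieces are shared across categories.
toy_images = {
    "cat1": {("ear", "eye", "whisker"), ("sky",)},
    "cat2": {("ear", "eye", "whisker"), ("grass",)},
    "car1": {("wheel", "door"), ("sky",)},
    "car2": {("wheel", "door"), ("grass",)},
}

toy_affinities = composition_affinities(toy_images)
toy_clusters = group_by_affinity(toy_images, toy_affinities, threshold=1.0)
print(toy_clusters)
```

In this toy collection the large shared object pieces give the same-category pairs higher affinity than the small shared background pieces give the cross-category pairs, so a threshold between the two separates the categories.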
Original language: English
Title of host publication: Computer Vision – ECCV 2012
Subtitle of host publication: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012. Proceedings, Part VII
Editors: Andrew Fitzgibbon
Publisher: Springer Verlag
Pages: 474-487
Number of pages: 15
Volume: 36
Edition: 6
ISBN (Print): 978-3-642-33785-7
DOIs
State: Published - 2012
Event: 12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy
Duration: 7 Oct 2012 → 13 Oct 2012

Publication series

Name: Lecture Notes in Computer Science
Volume: 7578
ISSN (Print): 0302-9743

Conference

Conference: 12th European Conference on Computer Vision, ECCV 2012
Country/Territory: Italy
City: Florence
Period: 7/10/12 → 13/10/12
