Skip to main navigation Skip to search Skip to main content

Do you see what I mean? Visual resolution of linguistic ambiguities

Yevgeni Berzak, Andrei Barbu, Daniel Harari, Boris Katz, Shimon Ullman

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, representing a wide range of syntactic, semantic and discourse ambiguities, coupled with videos that visualize the different interpretations for each sentence. We address this task by extending a vision model which determines if a sentence is depicted by a video. We demonstrate how such a model can be adjusted to recognize different interpretations of the same underlying sentence, allowing to disambiguate sentences in a unified fashion across the different ambiguity types.

Original languageEnglish
Title of host publicationConference Proceedings - EMNLP 2015
Subtitle of host publicationConference on Empirical Methods in Natural Language Processing
Pages1477-1487
Number of pages11
ISBN (Electronic)9781941643327
DOIs
StatePublished - Sep 2015
EventConference on Empirical Methods in Natural Language Processing, EMNLP 2015 - Lisbon, Portugal
Duration: 17 Sep 201521 Sep 2015

Publication series

NameConference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing

Conference

ConferenceConference on Empirical Methods in Natural Language Processing, EMNLP 2015
Country/TerritoryPortugal
CityLisbon
Period17/09/1521/09/15

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Fingerprint

Dive into the research topics of 'Do you see what I mean? Visual resolution of linguistic ambiguities'. Together they form a unique fingerprint.

Cite this