Sentence embedding evaluation using pyramid annotation

Tal Baumel, Raphael Cohen, Michael Elhadad

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Word embedding vectors are used as input for a variety of tasks. Choosing the right model and features for producing such vectors is not a trivial task, and different embedding methods can greatly affect results. In this paper we repurpose the "Pyramid Method" annotations used for evaluating automatic summarization to create a benchmark for comparing embedding models when identifying paraphrases of text snippets containing a single clause. We present a method of converting pyramid annotation files into two distinct sentence embedding tests. We show that our method can produce a good amount of testing data, analyze the quality of the testing data, run the tests on several leading embedding methods, and finally explain the downstream uses of our task and its significance.
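
The abstract does not spell out the two tests, so the following is only an illustrative sketch of the general idea, not the authors' procedure. It assumes that a pyramid annotation groups short text snippets under shared content-unit labels, so that snippets under the same label can be treated as paraphrases. The toy data, the bag-of-words embedder, and the function names (`labeled_pairs`, `make_bow_embedder`, `nearest_neighbor_accuracy`) are all hypothetical; any real embedding model could be substituted for the placeholder.

```python
import math
from itertools import combinations

# Hypothetical toy data (not from the paper): each content-unit label maps to
# snippets that annotators are assumed to have judged equivalent in meaning.
SCUS = {
    "scu_1": ["the plane crashed near the airport",
              "a plane crashed close to the airport"],
    "scu_2": ["rescue teams arrived quickly",
              "rescue teams reached the site quickly"],
}


def flatten(scus):
    """Return a list of (snippet, label) tuples."""
    return [(snippet, label)
            for label, snippets in scus.items()
            for snippet in snippets]


def labeled_pairs(scus):
    """Build every snippet pair, flagged True when both share a label."""
    return [(a, b, la == lb)
            for (a, la), (b, lb) in combinations(flatten(scus), 2)]


def make_bow_embedder(texts):
    """Placeholder bag-of-words embedder; swap in a real model here."""
    vocab = sorted({w for t in texts for w in t.lower().split()})

    def embed(text):
        words = text.lower().split()
        return [float(words.count(w)) for w in vocab]

    return embed


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbor_accuracy(scus, embed):
    """Fraction of snippets whose most similar other snippet shares its label."""
    items = flatten(scus)
    vecs = [embed(snippet) for snippet, _ in items]
    hits = 0
    for i, (_, label) in enumerate(items):
        others = [j for j in range(len(items)) if j != i]
        best = max(others, key=lambda j: cosine(vecs[i], vecs[j]))
        hits += items[best][1] == label
    return hits / len(items)


pairs = labeled_pairs(SCUS)
embed = make_bow_embedder([snippet for snippet, _ in flatten(SCUS)])
accuracy = nearest_neighbor_accuracy(SCUS, embed)
print(len(pairs), "pairs, of which",
      sum(same for _, _, same in pairs), "are paraphrase pairs")
print("nearest-neighbor accuracy:", accuracy)
```

Comparing embedding models under this kind of setup would amount to passing a different `embed` function and comparing the resulting scores on the same annotated groups.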

Original language: American English
Title of host publication: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Publisher: Association for Computational Linguistics (ACL)
Pages: 145-149
Number of pages: 5
ISBN (Electronic): 9781945626142
State: Published - 1 Jan 2016
Event: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 7 Aug 2016 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference

Conference: 1st Workshop on Evaluating Vector-Space Representations for NLP, RepEval 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 7/08/16 → …

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
