Exploring long-term temporal trends in the use of Multiword Expressions

Tal Daniel, Mark Last

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Distinguishing outdated expressions from current ones is not a trivial task for foreign language learners, and doing so could also help lexicographers as they examine expressions. Assuming that the usage of an expression over time can be represented by a time series of its periodic frequencies in a large lexicographic corpus, we test the hypothesis that an old-new relationship exists between the time series of some synonymous expressions, which hints that a later expression has replaced an earlier one. We also test the hypothesis that Multiword Expressions (MWEs) can be characterized by sparsity and frequency thresholds. Using a dataset of 1 million English books, we select, from a ready-made list of known MWEs, those with the most positive or the most negative usage trends. We identify synonyms of those expressions in a historical thesaurus and visualize the temporal relationships between the resulting expression pairs. Our empirical results indicate that old-new usage relationships do exist between some synonymous expressions, and that new candidate expressions, not found in dictionaries, can be found by analyzing usage trends.
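As a rough illustration of the idea of ranking expressions by usage trend, the sketch below scores each frequency time series with a Kendall-tau-style rank correlation against time and picks the most rising and most falling expressions. The abstract does not name the trend statistic the authors used, so the choice of statistic, the function names (`kendall_trend`, `rank_by_trend`), and the sample frequencies are all assumptions made for illustration, not the paper's method or data.

```python
from itertools import combinations


def kendall_trend(freqs):
    """Kendall-tau-style trend score in [-1, 1] for a series ordered by time.

    +1 means the frequency rises in every pair of periods, -1 means it
    always falls. Tied pairs count as neither (assumed simplification).
    """
    n = len(freqs)
    if n < 2:
        raise ValueError("need at least two periods")
    concordant = discordant = 0
    # combinations() preserves order, so `earlier` precedes `later` in time.
    for earlier, later in combinations(freqs, 2):
        if later > earlier:
            concordant += 1
        elif later < earlier:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


def rank_by_trend(series, top_k=1):
    """Return (most rising, most falling) expression names, top_k each."""
    scores = {expr: kendall_trend(freqs) for expr, freqs in series.items()}
    ordered = sorted(scores, key=scores.get)  # ascending by trend score
    return ordered[-top_k:][::-1], ordered[:top_k]


# Hypothetical relative frequencies per period; not taken from the paper.
series = {
    "expr_rising": [0.1, 0.2, 0.4, 0.7, 0.9],
    "expr_falling": [0.9, 0.6, 0.5, 0.2, 0.1],
    "expr_flat": [0.5, 0.4, 0.5, 0.6, 0.5],
}
rising, falling = rank_by_trend(series)
```

Under these assumptions, a pair of synonymous expressions in which one appears in `falling` and the other in `rising` would be a candidate for the old-new relationship the abstract describes; confirming it would still require inspecting the two series together.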

Original language: American English
Title of host publication: Proceedings of the 12th Workshop on Multiword Expressions, MWE 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Editors: Valia Kordoni, Kostadin Cholakov, Markus Egg, Stella Markantonatou, Preslav Nakov
Publisher: Association for Computational Linguistics (ACL)
Pages: 11-20
Number of pages: 10
ISBN (Electronic): 9781945626067
State: Published - 1 Jan 2016
Event: 12th Workshop on Multiword Expressions, MWE 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 11 Aug 2016 → …

Publication series

Name: Proceedings of the Annual Meeting of the Association for Computational Linguistics

Conference

Conference: 12th Workshop on Multiword Expressions, MWE 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 11/08/16 → …

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
