TY - GEN
T1 - Exploring long-term temporal trends in the use of Multiword Expressions
AU - Daniel, Tal
AU - Last, Mark
N1 - Publisher Copyright: © 2016 Association for Computational Linguistics
PY - 2016/1/1
Y1 - 2016/1/1
N2 - Differentiating between outdated expressions and current expressions is not a trivial task for foreign language learners, and could be beneficial for lexicographers, as they examine expressions. Assuming that the usage of expressions over time can be represented by a time-series of their periodic frequencies over a large lexicographic corpus, we test the hypothesis that there exists an old-new relationship between the time-series of some synonymous expressions, a hint that a later expression has replaced an earlier one. Another hypothesis we test is that Multiword Expressions (MWEs) can be characterized by sparsity & frequency thresholds. Using a dataset of 1 million English books, we choose MWEs having the most positive or the most negative usage trends from a ready-made list of known MWEs. We identify synonyms of those expressions in a historical thesaurus and visualize the temporal relationships between the resulting expression pairs. Our empirical results indicate that old-new usage relationships do exist between some synonymous expressions, and that new candidate expressions, not found in dictionaries, can be found by analyzing usage trends.
UR - http://www.scopus.com/inward/record.url?scp=85064437943&partnerID=8YFLogxK
U2 - https://doi.org/10.18653/v1/w16-1802
DO - https://doi.org/10.18653/v1/w16-1802
M3 - Conference contribution
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 11
EP - 20
BT - Proceedings of the 12th Workshop on Multiword Expressions, MWE 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
A2 - Kordoni, Valia
A2 - Cholakov, Kostadin
A2 - Egg, Markus
A2 - Markantonatou, Stella
A2 - Nakov, Preslav
PB - Association for Computational Linguistics (ACL)
T2 - 12th Workshop on Multiword Expressions, MWE 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Y2 - 11 August 2016
ER -