Multilingual Detection of Personal Employment Status on Twitter

Manuel Tonneau, Dhaval Adjodah, João Palotti, Nir Grinberg, Samuel Fraiberger

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Detecting disclosures of individuals' employment status on social media can provide valuable information to match job seekers with suitable vacancies, offer social protection, or measure labor market flows. However, identifying such personal disclosures is a challenging task due to their rarity in a sea of social media content and the variety of linguistic forms used to describe them. Here, we examine three Active Learning (AL) strategies in real-world settings of extreme class imbalance, and identify five types of disclosures about individuals' employment status (e.g. job loss) in three languages using BERT-based classification models. Our findings show that, even under extreme imbalance settings, a small number of AL iterations is sufficient to obtain large and significant gains in precision, recall, and diversity of results compared to a supervised baseline with the same number of labels. We also find that no AL strategy consistently outperforms the rest. Qualitative analysis suggests that AL helps focus the attention mechanism of BERT on core terms and adjust the boundaries of semantic expansion, highlighting the importance of interpretable models to provide greater control and visibility into this dynamic learning process.

Original languageAmerican English
Title of host publicationACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers)
EditorsSmaranda Muresan, Preslav Nakov, Aline Villavicencio
PublisherAssociation for Computational Linguistics (ACL)
Pages6564-6587
Number of pages24
ISBN (Electronic)9781955917216
StatePublished - 1 Jan 2022
Event60th Annual Meeting of the Association for Computational Linguistics, ACL 2022 - Dublin, Ireland
Duration: 22 May 202227 May 2022
https://aclanthology.org/2022.acl-long.0/

Publication series

NameProceedings of the Annual Meeting of the Association for Computational Linguistics
Volume1

Conference

Conference60th Annual Meeting of the Association for Computational Linguistics, ACL 2022
Country/TerritoryIreland
CityDublin
Period22/05/2227/05/22
Internet address

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'Multilingual Detection of Personal Employment Status on Twitter'. Together they form a unique fingerprint.

Cite this