Universal dependencies for learner English

  • Yevgeni Berzak
  • Lucia Lam
  • Jessica Kenney
  • Keiko Sophie Mori
  • Carolyn Spadine
  • Sebastian Garza
  • Jing Xian Wang
  • Boris Katz

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic analyses are provided for both the original and error-corrected versions of each sentence. Further on, we delineate ESL annotation guidelines that allow for consistent syntactic treatment of ungrammatical English. Finally, we benchmark POS tagging and dependency parsing performance on the TLE dataset and measure the effect of grammatical errors on parsing accuracy. We envision the treebank to support a wide range of linguistic and computational research on second language acquisition as well as automatic processing of ungrammatical language.

Original language: English
Title of host publication: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
Pages: 737-746
Number of pages: 10
ISBN (Electronic): 9781510827585
State: Published - 2016
Externally published: Yes
Event: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Berlin, Germany
Duration: 7 Aug 2016 to 12 Aug 2016

Publication series

Name: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016 - Long Papers
Volume: 2

Conference

Conference: 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
Country/Territory: Germany
City: Berlin
Period: 7/08/16 to 12/08/16

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
