Domain adaptation of a dependency parser with a class-class selectional preference model

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

When porting parsers to a new domain, many of the errors are related to wrong attachment of out-of-vocabulary words. Since there is no annotated data available to learn the attachment preferences of the target domain words, we attack this problem using a model of selectional preferences based on domain-specific word classes. Our method uses Latent Dirichlet Allocation (LDA) to learn a domain-specific selectional preference model in the target domain from un-annotated data. The model provides features that capture the affinities among pairs of words in the domain. To incorporate these new features into the parsing model, we adopt the co-training approach and retrain the parser with the selectional preference features. We apply this method to adapt Easy-First, a fast non-directional parser trained on the WSJ, to the biomedical domain (Genia Treebank). The selectional preference features reduce error by 4.5% over the co-training baseline.
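As a rough illustration of the class-class affinity idea, the sketch below induces word classes with LDA over pseudo-documents that pair each head word with the bag of dependents it governs in automatically parsed, un-annotated domain text, then scores a candidate (head, dependent) attachment by marginalizing over classes. This is a minimal sketch, not the authors' implementation: the toy `pseudo_docs` data, the `affinity` function, and the use of scikit-learn's LatentDirichletAllocation are all assumptions made for illustration.

```python
# Minimal sketch of a class-based selectional preference score (illustrative only).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical input: each head word maps to the dependents observed under it
# in auto-parsed, un-annotated target-domain text (one pseudo-document per head).
pseudo_docs = {
    "inhibits": "expression activation growth kinase",
    "binds": "receptor promoter domain site",
}

heads = list(pseudo_docs.keys())
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(pseudo_docs[h] for h in heads)

# Induce word classes: theta[h] ~ P(class | head), phi[c] ~ P(dependent | class).
lda = LatentDirichletAllocation(n_components=2, random_state=0)
theta = lda.fit_transform(X)
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

word_idx = {w: i for i, w in enumerate(vectorizer.get_feature_names_out())}

def affinity(head, dependent):
    """Class-class affinity of attaching `dependent` under `head`:
    sum over classes c of P(c | head) * P(dependent | c)."""
    h = heads.index(head)
    d = word_idx[dependent]
    return float(theta[h] @ phi[:, d])

# A score like this can be bucketed and fed to the parser as an attachment feature.
print(affinity("inhibits", "kinase"))
print(affinity("binds", "kinase"))
```

Marginalizing over classes rather than using raw word-word co-occurrence counts is what lets the model generalize to out-of-vocabulary dependents, since unseen words still receive class memberships from the LDA posterior.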
Original language: American English
Title of host publication: Proceedings of ACL 2012 Student Research Workshop
Editors: Jackie C. K. Cheung, Jun Hatori, Carlos Henriquez, Ann Irvine
Place of publication: Jeju Island, Korea
Publisher: Association for Computational Linguistics (ACL)
Pages: 43-48
Number of pages: 6
Edition: 1st
State: Published - Jul 2012
