TY - GEN
T1 - Improved parsing and POS tagging using inter-sentence consistency constraints
AU - Rush, Alexander M.
AU - Reichart, Roi
AU - Collins, Michael
AU - Globerson, Amir
PY - 2012
Y1 - 2012
N2 - State-of-the-art statistical parsers and POS taggers perform very well when trained with large amounts of in-domain data. When training data is out-of-domain or limited, accuracy degrades. In this paper, we aim to compensate for the lack of available training data by exploiting similarities between test set sentences. We show how to augment sentence-level models for parsing and POS tagging with inter-sentence consistency constraints. To deal with the resulting global objective, we present an efficient and exact dual decomposition decoding algorithm. In experiments, we add consistency constraints to the MST parser and the Stanford part-of-speech tagger and demonstrate significant error reduction in the domain adaptation and the lightly supervised settings across five languages.
UR - http://www.scopus.com/inward/record.url?scp=84875194264&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781937284435
T3 - EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference
SP - 1434
EP - 1444
BT - EMNLP-CoNLL 2012 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Proceedings of the Conference
T2 - 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, EMNLP-CoNLL 2012
Y2 - 12 July 2012 through 14 July 2012
ER -