Abstract
The standard training regime for transition-based dependency parsers makes use of an oracle, which predicts an optimal transition sequence for a sentence and its gold tree. We present an improved oracle for the arc-eager transition system, which provides a set of optimal transitions for every valid parser configuration, including configurations from which the gold tree is not reachable. In such cases, the oracle provides transitions that will lead to the best reachable tree from the given configuration. The oracle is efficient to implement and provably correct. We use the oracle to train a deterministic left-to-right dependency parser that is less sensitive to error propagation, using an online training procedure that also explores parser configurations resulting from non-optimal sequences of transitions. This new parser outperforms greedy parsers trained using conventional oracles on a range of data sets, with an average improvement of over 1.2 LAS points and up to almost 3 LAS points on some data sets.
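The idea of an oracle that is defined for every configuration can be illustrated with a minimal Python sketch. It is not taken from the paper's text: it is an unlabeled restatement of the usual cost-based view, in which the cost of a transition is the number of gold arcs that were still reachable before it and are not reachable after it, and the optimal transitions are those with cost zero. The function names (`transition_costs`, `zero_cost_transitions`), the data layout (lists of token indices, dictionaries mapping dependents to heads), the root-at-index-0 convention, and the preconditions on LEFT-ARC and REDUCE are all assumptions made for illustration. Handling of terminal configurations is omitted.

```python
from typing import Dict, List, Set

SHIFT, LEFT_ARC, RIGHT_ARC, REDUCE = "SHIFT", "LEFT-ARC", "RIGHT-ARC", "REDUCE"
ROOT = 0  # assumed convention: artificial root token at index 0


def transition_costs(stack: List[int], buffer: List[int],
                     heads: Dict[int, int], gold: Dict[int, int]) -> Dict[str, int]:
    """Cost of each legal arc-eager transition in one configuration.

    stack  -- token indices, top of stack is stack[-1]
    buffer -- token indices, next input token is buffer[0]
    heads  -- heads assigned so far (dependent -> head)
    gold   -- gold tree (dependent -> head); ROOT has no entry
    """
    def lost_deps(head: int, candidates: List[int]) -> int:
        # Gold dependents of `head` among `candidates` that have no head yet;
        # tokens that already have a (wrong) head were lost earlier.
        return sum(1 for k in candidates if gold.get(k) == head and k not in heads)

    costs: Dict[str, int] = {}
    s = stack[-1] if stack else None
    if buffer:
        b, rest = buffer[0], buffer[1:]
        # SHIFT: b can no longer be linked to anything on the stack.
        costs[SHIFT] = int(gold.get(b) in stack) + lost_deps(b, stack)
        if s is not None:
            # RIGHT-ARC (s -> b): b loses any other head and its stack dependents.
            below = stack[:-1]
            head_lost = gold.get(b) in below or gold.get(b) in rest
            costs[RIGHT_ARC] = int(head_lost) + lost_deps(b, stack)
            if s != ROOT and s not in heads:
                # LEFT-ARC (b -> s): s is popped, losing a head later in the
                # buffer and any dependents still in the buffer.
                costs[LEFT_ARC] = int(gold.get(s) in rest) + lost_deps(s, buffer)
    if s is not None and s in heads:
        # REDUCE: s is popped, losing its dependents still in the buffer.
        costs[REDUCE] = lost_deps(s, buffer)
    return costs


def zero_cost_transitions(stack: List[int], buffer: List[int],
                          heads: Dict[int, int], gold: Dict[int, int]) -> Set[str]:
    """Transitions that lose no additional gold arcs."""
    costs = transition_costs(stack, buffer, heads, gold)
    return {t for t, c in costs.items() if c == 0}


# Toy gold tree for a three-token sentence: 2 -> 1, ROOT -> 2, 2 -> 3.
gold = {1: 2, 2: 0, 3: 2}

# On the gold path: token 1 was shifted and has no head yet.
on_path = zero_cost_transitions([0, 1], [2, 3], {}, gold)

# Off the gold path: token 1 was wrongly attached to ROOT, so the gold arc
# 2 -> 1 is no longer reachable, yet the oracle still returns a transition.
off_path = zero_cost_transitions([0, 1], [2, 3], {1: 0}, gold)
```

In this toy example `on_path` contains only LEFT-ARC, while `off_path` contains only REDUCE: after the erroneous attachment, reducing token 1 loses no further gold arcs, whereas SHIFT and RIGHT-ARC would each lose the arc from the root to token 2. A training loop that explores non-optimal transitions, as the abstract describes, would query such a function at whatever configuration the parser has actually reached.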
Original language | English |
---|---|
Pages | 959-976 |
Number of pages | 18 |
State | Published - 1 Jan 2012 |
Externally published | Yes |
Event | 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India. Duration: 8 Dec 2012 → 15 Dec 2012 |
Conference
Conference | 24th International Conference on Computational Linguistics, COLING 2012 |
---|---|
Country/Territory | India |
City | Mumbai |
Period | 8 Dec 2012 → 15 Dec 2012 |
Keywords
- Dependency parsing
- Oracle
- Transition system
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Computational Theory and Mathematics
- Linguistics and Language