We consider a dependency-parsed text corpus as an instance of a labeled directed graph, where nodes represent words and weighted directed edges represent the syntactic relations between them. We show that graph walks, combined with existing techniques of supervised learning that model local and global information about the graph walk process, can be used to derive a task-specific word similarity measure in this graph. We also propose and evaluate a new learning method in this framework, a path-constrained graph walk variant, in which the walk process is guided by high-level knowledge about meaningful edge sequences (paths) in the graph. Empirical evaluation on the tasks of named entity coordinate term extraction and general word synonym extraction show that this framework is preferable to, or competitive with, vector-based models when learning is applied, and using small to moderate size text corpora.
All Science Journal Classification (ASJC) codes
- Language and Linguistics
- Linguistics and Language
- Artificial Intelligence