TY - GEN
T1 - Automatic gloss finding for a knowledge base using ontological constraints
AU - Dalvi, Bhavana
AU - Minkov, Einat
AU - Talukdar, Partha P.
AU - Cohen, William W.
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/2/2
Y1 - 2015/2/2
N2 - While there has been much research on automatically constructing structured Knowledge Bases (KBs), most of it has focused on generating facts to populate a KB. However, a useful KB must go beyond facts. For example, glosses (short natural language definitions) have been found to be very useful in tasks such as Word Sense Disambiguation. However, the important problem of Automatic Gloss Finding, i.e., assigning glosses to entities in an initially gloss-free KB, is relatively unexplored. We address that gap in this paper. In particular, we propose GLOFIN, a hierarchical semi-supervised learning algorithm for this problem which makes effective use of limited amounts of supervision and available ontological constraints. To the best of our knowledge, GLOFIN is the first system for this task. Through extensive experiments on real-world datasets, we demonstrate GLOFIN's effectiveness. It is encouraging to see that GLOFIN outperforms other state-of-the-art SSL algorithms, especially in low supervision settings. We also demonstrate GLOFIN's robustness to noise through experiments on a wide variety of KBs, ranging from user contributed (e.g., Freebase) to automatically constructed (e.g., NELL). To facilitate further research in this area, we have made the datasets and code used in this paper publicly available.
KW - Gloss finding
KW - Hierarchical learning
KW - Web mining
UR - http://www.scopus.com/inward/record.url?scp=84928726005&partnerID=8YFLogxK
U2 - 10.1145/2684822.2685288
DO - 10.1145/2684822.2685288
M3 - Conference contribution
T3 - WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining
SP - 369
EP - 378
BT - WSDM 2015 - Proceedings of the 8th ACM International Conference on Web Search and Data Mining
PB - Association for Computing Machinery
T2 - 8th ACM International Conference on Web Search and Data Mining, WSDM 2015
Y2 - 31 January 2015 through 6 February 2015
ER -