TY - CPAPER
T1 - Weakly Supervised Text-to-SQL Parsing through Question Decomposition
AU - Wolfson, Tomer
AU - Deutch, Daniel
AU - Berant, Jonathan
N1 - Publisher Copyright: © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.
PY - 2022/1/1
Y1 - 2022/1/1
AB - Text-to-SQL parsers are crucial in enabling non-experts to effortlessly query relational data. Training such parsers, by contrast, generally requires expertise in annotating natural language (NL) utterances with corresponding SQL queries. In this work, we propose a weak supervision approach for training text-to-SQL parsers. We take advantage of the recently proposed question meaning representation called QDMR, an intermediate between NL and formal query languages. Given questions, their QDMR structures (annotated by non-experts or automatically predicted), and the answers, we are able to automatically synthesize SQL queries that are used to train text-to-SQL models. We test our approach by experimenting on five benchmark datasets. Our results show that the weakly supervised models perform competitively with those trained on annotated NL-SQL data. Overall, we effectively train text-to-SQL parsers, while using zero SQL annotations.
UR - http://www.scopus.com/inward/record.url?scp=85137325522&partnerID=8YFLogxK
M3 - Conference contribution
T3 - Findings of the Association for Computational Linguistics: NAACL 2022
SP - 2528
EP - 2542
BT - Findings of the Association for Computational Linguistics: NAACL 2022
PB - Association for Computational Linguistics (ACL)
T2 - 2022 Findings of the Association for Computational Linguistics: NAACL 2022
Y2 - 10 July 2022 through 15 July 2022
ER -