TY - JOUR
T1 - Large Language Models and the Argument from the Poverty of the Stimulus
AU - Lan, Nur
AU - Chemla, Emmanuel
AU - Katzir, Roni
PY - 2024
Y1 - 2024
N2 - How much of our linguistic knowledge is innate? According to much of theoretical linguistics, a fair amount. One of the best-known (and most contested) kinds of evidence for a large innate endowment is the so-called argument from the poverty of the stimulus (APS). In a nutshell, an APS obtains when human learners systematically make inductive leaps that are not warranted by the linguistic evidence. A weakness of the APS has been that it is very hard to assess what is warranted by the linguistic evidence. Current Artificial Neural Networks appear to offer a handle on this challenge, and a growing literature over the past few years has started to explore the potential implications of such models to questions of innateness. We focus here on Wilcox et al. (2023), who use several different networks to examine the available evidence as it pertains to wh-movement, including island constraints. They conclude that the (presumably linguistically-neutral) networks acquire an adequate knowledge of wh-movement, thus undermining an APS in this domain. We examine the evidence further, looking in particular at parasitic gaps and across-the-board movement, and argue that current networks do not, in fact, succeed in acquiring or even adequately approximating wh-movement from training corpora that roughly correspond in size to the linguistic input that children receive. We also show that the performance of one of the models improves considerably when the training data are artificially enriched with instances of parasitic gaps and across-the-board movement. This finding suggests, albeit tentatively, that the failure of the networks when trained on natural, unenriched corpora is due to the insufficient richness of the linguistic input, thus supporting the APS.
AB - How much of our linguistic knowledge is innate? According to much of theoretical linguistics, a fair amount. One of the best-known (and most contested) kinds of evidence for a large innate endowment is the so-called argument from the poverty of the stimulus (APS). In a nutshell, an APS obtains when human learners systematically make inductive leaps that are not warranted by the linguistic evidence. A weakness of the APS has been that it is very hard to assess what is warranted by the linguistic evidence. Current Artificial Neural Networks appear to offer a handle on this challenge, and a growing literature over the past few years has started to explore the potential implications of such models to questions of innateness. We focus here on Wilcox et al. (2023), who use several different networks to examine the available evidence as it pertains to wh-movement, including island constraints. They conclude that the (presumably linguistically-neutral) networks acquire an adequate knowledge of wh-movement, thus undermining an APS in this domain. We examine the evidence further, looking in particular at parasitic gaps and across-the-board movement, and argue that current networks do not, in fact, succeed in acquiring or even adequately approximating wh-movement from training corpora that roughly correspond in size to the linguistic input that children receive. We also show that the performance of one of the models improves considerably when the training data are artificially enriched with instances of parasitic gaps and across-the-board movement. This finding suggests, albeit tentatively, that the failure of the networks when trained on natural, unenriched corpora is due to the insufficient richness of the linguistic input, thus supporting the APS.
U2 - https://doi.org/10.1162/ling_a_00533
DO - https://doi.org/10.1162/ling_a_00533
M3 - مقالة
SN - 0024-3892
JO - Linguistic Inquiry
JF - Linguistic Inquiry
ER -