Please note that this talk will be on a Thursday.
Talk title: Syntactic productivity in people and large language models
Abstract: Large language models (LLMs), computational neural network models of language, have proven able to capture highly sophisticated aspects of human language, including in domains traditionally thought to rely on innate knowledge. Yet LLMs are not built with language-specific biases, raising the question of how they succeed on such tasks. I will show that studying LLMs from the perspective of acquisition can clarify assumptions underlying arguments about human language acquisition, raise new questions about human linguistic abilities, and highlight a distinction between knowledge and the use of knowledge in language.
In three experiments on the productivity of argument structure alternations (such as active to passive), I find that LLMs encode knowledge of verbs' selectional preferences (experiment 0) and use that knowledge to generalize novel words in semantically rich contexts (experiment 1). Such knowledge is well attested in people.
However, the prior literature has not addressed in detail whether people or LLMs generalize in the absence of semantic context, which experiment 2 examines. We find that people, but not LLMs, spontaneously generalize based on structural properties even in the absence of identifiable meaning, even though the LLMs we tested have a bias toward learning structural generalizations. Studying LLMs thus has the potential to address questions about what might not be innate, to raise new questions about human linguistic capabilities, and to underscore the distinction between knowledge and the use of knowledge. It also raises new questions about the role of child-directed speech in acquisition, the relationship between syntactic productivity and meaning, and the value of examining similarities between human and LLM linguistic performance in fine-grained detail.