Computer Science Speakers Calendar

Aman Madaan "Language models of code are few-shot reasoners"

Event Type
Seminar/Symposium
Sponsor
Illinois Computer Science
Virtual
Date
Dec 9, 2022   11:00 am  
Speaker
Aman Madaan, PhD Candidate, Carnegie Mellon University
Contact
Candice Steidinger
E-Mail
steidin2@illinois.edu

Abstract: 

Large language models of code (LLMCs) have shown remarkable performance on tasks such as code completion and competitive programming. This talk will discuss two recent works that leverage the capabilities of LLMCs to solve natural language reasoning tasks. In the first work, CoCoGen, we show that by framing structured commonsense reasoning tasks as code generation tasks, LLMCs can outperform strong natural language models in a few-shot setting.
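To make the idea of framing a structured reasoning task as code generation concrete, here is a minimal sketch in Python. It serializes a small plan graph as Python source text that could serve as a few-shot example for a code model. The function name `graph_to_code`, the `Plan` and `Step` identifiers, and the `then` method are hypothetical choices made for illustration; this is not the prompt format used in CoCoGen, and no model is called.

```python
def graph_to_code(goal, steps, edges):
    """Serialize a small plan graph as Python source text.

    The output format is a hypothetical illustration, not the format
    used in the work described above.
    """
    lines = [
        "class Plan:",
        f"    goal = {goal!r}",
        "",
        "    def __init__(self):",
    ]
    # One attribute per node of the graph.
    for name, text in steps.items():
        lines.append(f"        self.{name} = Step({text!r})")
    # One method call per directed edge (src must happen before dst).
    for src, dst in edges:
        lines.append(f"        self.{src}.then(self.{dst})")
    return "\n".join(lines)


prompt_example = graph_to_code(
    goal="bake a cake",
    steps={
        "s0": "preheat the oven",
        "s1": "mix the batter",
        "s2": "bake the batter",
    },
    edges=[("s0", "s2"), ("s1", "s2")],
)
print(prompt_example)
```

In a few-shot setting, several such serialized examples would be concatenated, followed by an unfinished class for a new goal that the model is asked to complete.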

The second work, Program-Aided Language models (PaL), shows how LLMCs can decompose natural language problems into executable programming steps. This decomposition makes it possible to combine a language model with a Python interpreter, achieving state-of-the-art results on 12 benchmark datasets from Big-Bench Hard. These results demonstrate the potential of LLMCs to tackle challenging natural language reasoning tasks and open up exciting possibilities for further exploration.
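The division of labor described above, where the model writes the program and the interpreter computes the answer, can be sketched as follows. This is a minimal illustration under stated assumptions: the string `generated_program` is a hand-written stand-in for model output, the word problem is made up, and the helper `run_generated_program` and the convention of storing the result in a variable called `answer` are hypothetical.

```python
# Hand-written stand-in for a program a code model might produce for the
# made-up question: "A player has 5 balls and buys 2 cans of 3 balls each.
# How many balls does the player have now?" No model is called here.
generated_program = """
initial_balls = 5
cans_bought = 2
balls_per_can = 3
answer = initial_balls + cans_bought * balls_per_can
"""


def run_generated_program(source):
    """Execute program text and return the value bound to `answer`.

    Assumes the program stores its result in a variable named `answer`
    (a convention chosen for this sketch). Only run trusted text this way;
    real model output should be executed in a sandbox.
    """
    namespace = {}
    exec(source, namespace)
    return namespace["answer"]


result = run_generated_program(generated_program)
print(result)  # 11
```

The arithmetic is delegated to the interpreter, so the language model only needs to produce the correct decomposition into steps.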

Bio: 

Aman is a Ph.D. student at Carnegie Mellon University's Language Technologies Institute, advised by Professor Yiming Yang. His research lies in the areas of natural language generation and commonsense reasoning. He is particularly interested in feedback-driven generation and the intersection of code generation and natural language reasoning. Aman has previously worked as a student researcher and collaborator with Google Brain and the Allen Institute for AI, where he focused on understanding and improving large language models in a few-shot setting.