Aman Madaan "Language models of code are few-shot reasoners"

Dec 9, 2022 11:00 am

Join online

Seminar/Symposium

Sponsor

Illinois Computer Science

Speaker

Aman Madaan, PhD Candidate, Carnegie Mellon University

Contact

Candice Steidinger

E-Mail

steidin2@illinois.edu

Originating Calendar

Siebel School Speakers Calendar

Abstract:

Large language models of code (LLMCs) have shown remarkable performance on tasks such as code completion and competitive programming. This talk will discuss two recent works that leverage the capabilities of LLMCs to solve natural language reasoning tasks. In the first work, CoCoGen, we show that by framing structured commonsense reasoning tasks as code generation tasks, LLMCs can outperform strong natural language models in a few-shot setting.
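To make the idea of framing a structured task as code generation concrete, here is a minimal sketch. It assumes the structure is a small script graph (a goal, steps, and ordering edges) and serializes it as Python-like source text that could serve as a few-shot example in a prompt. The names `graph_to_code`, `Plan`, and `Node`, and the example graph, are hypothetical illustrations and are not taken from CoCoGen itself.

```python
from typing import List, Tuple


def graph_to_code(goal: str, steps: List[str], edges: List[Tuple[int, int]]) -> str:
    """Serialize a small script graph as Python-like source text.

    Hypothetical format: each step becomes a Node, and each edge (src, dst)
    records that step `dst` follows step `src`.
    """
    lines = [
        "class Plan:",
        f"    goal = {goal!r}",
        "",
        "    def __init__(self):",
    ]
    for i, step in enumerate(steps):
        lines.append(f"        step{i} = Node({step!r})")
    for src, dst in edges:
        lines.append(f"        step{src}.children.append(step{dst})")
    return "\n".join(lines)


# Illustrative input (made up for this sketch).
code = graph_to_code(
    "bake a cake",
    ["gather ingredients", "mix batter", "bake"],
    [(0, 1), (1, 2)],
)
print(code)
```

In a few-shot setting, several such serialized examples would be concatenated, followed by an unfinished class for a new goal that the code model is asked to complete. The generated text is only a prompt format here and is never executed.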

The second work, Program-Aided Language models (PaL), shows how LLMCs can decompose natural language problems into executable programming steps. This decomposition makes it possible to combine a language model with a Python interpreter, achieving state-of-the-art results on 12 benchmark datasets from Big-Bench Hard. These results demonstrate the potential of LLMCs to tackle challenging natural language reasoning tasks and open up exciting possibilities for further exploration.
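The division of labor described above can be sketched as follows: the model writes a program for the problem, and the Python interpreter, not the model, computes the final answer. This is a minimal illustration rather than the authors' implementation. The program text is hard-coded where a model's output would normally go, and `GENERATED_PROGRAM`, `run_generated_program`, and the `solution()` convention are assumptions made for this sketch.

```python
# Stand-in for model output. For a question such as "Roger has 5 tennis
# balls and buys 2 cans of 3 balls each; how many does he have now?",
# the model would be prompted to emit a program like this one.
GENERATED_PROGRAM = '''
def solution():
    # Roger starts with 5 tennis balls.
    tennis_balls = 5
    # He buys 2 cans with 3 balls each.
    bought_balls = 2 * 3
    return tennis_balls + bought_balls
'''


def run_generated_program(program_text: str):
    """Execute the program text and return the value of its solution()."""
    namespace = {}
    # exec on untrusted text is unsafe; a real system would sandbox this step.
    exec(program_text, namespace)
    return namespace["solution"]()


answer = run_generated_program(GENERATED_PROGRAM)
print(answer)  # prints 11
```

Because the arithmetic is delegated to the interpreter, the model only needs to get the decomposition right, not the calculation.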

Bio:

Aman is a Ph.D. student at Carnegie Mellon University's Language Technologies Institute, advised by Professor Yiming Yang. His research lies in the areas of natural language generation and commonsense reasoning. He is particularly interested in feedback-driven generation and the intersection of code generation and natural language reasoning. Aman has previously worked as a student researcher and collaborator with Google Brain and the Allen Institute for AI, where he focused on understanding and improving large language models in a few-shot setting.
