Abstract: In this talk, I present my lab’s work on building models that can follow natural language instructions to perform a wide range of tasks. These models, known as "instruction-tuned" language models, have demonstrated the ability to generalize to new tasks. I first present our meta-dataset, Super-Natural Instructions, which includes 1,600 NLP tasks and their descriptions and is used to evaluate cross-task generalization. Next, I present our new work, Self-Instruct, which improves the instruction-following capabilities of language models without relying on extensive manual annotation. It does this by using the model's own generations to create a large collection of instruction data. Finally, I present how these models can be augmented with retrieval to improve the factuality of the generated responses.
Bio: Hanna Hajishirzi is a Torode Family Associate Professor at UW CSE and a Senior Director at AI2. Her research spans different areas in NLP and AI, focusing on developing machine learning algorithms that represent, comprehend, and reason about diverse forms of data at large scale. Her honors include the NSF CAREER Award, Sloan Fellowship, Allen Distinguished Investigator Award, Intel Rising Star Award, best paper and honorable mention awards, and several industry research faculty awards. Hanna received her PhD from the University of Illinois at Urbana-Champaign and spent a year as a postdoc at Disney Research and CMU.