NCSA staff who would like to submit an item for the calendar can email newsdesk@ncsa.illinois.edu.
Abstract: The recent development of generative models (e.g., LLMs) poses new challenges for evaluation that the research community and industry are grappling with. While the versatile capabilities of these models spark excitement, they also inevitably make a leap toward homogenization: powering a wide range of applications with a single model, often referred to as "general-purpose". How do we narrow the inherent gap between the human requirements in the context of technology deployment and what such a model can offer? In this talk, I argue for the role of human-centered evaluation in addressing the socio-technical gap. I will first reflect on the disconnect between current evaluation practices and actual human needs and then discuss how those practices can benefit from HCI methods. Next, I will present conceptual frameworks to articulate design decisions in benchmarking and a toolkit to evaluate evaluation metrics. In my final remarks, I will share our current efforts and discuss future opportunities to develop human-centered evaluation methods for generative models.
Speaker Bio: Ziang Xiao is an assistant professor at Johns Hopkins University's Department of Computer Science. His research is motivated by the fundamental question of understanding humans at scale—e.g., how can we conduct robust and generalizable studies about human behavior? The goal of his work is to enhance human-AI interaction to expand our knowledge of ourselves. Through his research, Ziang aims to create a more connected research community and democratize novel technologies to operationalize intuitions and curiosities about how we think and behave. His current research focuses on three exciting topics: AI for social science, human-centered model evaluation, and information seeking. Broadly, Ziang's work lies at the intersection of human-computer interaction, natural language processing, and social and personality psychology. He completed his PhD in computer science at the University of Illinois Urbana-Champaign. Ziang was a postdoctoral researcher in the Fairness, Accountability, Transparency, and Ethics group at Microsoft Research Montréal.