Abstract: Although modern conversational agents boast impressive capabilities, they still fall short of the intuitive depth and adaptability of human communication. A key distinction lies in the human ability to infer implicit intent from spoken language, environmental visual cues, or both, effortlessly steering conversations or suggesting relevant tasks and products. This talk focuses on advancing conversational agents toward more human-like interaction, significantly improving both user experience and functionality. We will explore innovative strategies and frameworks that harness commonsense knowledge and the imaginative potential of large pre-trained models across multiple modalities. By enriching training data to address cold-start challenges and tailoring models to specific scenarios or strategies, we aim to create conversational agents that deliver more seamless, context-aware, and user-centric interactions. The ultimate goal is not only to bridge the gap between human and machine communication but also to expand the possibilities for how these agents can enhance everyday life.
Short Bio: Yun-Nung (Vivian) Chen is currently a professor in the Department of Computer Science & Information Engineering at National Taiwan University. She earned her Ph.D. from Carnegie Mellon University, and her research interests focus on spoken dialogue systems and natural language processing. She was listed among the World's Top 2% Scientists based on 2023 impact, named a Taiwan Outstanding Young Woman in Science, and has received Google Faculty Research Awards, Amazon AWS Machine Learning Research Awards, the MOST Young Scholar Fellowship, and the FAOS Young Scholar Innovation Award. Her team was selected to participate in the first Alexa Prize TaskBot Challenge in 2021. Prior to joining National Taiwan University, she worked in the Deep Learning Technology Center at Microsoft Research Redmond.