Computer Science Speaker Series Master Calendar


COLLOQUIUM: Simon Du, "Pre-Training Data Selection for Representation Learning"

Event Type
Seminar/Symposium
Sponsor
Illinois Computer Science
Location
HYBRID: 2405 Siebel Center for Computer Science or online
Date
Feb 26, 2024, 3:30 pm
Originating Calendar
Computer Science Colloquium Series

Zoom: https://illinois.zoom.us/j/86064910025?pwd=OUZSYkx1alpsbkl2Kys5MnZDZTljdz09

Refreshments Provided.

Abstract: 
Pre-training datasets are a critical component in recent breakthroughs in artificial intelligence. However, their design has not received the same level of research attention as model architectures or training algorithms. In this presentation, I will discuss our recent work on pre-training data selection for representation learning in the contexts of multi-modal contrastive learning and multi-task representation learning. For multi-modal contrastive learning, we propose a new notion, the Variance Alignment Score (VAS). We demonstrate that by maximizing the VAS as a data selection strategy, we can achieve superior performance on dataset selection benchmarks. For multi-task representation learning, we explore how to select the most relevant pre-training tasks for a target downstream task. We introduce a metric to characterize task relevance and design a new method for actively selecting the most pertinent tasks.

Bio:
Simon S. Du is an assistant professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington. His research interests span machine learning broadly, including deep learning, representation learning, and reinforcement learning. Prior to starting as faculty, he was a postdoc at the Institute for Advanced Study. He completed his Ph.D. in Machine Learning at Carnegie Mellon University. Simon's research has been recognized by a Samsung AI Researcher of the Year Award, an NSF CAREER award, an Intel Rising Star Faculty Award, an Nvidia Pioneer Award, a AAAI New Faculty Highlights selection, and a Distinguished Dissertation Award honorable mention from CMU, among others.


Part of the Illinois Computer Science Speakers Series. Faculty Host: Hanghang Tong


Meeting ID: 860 6491 0025 
Passcode: csillinois


If accommodation is required, please email <erink@illinois.edu> or <communications@cs.illinois.edu>. Someone from our staff will contact you to discuss your specific needs.



 

 
