Research Seminars @ Illinois

Tailored for undergraduate researchers, this calendar is a curated list of research seminars at the University of Illinois. Explore the diverse world of research and expand your knowledge through engaging sessions designed to inspire and enlighten.

To have your events added or removed from this calendar, please contact OUR at ugresearch@illinois.edu

Computer Vision Seminar Series: Boqing Gong, "BabyVLM: Democratizing Research on the Pretraining of Vision Large Language Models."

May 1, 2026, 4:00 - 5:00 pm
0216 Siebel Center
Sponsor: Illinois Computer Vision
Speaker: Dr. Boqing Gong
Contact: Yao Xiao
E-Mail: yaox11@illinois.edu
Originating Calendar: Siebel School Speakers Calendar

Abstract: Pretraining vision foundation models (VFMs) is prohibitively expensive, making it a privilege of institutions with abundant resources and leaving independent researchers to downstream tasks such as benchmarking, interpreting, and aligning VFMs. This situation is a crisis for computer vision research: as Richard Feynman put it, “What I cannot create, I do not understand.” Independent researchers and the public cannot gain true understanding of, trust in, or safe use of VFMs passively from open weights or APIs. Meanwhile, the few privileged VFM creators could soon reach a plateau without the broad research community’s nurturing.

Hence, we propose democratizing VFM pretraining by scaling it down to a developmentally plausible framework that is scientifically reasonable and computationally friendly to university budgets. The aim is to promote exploration rather than exploitation of pretraining and to enable independent researchers to build general-purpose VFMs that approach “baby intelligence,” benefiting efforts towards “grown-up” AI. This framework will closely mimic the minimal yet highly informative sensory experiences of human infants, encompassing 1) pretraining data curated from longitudinal, egocentric audiovisual recordings of babies, 2) a suite of developmentally aligned evaluation benchmarks assessing VFM capabilities against cognitive milestones such as object permanence, social skills, and language acquisition, and 3) a user-friendly pretraining codebase and baseline models.

Speaker Bio: Boqing Gong (https://boqinggong.github.io) is a computer science faculty member at Boston University and a part-time research scientist at Google DeepMind. His research in machine learning and computer vision focuses on visual recognition, video, and the generalization and efficiency of AI models.
