Siebel School Master Calendar

Computer Vision Seminar Series: Dr. Shuyang (Kevin) Sun, "D4RT: Teaching AI to see the world in four dimensions."

May 8, 2026, 4:00 - 5:00 pm
Sponsor: Illinois Computer Vision
Speaker: Dr. Shuyang (Kevin) Sun
Contact: Yao Xiao
E-Mail: yaox11@illinois.edu
Originating Calendar: Siebel School Speakers Calendar

Abstract: Understanding and reconstructing the complex geometry and motion of dynamic scenes from video remains a formidable challenge in computer vision. This paper introduces D4RT, a simple yet powerful feedforward model designed to efficiently solve this task. D4RT utilizes a unified transformer architecture to jointly infer depth, spatio-temporal correspondence, and full camera parameters from a single video. Its core innovation is a novel querying mechanism that sidesteps the heavy computation of dense, per-frame decoding and the complexity of managing multiple, task-specific decoders. Our decoding interface allows the model to independently and flexibly probe the 3D position of any point in space and time. The result is a lightweight and highly scalable method that enables remarkably efficient training and inference. We demonstrate that our approach sets a new state of the art, outperforming previous methods across a wide spectrum of 4D reconstruction tasks.

Speaker Bio: Shuyang (Kevin) Sun (https://scholar.google.com/citations?user=PoAvGRMAAAAJ&hl=en) is a Research Scientist at Google DeepMind. His research spans computer vision, visual perception, and visual understanding. He currently focuses on advancing open-world unified visual perception, including recent contributions to 4D spatio-temporal reconstruction and simulation. Prior to joining DeepMind, Shuyang was a Research Scientist at ByteDance. He earned his Ph.D. from the University of Oxford under the supervision of Professor Philip Torr.
