Title: Learning 3D Perception via Geometry and Parsimony
Abstract: We live in a structured world, and perceive it in specific ways. In this talk, I’ll argue that approaches that aim to understand the visual world should leverage this in the form of inductive biases. I will present a line of work that, by building in the notion that our 2D percepts are projections of an underlying 3D world, allows us to bypass the need for supervision and learn to infer this 3D structure, as well as recover 2D-to-2D correspondences. I will also show that by simply leveraging a prior that complex structures (scenes, objects, etc.) can be thought of as composed of simpler components, we can discover these underlying components and obtain representations that enable more accurate performance on downstream tasks, e.g., future prediction or robotic manipulation.
Bio: Shubham Tulsiani is a research scientist at Facebook AI Research (FAIR). He received a PhD in Computer Science from UC Berkeley in 2018, under the supervision of Jitendra Malik. He is interested in enabling machine perception systems to learn about and infer the structure underlying the physical world, and in allowing agents to leverage this understanding for interaction.