National Center for Supercomputing Applications WordPress Master Calendar

NCSA staff who would like to submit an item for the calendar can email newsdesk@ncsa.illinois.edu.

Vision Seminar: Justin Johnson, "Scalable Supervision for Semantic and Geometric Vision"

Event Type
Seminar/Symposium
Sponsor
Svetlana Lazebnik
Virtual
Date
May 10, 2022, 10:00 am
Originating Calendar
Computer Science Speakers Calendar

Zoom: https://illinois.zoom.us/j/86238233298?pwd=Y29EWXRPOWtiZ09DczRYMXJZK3JRUT09

Meeting ID: 862 3823 3298

Password: 684009

Title: Scalable Supervision for Semantic and Geometric Vision

Abstract: Pairing deep neural networks with large training datasets has led to massive advances on a wide variety of vision tasks in the past decade. To continue scaling to larger and more complex data, we must develop scalable forms of supervision that do not rely on explicit human annotation. In contrast to generic unsupervised learning, my work aims to take advantage of additional forms of supervision natural for the task at hand. For semantic vision tasks, I will argue that paired vision+language data is an effective form of supervision that can be acquired at scale from the web. To this end, I will discuss our VirTex method for learning visual features from text, as well as our large-scale RedCaps dataset of image and text data. For geometric tasks, I will argue that the 3D structure of the world can be used for auxiliary supervision, and in particular that differentiable rendering is a core tool for bridging 2D data and 3D tasks without supervision. I will show how these ideas can be applied to a variety of 3D vision tasks including shape prediction, novel view synthesis, and point cloud registration.

Bio: Justin Johnson is an Assistant Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor and a Visiting Scientist at Facebook AI Research. He completed his PhD at Stanford University, advised by Fei-Fei Li. His research interests lie primarily in computer vision and include visual reasoning, vision and language, 3D perception, and differentiable rendering.
