Spatial perception, the robot’s ability to sense and understand the surrounding environment, is a key enabler for robot navigation, manipulation, and human-robot interaction. Recent advances in perception algorithms and systems have enabled robots to create large-scale geometric maps of unknown environments and detect objects of interest. Despite these advances, a large gap still separates robot and human perception: humans quickly form a holistic representation of the scene that encompasses both geometric and semantic aspects, are robust to a broad range of perceptual conditions, and can learn without low-level supervision. This talk discusses recent efforts to bridge these gaps. First, we show that scalable metric-semantic scene understanding requires hierarchical representations; these representations, known as 3D scene graphs, are key to efficient storage and inference and enable real-time perception algorithms. Second, we discuss progress in the design of certifiable algorithms for robust estimation, including the notion of “estimation contracts,” which provide first-of-their-kind performance guarantees for estimation problems arising in robot perception. Finally, we observe that certification and self-supervision are twin challenges, and that the design of certifiable perception algorithms enables a natural self-supervised learning scheme; we apply this insight to 3D object pose estimation and present self-supervised algorithms that perform on par with state-of-the-art fully supervised methods while not requiring manual 3D annotations.
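For readers unfamiliar with 3D scene graphs, the minimal sketch below illustrates the general idea of a layered representation in which coarse spatial concepts (e.g., buildings and rooms) contain finer ones (e.g., objects). All class, layer, and function names here are hypothetical and chosen purely for illustration; this is not the implementation discussed in the talk.

```python
# Minimal, hypothetical sketch of a hierarchical 3D scene graph.
# Layer names ("building", "room", "object") are illustrative only.
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    """A node in the scene graph: a layer, a semantic label, and a 3D position."""
    layer: str                                   # e.g. "building", "room", "object"
    label: str                                   # semantic label, e.g. "kitchen"
    position: tuple[float, float, float]         # 3D position of the node
    children: list["SceneNode"] = field(default_factory=list)

    def add(self, child: "SceneNode") -> "SceneNode":
        """Attach a finer-grained node beneath this one and return it."""
        self.children.append(child)
        return child

def find_by_layer(root: SceneNode, layer: str) -> list[SceneNode]:
    """Query the hierarchy: collect all nodes belonging to a given layer."""
    found = [root] if root.layer == layer else []
    for child in root.children:
        found.extend(find_by_layer(child, layer))
    return found

# Build a tiny example hierarchy: building -> room -> objects.
building = SceneNode("building", "B1", (0.0, 0.0, 0.0))
kitchen = building.add(SceneNode("room", "kitchen", (2.0, 3.0, 0.0)))
kitchen.add(SceneNode("object", "table", (2.5, 3.1, 0.4)))
kitchen.add(SceneNode("object", "chair", (2.2, 3.4, 0.3)))

print([n.label for n in find_by_layer(building, "object")])  # ['table', 'chair']
```

The hierarchy is what makes storage and queries efficient: a query can be restricted to a single layer of abstraction (as in the object query above) instead of scanning a flat, metric-scale map.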
Luca Carlone is the Boeing Career Development Associate Professor in the Department of Aeronautics and Astronautics at the Massachusetts Institute of Technology and a Principal Investigator in the Laboratory for Information & Decision Systems (LIDS). He received his PhD from the Polytechnic University of Turin in 2012. After two years as a postdoctoral fellow at the Georgia Institute of Technology (2013-2015), he joined LIDS as a postdoctoral associate (2015) and became a Research Scientist there (2016). His research interests include nonlinear estimation, numerical and distributed optimization, and probabilistic inference, applied to sensing, perception, and decision-making in single- and multi-robot systems. His work includes seminal results on certifiably correct algorithms for localization and mapping, as well as approaches for visual-inertial navigation and distributed mapping. He is a recipient of the 2022 and 2017 IEEE Transactions on Robotics King-Sun Fu Memorial Best Paper Awards, the Best Student Paper Award at IROS 2021, the Best Paper Award in Robot Vision at ICRA 2020, a 2020 Honorable Mention from IEEE Robotics and Automation Letters, a Track Best Paper Award at the 2021 IEEE Aerospace Conference, the Best Paper Award at WAFR 2016, and the Best Student Paper Award at the 2018 Symposium on VLSI Circuits, and he was a best paper finalist at RSS 2015, RSS 2021, and WACV 2023. He is also a recipient of the AIAA Aeronautics and Astronautics Advising Award (2022), the NSF CAREER Award (2021), the RSS Early Career Award (2020), the Sloan Research Fellowship (2023), the Google Daydream Award (2019), the Amazon Research Award (2020, 2022), and the MIT AeroAstro Vickie Kerrebrock Faculty Award (2020). He is an IEEE Senior Member and an AIAA Associate Fellow. At MIT, he teaches “Robotics: Science and Systems,” the introduction to robotics for MIT undergraduates, and he created the graduate-level course “Visual Navigation for Autonomous Vehicles,” which covers the mathematical foundations and fast C++ implementations of spatial perception algorithms for drones and autonomous vehicles.