We look forward to seeing you online on Thursday, 9/7.
Abstract: What is the best paradigm to recognize objects---discriminative inference (fast but potentially prone to shortcut learning) or generative modeling (slow but potentially more robust)? We build on recent advances in generative modeling that turn text-to-image models into classifiers. This allows us to study their behavior and to compare them against discriminative models and human psychophysical data. We report four intriguing properties of diffusion-based generative classifiers: they show a record-breaking human-like shape bias (99% for Imagen), near human-level out-of-distribution accuracy, state-of-the-art alignment with human classification errors, and an understanding of certain perceptual illusions. Our results indicate that while the current dominant paradigm for modeling human object recognition is discriminative inference, zero-shot generative models approximate human object recognition data surprisingly well.
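For those curious how a text-to-image diffusion model can act as a classifier: the basic recipe is to compare, for each candidate class, how well the model denoises the input image when conditioned on that class's text prompt, and predict the class with the lowest denoising error. Below is a minimal Python sketch of this idea, not the speaker's actual implementation; `denoiser`, `encode_image`, `encode_prompt`, and `alphas_cumprod` are hypothetical stand-ins for a pretrained diffusion model's components.

```python
# Minimal sketch of a diffusion-based generative classifier (hypothetical interfaces).
# For each class prompt, Monte-Carlo-estimate the class-conditional denoising loss
# (a proxy for the conditional likelihood) and pick the class with the lowest loss.
import torch

@torch.no_grad()
def classify(image, class_prompts, denoiser, encode_image, encode_prompt,
             alphas_cumprod, num_samples=64):
    """Return the index of the class whose prompt yields the lowest denoising loss."""
    z0 = encode_image(image)                      # clean (latent) image, shape (1, C, H, W)
    losses = []
    for prompt in class_prompts:
        cond = encode_prompt(prompt)              # text conditioning for this class
        total = 0.0
        for _ in range(num_samples):
            t = torch.randint(0, len(alphas_cumprod), (1,))
            a = alphas_cumprod[t].view(1, 1, 1, 1)
            eps = torch.randn_like(z0)
            zt = a.sqrt() * z0 + (1 - a).sqrt() * eps   # forward-noise the image
            eps_hat = denoiser(zt, t, cond)             # model's predicted noise
            total += torch.mean((eps_hat - eps) ** 2).item()
        losses.append(total / num_samples)
    return int(torch.tensor(losses).argmin())
```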
Bio: Dr. Robert Geirhos is a Research Scientist at Google DeepMind, based in Toronto. Previously, Robert obtained his PhD from the University of Tübingen and the International Max Planck Research School for Intelligent Systems, where he worked with Felix Wichmann, Matthias Bethge and Wieland Brendel. Robert holds an MSc degree in Computer Science, with distinction, and a BSc degree in Cognitive Science from the University of Tübingen. His research has received an Outstanding Paper Award at NeurIPS and oral presentations at ICLR, NeurIPS, VSS and ICML. Robert aims to develop a better understanding of the hypotheses, biases and assumptions of modern machine vision systems, and to use this understanding to make them more robust, interpretable and reliable.