Siebel School Speaker Series Master Calendar

Abstract: While deep neural networks have achieved large gains in performance on benchmark datasets, their performance often degrades drastically with changes in data distribution encountered during real-world deployment. In this work, through systematic experiments and theoretical analysis, we attempt to understand the key reasons behind such brittleness of neural networks in real-world settings and why fixing these issues is exciting but challenging.

We first hypothesize, and demonstrate through empirical and theoretical studies, that (i) neural network training exhibits "simplicity bias" (SB), where models learn only the simplest discriminative features, and (ii) SB is one of the key reasons behind the non-robustness of neural networks. A natural way to fix SB in trained models is to identify the discriminative features used by the model and learn new features "orthogonal" to the learned features.

Post-hoc gradient-based attribution methods are regularly used to identify the key discriminative features for a model. But, due to the lack of ground truth, a thorough evaluation of even the most basic input gradient attribution method is still missing in the literature. Our second contribution is to overcome this challenge through experiments and theory on real and designed datasets. Our results demonstrate that (i) input gradient attribution does NOT highlight the correct features on standard models (i.e., trained on original data) but, surprisingly, does highlight the correct features on adversarially trained models, and (ii) "feature leakage" is the reason input gradient attribution fails for standard models. Feature leakage refers to the phenomenon wherein, given an instance, its input gradients highlight the location of discriminative features in that instance as well as in other instances present in the dataset.
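For readers unfamiliar with the term, input gradient attribution scores each input coordinate by the magnitude of the gradient of the model's output with respect to that coordinate. The sketch below illustrates the idea on a toy one-hidden-layer tanh network in plain Python. The network, its weights, and the function names (forward, input_gradient) are hypothetical and chosen only for illustration; this is not the setup used in the work described in the talk.

```python
import math

def forward(x, W1, b1, w2, b2):
    """Return (logit, hidden activations) of a one-hidden-layer tanh network."""
    hidden = [
        math.tanh(sum(w * xj for w, xj in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return logit, hidden

def input_gradient(x, W1, b1, w2, b2):
    """Gradient of the logit with respect to each input coordinate."""
    _, hidden = forward(x, W1, b1, w2, b2)
    # Chain rule: d logit / d x_j = sum_i w2[i] * (1 - h_i^2) * W1[i][j]
    return [
        sum(w2[i] * (1.0 - hidden[i] ** 2) * W1[i][j] for i in range(len(W1)))
        for j in range(len(x))
    ]

# Hypothetical toy weights: coordinate 0 carries most of the signal.
W1 = [[1.5, 0.0, 0.2],
      [-1.0, 0.1, 0.0]]
b1 = [0.0, 0.1]
w2 = [2.0, -1.0]
b2 = 0.0
x = [0.3, -0.7, 0.5]

grads = input_gradient(x, W1, b1, w2, b2)
# Attribution score per coordinate: absolute value of the input gradient.
saliency = [abs(g) for g in grads]
# Coordinates ordered from most to least highlighted.
ranked = sorted(range(len(x)), key=lambda j: saliency[j], reverse=True)
print("saliency:", saliency)
print("ranking:", ranked)
```

On image models the same quantity is computed per pixel and displayed as a saliency map; whether the highlighted pixels coincide with the truly discriminative features is the question the evaluation above addresses.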

Our work raises more questions than it answers, so we will end with interesting directions for future work.


Bio: Prateek Jain is a research scientist at Google Research India and an adjunct faculty member at IIT Kanpur. Earlier, he was a Senior Principal Researcher at Microsoft Research India. He obtained his PhD from the Computer Science department at UT Austin and his BTech from IIT Kanpur. He works in the areas of large-scale and non-convex optimization, high-dimensional statistics, and ML for resource-constrained devices. He wrote a monograph on Non-convex Optimization in Machine Learning summarizing many of his results in non-convex optimization. Prateek regularly serves on the senior program committee of top ML conferences and is an action editor for JMLR and an associate editor for SIMODS. He has also won the ICML 2007 and CVPR 2008 best student paper awards, and more recently his work on alternating minimization was selected as the 2020 Best Paper by the IEEE Signal Processing Society.
