ABSTRACT: As machine learning-based decision systems improve rapidly, we are discovering that it is no longer enough for them to perform well on their own. They should also behave nicely towards their predecessors and peers. More nuanced demands beyond accuracy now drive the learning process, including robustness, explainability, transparency, fairness, and, more recently, compatibility and regression minimization. We call this “Graceful AI,” because in 2021, when we replace an old trained classifier with a new one, we should expect a peaceful transfer of decision powers. Today, a new model can introduce errors that the old model did not make, even while significantly improving average performance. Such “regression” can break post-processing pipelines or force large amounts of data to be reprocessed. How can we train machine learning models not only to minimize the average error, but also to minimize “regression”? Can we design and train new learning-based models in a manner that is compatible with previous ones, so that it is not necessary to reprocess any data? These problems are prototypical of the nascent field of cross-model compatibility in representation learning. I will describe the first approach to Backward-Compatible Training (BCT), introduced at the last Conference on Computer Vision and Pattern Recognition (CVPR), and an initial solution to the problem of Positive-Congruent Training (PC-Training), a first step towards “regression-constrained” learning, to appear at the next CVPR. Along the way, I will also introduce methodological innovations that enable full-network fine-tuning by solving a linear-quadratic optimization problem. Such Linear-Quadratic Fine-Tuning (LQF, also to appear at the next CVPR) achieves performance equivalent to that of non-linear fine-tuning, and superior to it in the low-data regime, while allowing easy incorporation of convex constraints.
Stefano Soatto is Vice President of Applied Science at Amazon Web Services AI, where he oversees research for AI applications including vision (Custom Labels, Lookout4Vision), speech (Amazon Transcribe), natural language (Amazon Comprehend, Amazon Lex, Amazon Kendra, Amazon Translate), document understanding (Amazon Textract), time series analysis (Amazon Forecast, Lookout4Metrics, Lookout4Equipment), personalization (Amazon Personalize), and others in the works. He is also a Professor of Computer Science at UCLA and founding director of the UCLA Vision Lab.