National Center for Supercomputing Applications WordPress Master Calendar



CS Compiler Seminar: Mangpo Phothilimthana, "Datacenter Scale Autotuning for ML Workloads"

Event Type
Social/Informal Event
Sponsor
Josep Torrellas
Location
2124 Siebel Center
Virtual
wifi event
Date
Oct 10, 2022   4:30 pm  
Originating Calendar
Computer Science Speakers Calendar

Date: Monday, October 10

Time: 4:30pm - 5:30pm

Location: The seminar is hybrid. You can attend in person in Room 2124 (Siebel Center) or join via Zoom (https://illinois.zoom.us/j/89400978467?pwd=NjV3ZFQrQ1JidTNyS0ZNUEVOcEtpUT09).

 

Speaker(s): Mangpo Phothilimthana (External Speaker)

Title: Datacenter Scale Autotuning for ML Workloads

  

Abstract: 

Search-based techniques have proven effective at solving the complex optimization problems that arise in domain-specific compilers for machine learning (ML). Unfortunately, deploying such techniques in production compilers is impeded by several limitations. In this talk, I will present an autotuner for production ML compilers that can tune both graph-level and subgraph-level optimizations at multiple compilation stages. We demonstrate how to incorporate machine learning techniques, such as a learned cost model, to reduce autotuning time. Our learned cost model has high accuracy and outperforms a heavily optimized analytical performance model. In an evaluation across 150 ML training and inference models on Tensor Processing Units (TPUs), the autotuner offers a runtime speedup of up to 2.4x, and 5% on average, over the heavily optimized XLA compiler.

In the second part of the talk, I will outline how we deploy the XLA autotuner at datacenter scale to automatically tune the most heavily used production models in Google’s fleet every day. The deployed tile size and flag autotuners have been saving approximately 2% of fleetwide TPU compute time. I will also share some of the challenges we encountered in deploying the autotuner in production, including the accuracy of runtime estimation for a graph, numerical issues, and compiler bugs.

Bio: 

Mangpo is a research scientist at Google Brain, where she leads the Machine Learning for Machine Learning Compilers effort (one of Google Brain's moonshots in 2020). She is also involved in various research projects that apply programming language techniques to machine learning. Her research interests include compilers, machine learning for systems, program synthesis, and energy-aware computing. Mangpo completed a PhD in Computer Science at UC Berkeley. Her previous research focused on synthesis-aided compilation and programming models for emerging architectures, ranging from an ultra-low-power processor to a programmable network card. She was a recipient of the Microsoft Research PhD Fellowship and the Qualcomm Innovation Fellowship.
