National Center for Supercomputing Applications WordPress Master Calendar

CS Compiler Seminar: Mangpo Phothilimthana, "Datacenter Scale Autotuning for ML Workloads"

Event Type: Social/Informal Event

Sponsor: Josep Torrellas

Location: 2124 Siebel Center

Date: Oct 10, 2022 4:30 pm

Originating Calendar: Siebel School Speakers Calendar

Date: Monday, October 10

Time: 4:30pm - 5:30pm

Location: The seminar is hybrid. You can attend in person in Room 2124 (Siebel) or join via Zoom (https://illinois.zoom.us/j/89400978467?pwd=NjV3ZFQrQ1JidTNyS0ZNUEVOcEtpUT09).

Speaker(s): Mangpo Phothilimthana (External Speaker)

Title: Datacenter Scale Autotuning for ML Workloads

Abstract:

Search-based techniques have proven effective at solving complex optimization problems that arise in domain-specific compilers for machine learning (ML). Unfortunately, deploying such techniques in production compilers is impeded by several limitations. In this talk, I will present an autotuner for production ML compilers that can tune both graph-level and subgraph-level optimizations at multiple compilation stages. We demonstrate how to incorporate machine learning techniques such as a learned cost model to reduce autotuning time. Our learned cost model has high accuracy and outperforms a heavily optimized analytical performance model. In an evaluation across 150 ML training and inference models on Tensor Processing Units (TPUs), the autotuner offers a runtime speedup of up to 2.4x, and 5% on average, over the heavily optimized XLA compiler.

In the second part of the talk, I will outline how we deploy the XLA autotuner at datacenter scale to automatically tune the most heavily used production models in Google’s fleet every day. The deployed tile size and flag autotuners have been saving approximately 2% of fleetwide TPU compute time. I will also share some of the challenges we experienced from deploying the autotuner in production, including the accuracy of runtime estimation of a graph, numerical issues, and compiler bugs.

Bio:

Mangpo is a research scientist at Google Brain, where she leads the Machine Learning for Machine Learning Compilers effort (one of Google Brain's moonshots in 2020). She is also involved in various research projects that apply programming language techniques to machine learning. Her research interests include compilers, machine learning for systems, program synthesis, and energy-aware computing. Mangpo completed a PhD in Computer Science at UC Berkeley. Her previous research focused on synthesis-aided compilation and programming models for emerging architectures, ranging from an ultra-low-power processor to a programmable network card. She was a recipient of a Microsoft Research PhD Fellowship and a Qualcomm Innovation Fellowship.
