Computer Science Speakers Series

CS Compiler Seminar: Aayan Kumar, "Shiftry: RNN Inference in 2KB of RAM"

Event Type
Seminar/Symposium
Sponsor
Illinois Computer Science, Architecture, Compilers, and Parallel Computing Research Area
Location
https://illinois.zoom.us/j/89400978467?pwd=NjV3ZFQrQ1JidTNyS0ZNUEVOcEtpUT09
Virtual
Date
Nov 8, 2021, 4:30 pm
Speaker
Aayan Kumar, Graduate Student, Electrical Engineering and Computer Science, UC Berkeley
Contact
Madeleine Garvey
E-Mail
mgarvey@illinois.edu
Originating Calendar
Computer Science Speakers Calendar

Speaker: Aayan Kumar

Title: Shiftry: RNN Inference in 2KB of RAM

Date: Monday, November 8, 2021

Time: 4:30–5:30 PM

Zoom Link: https://illinois.zoom.us/j/89400978467?pwd=NjV3ZFQrQ1JidTNyS0ZNUEVOcEtpUT09

Abstract: Traditionally, IoT devices send collected sensor data to an intelligent cloud where machine learning (ML) inference happens. However, this is changing rapidly, and there is a recent trend toward running ML on the edge IoT devices themselves. An intelligent edge is attractive because it saves a network round trip (efficiency) and keeps user data at the source (privacy). However, IoT devices are much more resource constrained than the cloud, which makes running ML on them challenging. Specifically, consider the Arduino Uno, a commonly used board that has 2KB of RAM and 32KB of read-only Flash memory. Although recent breakthroughs in ML have produced novel recurrent neural network (RNN) models that provide good accuracy with KB-sized models, deploying them on tiny devices with such tight memory constraints has remained elusive.

We present Shiftry, an automatic compiler from high-level floating-point ML models to fixed-point C programs with 8-bit and 16-bit integers, which have significantly lower memory requirements. For this conversion, Shiftry uses a data-driven float-to-fixed procedure and a RAM management mechanism. These techniques enable us to provide the first empirical evaluation of RNNs running on tiny edge devices. On simpler ML models that prior work could handle, Shiftry-generated code has lower latency and higher accuracy.
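For readers unfamiliar with fixed-point arithmetic, the general idea behind a float-to-fixed conversion can be sketched as follows. This is a minimal illustration only, not Shiftry's actual procedure: the function names (`choose_scale`, `to_fixed`, `to_float`, `fixed_mul`) and the rule of picking the largest power-of-two scale that fits the observed samples are assumptions made for this example.

```python
# Illustrative sketch of power-of-two fixed-point arithmetic.
# All names and the scale-selection rule are hypothetical, not taken from Shiftry.

def int_range(bits: int) -> tuple[int, int]:
    """Smallest and largest value of a signed integer with `bits` bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def choose_scale(samples: list[float], bits: int = 8) -> int:
    """Pick the largest s such that every sample times 2**s fits in `bits` bits.

    This stands in for a "data-driven" choice: the scale depends on the
    values actually observed, not on a worst-case bound.
    """
    _, hi = int_range(bits)
    peak = max(abs(v) for v in samples)
    if peak == 0:
        return 0
    s = 0
    while peak * (1 << (s + 1)) <= hi:
        s += 1
    return s


def to_fixed(x: float, scale: int, bits: int = 8) -> int:
    """Represent x as the integer round(x * 2**scale), saturating on overflow."""
    lo, hi = int_range(bits)
    q = int(round(x * (1 << scale)))
    return max(lo, min(hi, q))


def to_float(q: int, scale: int) -> float:
    """Recover the real value that a fixed-point integer stands for."""
    return q / (1 << scale)


def fixed_mul(a: int, b: int, scale: int, bits: int = 8) -> int:
    """Multiply two values sharing one scale, shifting right to keep that scale."""
    lo, hi = int_range(bits)
    prod = (a * b) >> scale
    return max(lo, min(hi, prod))


weights = [0.5, -1.5, 0.75]
scale = choose_scale(weights, bits=8)
a, b = to_fixed(0.5, scale), to_fixed(1.5, scale)
print(scale, a, b, to_float(fixed_mul(a, b, scale), scale))
```

Each value is stored in a single byte instead of a 4-byte float, which is the kind of saving that matters on a board with 2KB of RAM. The cost is precision and range: values outside the representable range saturate, so the choice of scale directly affects accuracy.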

 

Bio: Aayan Kumar is a graduate student in the Electrical Engineering and Computer Science department at UC Berkeley, advised by Prof. Koushik Sen. His primary research interest is the intersection of programming languages and machine learning, and he is currently working on using graph neural networks for program analysis.

Prior to graduate school, he spent a year and a half as a Research Fellow at Microsoft Research, India, mentored by Rahul Sharma. There he worked on the EdgeML project, which aimed to bring machine learning to edge IoT devices. He has also worked as a software engineer at AppDynamics, a startup acquired by Cisco, and completed his undergraduate degree in CS at IIT Delhi.
