We look forward to seeing you on Monday, April 10, at 4:30 pm. Join in person at 2124 Siebel Center for Computer Science, 201 N. Goodwin Ave, or via Zoom: https://illinois.zoom.us/j/83675834345?pwd=T1l6aXdzK3lOdnNmVUtjZjFzdHZsdz09
Title: Dias: Dynamic Rewriting of Pandas Code
Author(s): Stefanos Baziotis, Daniel Kang, Charith Mendis
Abstract: In recent years, dataframe libraries such as pandas have exploded in popularity. Due to their flexibility, they are increasingly used in ad-hoc exploratory data analysis (EDA) workloads. These workloads are diverse, including custom functions which can span libraries or be written in pure Python. The majority of systems available to accelerate EDA workloads focus on bulk-parallel workloads, which contain vastly different computational patterns, typically within a single library. As a result, they can introduce excessive overheads for ad-hoc EDA workloads due to their expensive optimization techniques. Instead, we identify program rewriting as a lightweight technique which can offer substantial speedups while also avoiding slowdowns. We implemented our techniques in Dias, which rewrites notebook cells to be more efficient for ad-hoc EDA workloads. We develop techniques for efficient rewrites in Dias, including dynamic checking of the preconditions under which rewrites are correct and just-in-time rewrites for notebook environments. We show that Dias can rewrite individual cells to be 57× faster compared to pandas and 1909× faster compared to optimized systems such as Modin. Furthermore, Dias can accelerate whole notebooks by up to 3.6× compared to pandas and 26.4× compared to Modin.
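To make the idea of a rewrite guarded by a dynamic precondition check concrete, here is a minimal sketch. It is not taken from Dias: the function name `smallest_rows` and the choice of pattern (replacing `sort_values(...).head(n)` with `nsmallest`) are assumptions made for illustration only. The point is the shape of the technique: the faster form runs only when a check at run time confirms it is applicable, and otherwise the original code runs unchanged.

```python
import pandas as pd


def smallest_rows(obj, column, n):
    """Hypothetical guarded rewrite of `obj.sort_values(column).head(n)`."""
    # Precondition, checked at run time: `obj` really is a pandas DataFrame
    # and `column` names a numeric column, so `nsmallest` is applicable.
    if (
        isinstance(obj, pd.DataFrame)
        and column in obj.columns
        and pd.api.types.is_numeric_dtype(obj[column])
    ):
        # Rewritten form: select the n smallest rows without a full sort.
        return obj.nsmallest(n, column)
    # Precondition failed: fall back to the code as originally written.
    return obj.sort_values(column).head(n)


df = pd.DataFrame(
    {"price": [30, 10, 20, 40], "item": ["c", "a", "b", "d"]}
)
cheapest = smallest_rows(df, "price", 2)    # takes the rewritten path
first_items = smallest_rows(df, "item", 2)  # string column: takes the fallback
```

This sketch ignores details a real rewriter would have to handle, such as ties in the sort column, where the two forms are not guaranteed to return the same rows in the same order.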
Bio: Stefanos is a second-year PhD student in the Department of Computer Science at the University of Illinois Urbana-Champaign (UIUC), advised by Prof. Charith Mendis. He does research at the intersection of compilers and data management systems. More broadly, his interests extend to performance engineering, programming languages, and distributed systems. Before coming to UIUC, Stefanos worked as a compiler researcher at NEC and was also involved with the Liberty Research Group from Princeton. He obtained his B.Sc. from the Department of Informatics, University of Athens, where he did his thesis with Dr. Yannis Smaragdakis.