Neural Abstract Interpretation & CodeMind: A Framework to challenge LLMs for Code Reasoning

Event Type
Seminar
Sponsor
The Department of Computer Science, University of Illinois and FM/SE Seminar
Location
0216 Siebel Center
Virtual
Date
Apr 19, 2024, 2:00 pm
Speaker
Shaurya Gomber, UIUC & Changshu Liu, UIUC
Contact
Isha Chaudhary
E-Mail
isha4@illinois.edu
Originating Calendar
Computer Science Speakers Calendar

Talk 1

Abstract: Abstract interpretation is a popular technique for formally analyzing the properties of programs, neural networks, and complex real-world systems by soundly approximating their concrete semantics in an abstract domain. Central to the process are abstract transformers: functions that map abstract states to abstract states, mirroring the effect of operations in the concrete domain. However, designing efficient abstract transformers for expressive relational domains such as Octagon and Polyhedra is hard, as one must carefully balance the fundamental tradeoff between the cost, soundness, and precision of the transformer for downstream tasks; intricate performance optimizations such as octagon decomposition are also needed to make them performant. To address this and reduce development costs, we propose a data-driven learning approach that automatically generates efficient abstract transformers. Given the inherent complexity of abstract transformers and the proven ability of neural networks to approximate complex functions, the Neural Abstract Interpretation framework provides supervised and unsupervised learning methods for training neural abstract transformers: neural networks that serve as the abstract transformers. Beyond automating transformer construction, these neural transformers can act as a fast, and sometimes even more precise, replacement for slow or imprecise hand-crafted transformers. They are also differentiable, unlike hand-crafted transformers, enabling their use with gradient-guided learning methods, which can benefit tasks that can be posed as learning problems, such as invariant generation.

Notes: Speaker's ongoing work
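
As a rough illustration of the abstract-transformer idea above (a sketch only, not the speaker's framework: the interval domain, network architecture, and training setup below are simplifying assumptions chosen for brevity), the following Python snippet contrasts a hand-crafted, exact transformer for addition in the simple interval domain with a small neural network trained to mimic it:

import random
import torch
import torch.nn as nn

def interval_add(a, b):
    # Hand-crafted, exact abstract transformer for '+' in the interval domain:
    # [l1, u1] + [l2, u2] = [l1 + l2, u1 + u2]
    (l1, u1), (l2, u2) = a, b
    return (l1 + l2, u1 + u2)

# A neural abstract transformer instead *learns* the map from input abstract
# states to output abstract states, here by regression against the exact one.
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(2000):
    inputs, targets = [], []
    for _ in range(64):
        # Sample random input intervals and supervise with the exact transformer.
        l1, u1 = sorted(random.uniform(-10, 10) for _ in range(2))
        l2, u2 = sorted(random.uniform(-10, 10) for _ in range(2))
        inputs.append([l1, u1, l2, u2])
        targets.append(list(interval_add((l1, u1), (l2, u2))))
    loss = nn.functional.mse_loss(model(torch.tensor(inputs)),
                                  torch.tensor(targets))
    opt.zero_grad()
    loss.backward()
    opt.step()

print(interval_add((1.0, 2.0), (3.0, 5.0)))           # exact: (4.0, 7.0)
print(model(torch.tensor([[1.0, 2.0, 3.0, 5.0]])))    # learned approximation

Note that a network trained this way only approximates the exact transformer and, without additional constraints, offers no soundness guarantee; managing that cost/soundness/precision tradeoff is exactly the difficulty the talk addresses for richer relational domains.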

Talk 2

Abstract: Large Language Models (LLMs) have been widely used to automate programming tasks, and their capabilities have been evaluated by assessing code quality through test execution. However, as we will show, success in code synthesis does not imply code reasoning, which is essential for trusting LLMs with tasks that involve program analysis, e.g., test generation and debugging. We therefore propose CodeMind, a framework designed to gauge the code reasoning abilities of LLMs through several inductive reasoning tasks. CodeMind currently supports three tasks: Independent Execution Reasoning (IER), Dependent Execution Reasoning (DER), and Specification Reasoning (SR). Our extensive evaluation of ten LLMs across five benchmarks, in two programming languages and for two code generation tasks (code synthesis and translation), shows that LLMs can, to some degree, explain program execution flow, particularly for simple programs and programs they can correctly generate. However, their performance drops for code with higher complexity, non-trivial logical and arithmetic operators, non-primitive types, and API calls. We observe that, while correlated, code generation ability does not imply code reasoning: ranking LLMs by test passing can differ considerably from ranking them by code reasoning.

References: https://arxiv.org/abs/2402.09664
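
For a concrete sense of what an execution reasoning check involves (a hedged sketch only: the function names and LLM wiring below are placeholders, not CodeMind's actual task format or API, which are described in the linked paper), one could score a model by asking it to predict a program's output on given inputs and comparing against real execution:

def reference_output(program_src: str, test_input):
    # Ground truth: actually execute the program, assumed here to define f().
    namespace = {}
    exec(program_src, namespace)
    return namespace["f"](test_input)

def ask_llm_for_output(program_src: str, test_input):
    # Placeholder for an LLM call: "Given this code and this input, what is the output?"
    raise NotImplementedError("wire this to an LLM of your choice")

def ier_score(program_src: str, inputs):
    # Fraction of inputs for which the model's predicted output matches execution.
    correct = sum(
        ask_llm_for_output(program_src, x) == reference_output(program_src, x)
        for x in inputs
    )
    return correct / len(inputs)

example_program = """
def f(xs):
    total = 0
    for v in xs:
        if v % 2 == 0:
            total += v
    return total
"""
# e.g. ier_score(example_program, [[1, 2, 3, 4], [5, 7], [10, 10]])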
