Research Seminars @ Illinois

Tailored for undergraduate researchers, this calendar is a curated list of research seminars at the University of Illinois. Explore the diverse world of research and expand your knowledge through engaging sessions designed to inspire and enlighten.

To have your events added to or removed from this calendar, please contact the Office of Undergraduate Research (OUR) at ugresearch@illinois.edu.

Statistics Seminar - Garvesh Raskutti, University of Wisconsin, Madison "Scalable feature importance estimation, Shapley values and feature selection"

Event Type

Ceremony/Service

Sponsor

Department of Statistics

Location

106B1 Engineering Hall

Date

Oct 2, 2025 3:30 pm

Originating Calendar

Department of Statistics Event Calendar

Title: Scalable feature importance estimation, Shapley values and feature selection

Abstract: Interpreting so-called "black-box" machine learning algorithms has been a problem of great recent interest. A concrete approach to interpretation that is agnostic to the choice of algorithm or method is to estimate feature importance and perform feature selection. This talk is divided into two parts. The first part focuses on a scalable approach to feature importance estimation through fine-tuning of gradient-based approaches to neural networks and gradient boosted decision trees. Theoretical guarantees are provided by exploiting the neural tangent kernel and the analysis of early stopping for kernel methods. Connections to fine-tuning pre-trained large language models are also briefly discussed. The second part focuses on reliable feature selection by adapting Shapley value estimates to determine whether features are statistically significant. By exploiting connections to DAG-learning, I introduce the MinShap algorithm, which is empirically significantly more reliable than state-of-the-art methods such as leave-one-covariate-out (LOCO) and the generalized covariance measure (GCM).

Bio: Garvesh Raskutti is a Professor in the Department of Statistics at UW Madison. His research interests include interpretable machine learning, high-dimensional statistics and graphical models.
