Computer Science Department Master Calendar

View Full Calendar

Sanjeev Khanna "Sublinear Algorithms for Hierarchical Clustering"

Event Type
Seminar/Symposium
Sponsor
Illinois Computer Science
Location
In person 4403 Siebel Center for Computer
Date
Nov 14, 2022   10:00 am  
Speaker
Sanjeev Khanna, Henry Salvatori Professor of Computer and Information Science, University of Pennsylvania
Contact
Candice Steidinger
E-Mail
steidin2@illinois.edu
Phone
217-300-8564
Views
36
Originating Calendar
Computer Science Speakers Calendar

Abstract: 

Hierarchical clustering is a popular technique for organizing data as a rooted tree structure that simultaneously clusters data at multiple levels of granularity. A well-studied recent objective function views the input as a weighted graph with edges indicating similarity between the data points, and focuses on finding a tree that minimizes the cost of hierarchical partitioning.

The resulting optimization problem is NP-hard, and previous algorithms for approximating this objective require at least linear time/space. In this talk, we will consider algorithms for hierarchical clustering that use sublinear resources (space, time, and communication).

Specifically, we will present sublinear algorithms for hierarchical clustering in the streaming model (space), in the query model (time), and in the MPC model (communication). At the core of our algorithmic results is a connection between hierarchical clustering and a suitably relaxed notion of cut sparsifiers of graphs that lends itself to efficient sublinear algorithms. We complement our algorithmic results by establishing nearly matching lower bounds that rule out algorithms with better performance guarantees in each of these models.

This is joint work with Arpit Agarwal, Huan Li, and Prathamesh Patil.

link for robots only