National Center for Supercomputing Applications WordPress Master Calendar

Research Short Talk: Daniel Kang & Lui Sha, "AIDB: Querying Unstructured Data via SQL"

Event Type
Seminar/Symposium
Sponsor
Illinois Computer Science
Location
For CS faculty only
Date
Dec 5, 2022, 12:00 pm
Originating Calendar
Computer Science Speakers Calendar

Abstract: Analysts and scientists are increasingly interested in automatically analyzing the semantic contents of unstructured, non-tabular data (videos, images, text, and audio). To extract these contents, analysts have turned to machine learning (ML) methods, which can be used in unstructured data analytics systems. Unfortunately, these ML methods require expertise to deploy and can be incredibly expensive to execute.

To address these issues, I have built AIDB, a database that lets users query unstructured data via SQL. In AIDB, a database administrator specifies mappings between virtual columns, which are generated via ML models. The application user can then query the tables in AIDB as they would in any other SQL database. I have also developed new approximation and query optimization techniques to accelerate these ML-based queries, which can provide up to 300x speedups at 95% accuracy.
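The idea of a virtual column backed by a model can be illustrated with a minimal sketch. This is not AIDB's API, which the abstract does not describe; it uses Python's standard sqlite3 module as a stand-in, and the names `count_cars`, `frames`, `frame_objects`, and `num_cars` are hypothetical. The "model" here is a trivial word counter, standing in for an expensive ML model such as an object detector.

```python
import sqlite3


def count_cars(content: str) -> int:
    """Hypothetical stand-in for an ML model; a real system would run inference here."""
    return content.split().count("car")


def build_connection() -> sqlite3.Connection:
    """Create an in-memory database with raw data and a model-backed view."""
    conn = sqlite3.connect(":memory:")
    # Raw "unstructured" data: text descriptions stand in for video frames.
    conn.execute("CREATE TABLE frames (frame_id INTEGER PRIMARY KEY, content TEXT)")
    conn.executemany(
        "INSERT INTO frames (frame_id, content) VALUES (?, ?)",
        [(1, "car car truck"), (2, "bus"), (3, "car bike")],
    )
    # Administrator side: expose the model as a SQL function and define a view
    # whose num_cars column is computed by that function (a "virtual column").
    conn.create_function("count_cars", 1, count_cars)
    conn.execute(
        "CREATE VIEW frame_objects AS "
        "SELECT frame_id, count_cars(content) AS num_cars FROM frames"
    )
    return conn


def frames_with_cars(conn: sqlite3.Connection) -> list:
    """User side: plain SQL over the view, with no knowledge of the model."""
    return conn.execute(
        "SELECT frame_id, num_cars FROM frame_objects "
        "WHERE num_cars > 0 ORDER BY frame_id"
    ).fetchall()


if __name__ == "__main__":
    print(frames_with_cars(build_connection()))  # [(1, 2), (3, 1)]
```

In this sketch the model runs on every row the query touches, which is the cost that the approximation and query optimization techniques mentioned in the abstract aim to reduce.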
