Dr. Minje Kim
Associate Professor, Indiana University
Joint ECE/CS Seminar
Tuesday, April 4, 2023, 10:00-11:00am
B02 CSL and Online via Zoom
Title: Personalized AI Models for Speech and Audio Signal Processing: Toward Data- and Resource Efficiency
Abstract: This talk highlights recent advancements in the emerging field of personalized speech enhancement. By focusing on an individual user's speech characteristics or acoustic environment, personalized models offer more efficient machine learning inference and improved performance compared to general-purpose models. Personalization can also improve fairness for users who are underrepresented in large training datasets. However, personalized speech enhancement presents challenges, such as utilizing personal information from unknown test-time users and addressing the privacy concerns that arise when handling personal data. In this talk, we will explore machine learning solutions to these issues, such as zero- or few-shot learning approaches, data augmentation and purification, self-supervised learning, and knowledge distillation. These methods can improve data and resource efficiency while achieving the desired speech enhancement performance. The talk will also delve into another emerging area of audio research, neural speech and audio coding, where similar challenges, such as computational complexity and privacy issues, persist. Moreover, given the highly perceptual nature of reconstruction tasks, a purely data-driven machine learning approach is not appropriate. To this end, we will investigate the seamless integration of traditional DSP-based coding technology with neural networks to accomplish our ultimate objective: efficient and personalized coding technology.
Bio: Minje Kim is an Associate Professor in the Department of Intelligent Systems Engineering at Indiana University. He leads the Signals and AI Group in Engineering (SAIGE) and is affiliated with the Luddy AI Center, Data Science, Cognitive Science, Statistics, and the Center for Machine Learning. Additionally, he works as a Visiting Academic at Amazon Lab126. He earned his Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign. Prior to joining UIUC, he worked as a researcher at ETRI, a national lab in Korea, from 2006 to 2011. Throughout his career, Dr. Kim has focused on developing machine learning models for audio signal processing applications. He has received various awards, including the NSF CAREER Award (2021), the IU Trustees Teaching Award (2021), the IEEE SPS Best Paper Award (2020), Google and Starkey's grants for outstanding student papers at ICASSP 2013 and 2014, respectively, and the Richard T. Cheng Endowed Fellowship from UIUC in 2011. He is an IEEE Senior Member and serves on the IEEE Audio and Acoustic Signal Processing Technical Committee (2018-2023). He holds editorial positions such as Senior Area Editor for IEEE/ACM Transactions on Audio, Speech, and Language Processing, Associate Editor for EURASIP Journal on Audio, Speech, and Music Processing, and Consulting Associate Editor for IEEE Open Journal of Signal Processing. Dr. Kim is the General Chair of IEEE WASPAA 2023 and has been a reviewer, program committee member, or area chair for major machine learning and signal processing conferences. As an inventor, he is listed on over 60 patents.