Computer Science Speaker Series Master Calendar

Special Seminar: Qiuqiang Kong, "AI for Sound Understanding: Classification, Detection and Separation"

Event Type: Seminar/Symposium
Sponsor: Illinois Computer Science
Virtual
Date: Apr 20, 2023, 10:00 am
Originating Calendar: Computer Science Special Seminar Series

Zoom: https://illinois.zoom.us/j/87330439742?pwd=VVJxYzhBSFExMDljQ0pDQmtld01EQT09

Abstract:
AI for sound understanding has recently become a popular topic. Audio pattern recognition is an essential topic in machine learning and includes audio tagging, acoustic scene classification, music classification, speech emotion classification, and sound event detection. Previous audio pattern recognition systems were built on specific datasets of limited duration. We propose pretrained audio neural networks (PANNs; IEEE SPS Young Author Best Paper 2023), trained on the large-scale AudioSet dataset, to address the general audio pattern recognition problem. PANNs have achieved state-of-the-art performance on several downstream audio pattern recognition tasks. Beyond PANNs, we propose a weakly labelled learning framework that addresses the sound event detection and source separation problems using only large-scale weakly labelled data for training. We propose a universal source separation system to address the computational auditory scene analysis (CASA) problem by automatically detecting and separating arbitrary sounds. We further propose using natural language queries to separate sounds. We apply the proposed sound understanding techniques to music tasks and build a state-of-the-art piano transcription system and GiantMIDI-Piano, the largest piano dataset in the world. We also outline future work on sound understanding in relation to vision, language, robotics, and security.

Bio:
Qiuqiang Kong received his Ph.D. degree from the University of Surrey, Guildford, UK, in 2019. Following his Ph.D., he joined ByteDance as a research scientist. His research topics include the classification, detection, separation, and generation of general sounds and music. He was listed among the top 2% of scientists in 2021 in “Updated science-wide author databases of standardized citation indicators.” He is known for developing pretrained audio neural networks (PANNs) for audio tagging and was awarded the IEEE SPS Young Author Best Paper in 2023. He won the Detection and Classification of Acoustic Scenes and Events (DCASE) challenge in 2017. He is also known for transcribing GiantMIDI-Piano, the largest piano MIDI dataset in the world. He has co-authored over 50 papers in journals and conferences, including IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), ICASSP, INTERSPEECH, IJCAI, DCASE, EUSIPCO, and LVA-ICA. As of Feb. 2023, his work had been cited 2588 times, with an H-index of 27. He has been a frequent reviewer for well-known journals and conferences, including TASLP, TMM, SPL, TKDD, JASM, EURASIP, Neurocomputing, Neural Networks, ISMIR, and CSMT. He assisted with organizing LVA-ICA 2018 in Guildford, UK and the DCASE 2018 Workshop in Woking, UK. He is serving as a co-editor for the Frontiers in Signal Processing journal.

Faculty Host: Paris Smaragdis

Meeting ID: 873 3043 9742; Password: csillinois
