Title: Post-Training Detection of Backdoor Attacks
Abstract: Deep neural network (DNN) classifiers have achieved state-of-the-art performance in many
applications. However, they have also been shown to be vulnerable to adversarial attacks. Backdoor
(Trojan) attack is an important type of adversarial attack that induces test-time misclassification to a
victim DNN classifier by embedding a trigger pattern in test samples. It can be easily launched by
poisoning the classifier’s training set with only a few samples embedded with the same trigger pattern.
A successful backdoor attack will cause negligible degradation to the classifier’s accuracy on clean,
trigger-free samples (e.g., those used for validation). Thus, detection of backdoor attacks is a very important problem.
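To make the poisoning mechanism concrete, here is a minimal sketch (not the speaker's actual attack code) of how a dirty-label backdoor attack could be launched: the same small trigger patch is embedded in a few training images, which are then relabeled to an attacker-chosen target class. The function names, patch placement, and grayscale-image assumption are all illustrative.

```python
import numpy as np

def embed_trigger(image, trigger, top=0, left=0):
    """Paste a small trigger patch onto an image (assumed an HxW array)."""
    poisoned = image.copy()
    h, w = trigger.shape
    poisoned[top:top + h, left:left + w] = trigger
    return poisoned

def poison_dataset(images, labels, trigger, target_class, n_poison, rng=None):
    """Embed the same trigger in a few training samples and relabel them
    to the attacker's target class (dirty-label backdoor poisoning)."""
    rng = np.random.default_rng(rng)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images, labels = images.copy(), labels.copy()
    for i in idx:
        images[i] = embed_trigger(images[i], trigger)
        labels[i] = target_class
    return images, labels
```

Because only a handful of samples are modified and the trigger is small, a classifier trained on the poisoned set can still achieve near-normal accuracy on clean samples, which is what makes such attacks hard to notice at validation time.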
In today’s talk, I will identify three scenarios where defenses against backdoor attacks can be deployed.
In particular, I will focus on the most challenging “post-training” scenario, where the defender is the
user of a downstream app or a legacy system who has no access to the classifier’s training set, yet
wants to detect if the classifier is backdoor attacked. I will first present a reverse-engineering-based
approach with unsupervised detection inference. Then, I will focus on an even more challenging “two-class”
scenario and present a method using a novel statistic based on the transferability of adversarial
perturbation between samples. Finally, I will conclude with a discussion of future work on backdoor attacks and defenses.
Bio: Zhen Xiang is a final-year PhD student in the Department of Electrical Engineering at Pennsylvania
State University, advised by Prof. David J. Miller and Prof. George Kesidis. His research interests
include security of machine learning and statistical signal processing. Prior to Pennsylvania State
University, he received his B.Sc. degree in Electronics and Computer Engineering from Hong Kong
University of Science and Technology in 2014, and his M.Sc. degree in Electrical Engineering from
University of Pennsylvania in 2016.
Meeting ID: 848 2046 8861