9:40 AM - 10:00 AM
[3L1-GS-11-03] Backdoor Attacks using the Concepts as a Trigger
Keywords: poisoning attacks, backdoor attacks, AI reliability
Backdoor attacks are a class of attacks against machine learning models: a backdoored model misclassifies any input that contains a certain trigger (e.g., a noise pattern or visual patch). In this paper, we propose a backdoor attack that uses concepts as triggers, in order to clarify the vulnerabilities of machine learning models and to promote discussion on improving their security. Concepts are interpretable attributes contained in a sample; for example, hair color and smiling are concepts of facial images. In existing research, most triggers are artificially generated patterns that do not appear in the physical world, whereas concept triggers are natural attributes already present in the data, so the resulting poisoning samples look natural and stealthy. In our experiments, we demonstrate that concepts can be leveraged as triggers by evaluating the attack success rate of the proposed method and its robustness against existing defense methods.
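As a rough illustration of the idea described above, the following is a minimal sketch of concept-triggered data poisoning. It is not the authors' code: the dataset layout, the function name poison_with_concept, the concept index, and the poison rate are all illustrative assumptions. The point it shows is that, unlike patch-based backdoors, no pixels are modified; only the labels of samples that naturally exhibit the trigger concept are flipped to the attacker's target class.

import numpy as np

def poison_with_concept(images, labels, concepts, trigger_concept_idx,
                        target_label, poison_rate=0.1, seed=0):
    """Relabel a fraction of samples that naturally contain the trigger concept.

    The trigger is an interpretable attribute (e.g., "Smiling") already present
    in the image, so the poisoned samples look natural and stealthy.
    """
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    # Indices of samples annotated with the trigger concept.
    candidates = np.flatnonzero(concepts[:, trigger_concept_idx] == 1)
    n_poison = int(poison_rate * len(candidates))
    chosen = rng.choice(candidates, size=n_poison, replace=False)
    labels[chosen] = target_label  # flip labels; images stay untouched
    return images, labels, chosen

# Toy usage with random arrays standing in for facial images and
# CelebA-style binary concept annotations (hypothetical layout).
if __name__ == "__main__":
    n, n_concepts = 1000, 40
    images = np.random.rand(n, 64, 64, 3)
    labels = np.random.randint(0, 2, size=n)
    concepts = np.random.randint(0, 2, size=(n, n_concepts))
    _, poisoned_labels, idx = poison_with_concept(
        images, labels, concepts,
        trigger_concept_idx=31,  # assumed index of the trigger concept
        target_label=1, poison_rate=0.1)
    print(f"poisoned {len(idx)} of {n} samples")

A victim model trained on such data would, under this setup, learn to associate the trigger concept with the target class while behaving normally on inputs that lack the concept.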