Escaping Filter-based Adversarial Example Defense: A Reinforcement Learning Approach

Document Type

Article

Publication Date

12-19-2022

Department

Department of Computer Science

Abstract

An adversarial example is a specially crafted input with subtle, intentional perturbations that cause a machine learning model to misclassify it. A plethora of papers have proposed using filters to defend against adversarial example attacks. In this paper, however, we demonstrate that filter-based defenses may not be reliable. We develop AEDescaptor, a scheme to escape filter-based defenses. AEDescaptor uses a specially crafted policy gradient reinforcement learning algorithm to generate adversarial examples even when filters interrupt the backpropagation channel used by traditional adversarial example attack algorithms. Furthermore, we design a customized algorithm that reduces the action space of the policy gradient reinforcement learning to accelerate AEDescaptor training while still ensuring that AEDescaptor generates successful adversarial examples. Extensive experiments demonstrate that AEDescaptor-generated adversarial examples perform well (in terms of success rate and transferability) in escaping filter-based defenses.
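
The following Python sketch illustrates the core idea the abstract describes, under stated assumptions; it is not the authors' implementation. Because a filter placed in front of the classifier breaks the gradient path that classic attacks backpropagate through, the attacker can instead train a REINFORCE-style policy that needs only black-box queries to the defended pipeline, sampling per-pixel perturbations from a small, reduced action set. All names here (query_defended_model, N_PIXELS, ACTIONS, the toy linear classifier) are illustrative assumptions.

    # Minimal REINFORCE sketch of a query-only attack on a defended model.
    # Assumption: the defended pipeline (filter + classifier) is a black box
    # that returns the probability assigned to the true class.
    import numpy as np

    rng = np.random.default_rng(0)

    N_PIXELS = 64                          # flattened toy image size
    ACTIONS = np.array([-0.1, 0.0, 0.1])   # reduced per-pixel action set
    theta = np.zeros((N_PIXELS, len(ACTIONS)))  # policy logits

    def query_defended_model(x):
        """Stand-in for the filter + classifier pipeline (hypothetical).
        Returns the true-class probability; only queries, no gradients."""
        w = np.linspace(-1.0, 1.0, N_PIXELS)      # toy fixed classifier
        return 1.0 / (1.0 + np.exp(-float(w @ x)))

    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    x0 = rng.uniform(0.4, 0.6, N_PIXELS)   # clean input to perturb
    lr, baseline = 0.5, 0.0

    for step in range(200):
        probs = softmax(theta)                           # pi(action | pixel)
        acts = np.array([rng.choice(len(ACTIONS), p=p) for p in probs])
        x_adv = np.clip(x0 + ACTIONS[acts], 0.0, 1.0)    # apply perturbation
        # Reward: how much the attack suppresses the true-class probability.
        reward = 1.0 - query_defended_model(x_adv)
        # REINFORCE update: grad log pi(a) = one_hot(a) - softmax(theta),
        # scaled by the advantage (reward minus a running baseline).
        grad_logp = -probs
        grad_logp[np.arange(N_PIXELS), acts] += 1.0
        theta += lr * (reward - baseline) * grad_logp
        baseline = 0.9 * baseline + 0.1 * reward

    print("final true-class probability:", query_defended_model(x_adv))

The design choice to shrink the per-pixel action set to a few discrete perturbation levels mirrors the abstract's action-space reduction: a smaller policy output keeps the number of black-box queries needed for training manageable.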

Publication Title

2022 IEEE 19th International Conference on Mobile Ad Hoc and Smart Systems (MASS)
