Entity Backdoor Attacks Against Fine-Tuned Models

Document Type

Conference Proceeding

Publication Date

1-1-2025

Abstract

Fine-tuning is a training paradigm that allows large models to achieve strong performance on downstream tasks with only a small number of samples and little training time. However, this study reveals that models fine-tuned from pre-trained models are vulnerable to a new threat called the entity backdoor attack. An entity backdoor attack is a new type of backdoor attack in which any instance of a given entity can trigger the backdoor. More importantly, the poisoned examples are visually similar to clean examples. For example, an entity backdoor attack can use a husky dog (which belongs to the dog entity) to trigger the stop sign class in a traffic recognition task, yet the poisoned example in the training dataset looks like a stop sign. The advantages of entity backdoor attacks over traditional backdoor attacks are twofold. First, entity backdoor attacks are triggered more stealthily because they do not require a specially defined trigger pattern superimposed on a normal image: the instance (e.g., a husky dog) itself is the trigger, and presenting the instance directly activates the backdoor. Second, the poisoned examples in the training datasets of entity backdoor attacks are stealthier because they are generated with very small perturbations, making them hard to distinguish from clean examples. Experiments on multiple datasets show that systems using fine-tuned models are vulnerable to the threat of entity backdoor attacks.
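
The abstract describes crafting poisoned training examples with very small perturbations so that they remain visually indistinguishable from clean target-class images while encoding an entity instance. The sketch below is one plausible illustration of such perturbation-based poisoning using a feature-collision-style objective against a frozen pre-trained backbone; the backbone choice, hyperparameters, and function names are assumptions for illustration, not the paper's exact method.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Minimal sketch (assumption-based): craft a poisoned example that looks like a
# clean target-class image (e.g., a stop sign) but whose features resemble an
# entity instance (e.g., a husky) under a frozen pre-trained feature extractor.

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen pre-trained backbone used as the feature extractor (illustrative choice).
backbone = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
backbone.fc = torch.nn.Identity()          # drop the classification head
backbone.eval().to(device)
for p in backbone.parameters():
    p.requires_grad_(False)

def craft_poison(clean_target, entity_instance, eps=8 / 255, steps=200, lr=0.01):
    """Return a poisoned example inside an L-infinity ball of `clean_target`
    whose backbone features are close to those of `entity_instance`."""
    with torch.no_grad():
        entity_feat = backbone(entity_instance)
    delta = torch.zeros_like(clean_target, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        poison = (clean_target + delta).clamp(0, 1)
        loss = F.mse_loss(backbone(poison), entity_feat)  # feature-collision loss
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Keep the perturbation small so the poison stays visually clean.
        with torch.no_grad():
            delta.clamp_(-eps, eps)
    return (clean_target + delta).detach().clamp(0, 1)

# Usage with random tensors standing in for real images (1 x 3 x 224 x 224).
stop_sign = torch.rand(1, 3, 224, 224, device=device)   # clean target-class image
husky = torch.rand(1, 3, 224, 224, device=device)       # entity instance
poisoned = craft_poison(stop_sign, husky)
```

The bounded perturbation is what keeps the poisoned example hard to distinguish from the clean stop-sign image, while the feature-level objective is what lets an unmodified entity instance activate the backdoor after fine-tuning.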

Publication Title

Lecture Notes in Computer Science

ISBN

9789819500086
