SIAMESE: Stealing Fine-Tuned Visual Foundation Models via Diversified Prompting

Document Type

Conference Proceeding

Publication Date

12-3-2025

Abstract

Visual foundation models, characterized by their robust generalization and adaptability, serve as the basis for a wide array of downstream tasks. When fine-tuned for specific tasks, these models encapsulate confidential and valuable task-specific knowledge, making them prime targets for model stealing (MS) attacks. While recent efforts have exposed MS threats in practical scenarios such as data-free and hard-label contexts, these attacks predominantly target traditional victim models trained from scratch. Fine-tuned visual foundation models, pre-trained on vast and diverse datasets and then fine-tuned on downstream tasks, pose significant challenges for traditional MS attacks seeking to extract task-specific knowledge. In this paper, we introduce an innovative MS attack, named SIAMESE, to steal fine-tuned visual foundation models under black-box, data-free, and hard-label settings. The core approach of SIAMESE involves constructing a stolen model using a foundation model that is efficiently and concurrently fine-tuned with multiple diversified soft prompts. To integrate the knowledge derived from these prompts, we propose a novel and tractable loss function that analyzes the output distributions while enforcing orthogonality among the prompts to minimize interference. Additionally, a unique alignment module enhances SIAMESE by synchronizing interpretations between the victim and stolen models. Extensive experiments validate that SIAMESE outperforms state-of-the-art baseline attacks by over 10% in accuracy, exposing the heightened vulnerability of fine-tuned visual foundation models to MS threats.
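
The abstract does not reproduce the paper's loss function; as a rough illustration of the orthogonality idea it describes, below is a minimal, hypothetical PyTorch sketch. All names here (prompt_orthogonality_loss, the (K, L, D) prompt tensor layout, and the cosine-Gram penalty) are illustrative assumptions, not the authors' implementation, and the actual SIAMESE objective additionally integrates the output distributions of the prompts.

```python
import torch
import torch.nn.functional as F

def prompt_orthogonality_loss(prompts: torch.Tensor) -> torch.Tensor:
    """Penalize pairwise similarity among K soft prompts.

    prompts: (K, L, D) tensor of K soft prompts, each with L tokens
    of embedding dimension D. Returns the mean squared off-diagonal
    entry of the Gram matrix of flattened, L2-normalized prompts, so
    the penalty is zero only when the prompts are mutually orthogonal.
    """
    K = prompts.shape[0]
    flat = F.normalize(prompts.reshape(K, -1), dim=1)  # (K, L*D), unit norm
    gram = flat @ flat.T                               # (K, K) cosine similarities
    off_diag = gram - torch.eye(K, device=prompts.device)
    return off_diag.pow(2).mean()
```

In a full stealing pipeline, a term of this kind would presumably be weighted and added to the distillation loss computed from the victim's hard-label outputs, discouraging the concurrently trained prompts from collapsing onto overlapping directions.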

Publication Title

SEC 2025: Proceedings of the 2025 10th ACM/IEEE Symposium on Edge Computing

ISBN

979-8-4007-2238-7
