A Generalized Hardware Debugging Approach for Large Language Models with Semi-Synthetic Datasets
Document Type
Article
Publication Date
11-26-2024
Department
Department of Electrical and Computer Engineering
Abstract
Large Language Models (LLMs) have precipitated emerging trends toward intelligent automation. However, integrating LLMs into the hardware debugging domain faces a challenge: datasets for hardware-focused LLMs suffer from a dual dilemma of scarcity and subpar quality. Traditional hardware debugging approaches that rely on experienced engineers to craft detailed prompts do not scale cheaply. Likewise, strategies that depend on existing LLMs and randomly generated prompts fail to achieve sufficient reliability. We propose a directed, semi-synthetic data synthesis method that leverages version control information and journalistic event descriptions. To produce high-quality data, the approach combines version control data from hardware projects with the 5W1H (Who, What, When, Where, Why, How) journalistic principles, allowing dataset volume to scale linearly without depending on skilled labor. We have applied this method to a collected corpus of open-source hardware designs and fine-tuned fifteen general-purpose LLMs for hardware debugging tasks, thereby validating the efficacy of our approach.
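The abstract describes pairing version control records with the 5W1H journalistic principles to synthesize debugging data. As a rough illustration only, and not the authors' actual pipeline, the Python sketch below maps a hypothetical commit record onto the six 5W1H fields to form a training prompt; the CommitRecord fields, the commit_to_5w1h_prompt helper, and the example FIFO fix are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CommitRecord:
    """Minimal slice of version-control metadata (illustrative fields only)."""
    author: str         # Who made the change
    message: str        # What the change does
    date: str           # When it was committed
    files: List[str]    # Where in the design it applies
    issue_summary: str  # Why it was needed (e.g., linked bug report)
    diff: str           # How the fix was implemented


def commit_to_5w1h_prompt(c: CommitRecord) -> str:
    """Render a commit as a 5W1H-structured hardware-debugging prompt."""
    return (
        f"Who:   {c.author}\n"
        f"What:  {c.message}\n"
        f"When:  {c.date}\n"
        f"Where: {', '.join(c.files)}\n"
        f"Why:   {c.issue_summary}\n"
        f"How:\n{c.diff}\n"
        "Task: explain the hardware bug this change fixes and why the fix works."
    )


if __name__ == "__main__":
    # Hypothetical example; the repository, issue number, and diff are made up.
    example = CommitRecord(
        author="maintainer",
        message="Fix FIFO overflow when write enable is asserted during reset",
        date="2024-03-01",
        files=["rtl/fifo.v"],
        issue_summary="Issue report: data corruption after back-to-back resets",
        diff="- if (wr_en)\n+ if (wr_en && !rst)",
    )
    print(commit_to_5w1h_prompt(example))
```

A generator along these lines can be run over every commit in a project's history, which is one plausible reading of how dataset volume scales linearly with available version control data rather than with expert labor.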
Publication Title
IEEE Transactions on Circuits and Systems I: Regular Papers
Recommended Citation
Fu, W., Li, S., Zhao, Y., Yang, K., Zhang, X., Jin, Y., & Guo, X. (2024). A Generalized Hardware Debugging Approach for Large Language Models with Semi-Synthetic Datasets. IEEE Transactions on Circuits and Systems I: Regular Papers. http://doi.org/10.1109/TCSI.2024.3487486
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/1315