Enabling Memory-Augmented Neural Networks for Efficient Edge Applications
Document Type
Article
Publication Date
1-1-2025
Abstract
Deep learning-based networks have achieved remarkable success in machine learning, demonstrating their effectiveness in numerous application domains, including computer vision, natural language processing, and big data analysis [1-3]. The accuracy of deep learning-based systems, however, relies on substantial computational resources and memory capacities in both the training and inference stages. More specifically, using deep neural networks (DNNs) involves computationally expensive training, in which millions of parameters are determined through repeated parameter adjustment and fine-tuning; these computations also demand significant memory capacity. In the inference phase, the model computations for obtaining the output(s) from the inputs must be carried out for every query, and the computation cost remains very high, mainly due to, for example, the high dimensionality of the input data (e.g., a high-resolution image or a long text) and the large number of tensor computations that must be performed [4, 5]. To improve accuracy, increasingly complex network architectures may need to be employed, which exacerbates the problem. All of these factors make deploying DNNs in hardware-implemented systems a challenging task.
Publication Title
AI-Enabled Electronic Circuit and System Design: From Ideation to Utilization
ISBN
9783031714368, 9783031714351
Recommended Citation
Zenozian, E., Kamal, M., Afzali-Kusha, A., & Pedram, M. (2025). Enabling Memory-Augmented Neural Networks for Efficient Edge Applications. AI-Enabled Electronic Circuit and System Design: From Ideation to Utilization, 565-605. http://doi.org/10.1007/978-3-031-71436-8_16
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/1645