Enabling Memory-Augmented Neural Networks for Efficient Edge Applications
Document Type
Article
Publication Date
1-1-2025
Abstract
Deep learning-based networks have achieved remarkable success in machine learning, demonstrating their effectiveness in numerous application domains, including computer vision, natural language processing, and big data analysis [1-3]. The accuracy of deep learning-based systems, however, relies on substantial computational resources and memory capacities in both the training and inference stages. More specifically, using deep neural networks (DNNs) involves computationally expensive training, in which millions of parameters are determined through repeated parameter adjustment and fine-tuning; these computations also demand significant memory capacity. In the inference phase, the model computations for obtaining the output(s) from the inputs must be carried out for every query, and the computation cost remains very high, mainly due to, for example, the high dimensionality of the input data (e.g., a high-resolution image or a long text) and the large number of tensor computations that must be performed [4, 5]. To improve accuracy, increasingly complex network architectures may need to be employed, which exacerbates the problem. All of these factors make deploying DNNs in hardware-implemented systems a challenging task.
Publication Title
AI-Enabled Electronic Circuit and System Design: From Ideation to Utilization
ISBN
9783031714368, 9783031714351
Recommended Citation
Zenozian, E., Kamal, M., Afzali-Kusha, A., & Pedram, M. (2025). Enabling Memory-Augmented Neural Networks for Efficient Edge Applications. AI-Enabled Electronic Circuit and System Design: From Ideation to Utilization, 565-605. http://doi.org/10.1007/978-3-031-71436-8_16
Retrieved from: https://digitalcommons.mtu.edu/michigantech-p2/1645