
StreamSVD: Low-rank approximation and streaming accelerator co-design

Abstract

The post-training compression of a Convolutional Neural Network (CNN) aims to produce Pareto-optimal designs on the accuracy-performance frontier when access to training data is not possible. Low-rank approximation is one of the methods often utilised in such cases. However, existing work considers the low-rank approximation of the network and the optimisation of the hardware accelerator separately, leading to systems with sub-optimal performance. This work focuses on the efficient mapping of a CNN onto an FPGA device, and presents StreamSVD, a model-accelerator co-design framework. The framework simultaneously considers the compression of a CNN model through a hardware-aware low-rank approximation scheme and the optimisation of the hardware accelerator's architecture, taking into account the approximation scheme's compute structure. Our results show that the co-designed StreamSVD outperforms existing work that utilises similar low-rank approximation schemes by providing a better accuracy-throughput trade-off. The proposed framework also achieves competitive performance compared with other post-training compression methods, even outperforming them in certain cases.
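
For context, the sketch below illustrates the kind of SVD-based low-rank factorisation of a convolutional layer that such post-training compression relies on: a k x k convolution is split into a narrower k x k convolution followed by a 1 x 1 convolution. This is a generic illustration in Python/NumPy, not the paper's hardware-aware scheme; the function name, shapes, and rank choice are assumptions for the example.

    # Minimal sketch of generic SVD-based low-rank factorisation of a conv layer.
    # Not StreamSVD's exact scheme; shapes, names and the rank are illustrative.
    import numpy as np

    def lowrank_factor_conv(weight: np.ndarray, rank: int):
        """Split a conv weight (C_out, C_in, k, k) into two smaller convolutions.

        Returns (w1, w2) where
          w1: (rank, C_in, k, k)   -- a k x k convolution with `rank` output channels
          w2: (C_out, rank, 1, 1)  -- a 1 x 1 convolution restoring C_out channels
        so that applying w1 then w2 approximates the original layer.
        """
        c_out, c_in, k, _ = weight.shape
        mat = weight.reshape(c_out, c_in * k * k)           # flatten input/spatial dims
        u, s, vt = np.linalg.svd(mat, full_matrices=False)  # exact SVD of the weight matrix
        u_r, s_r, vt_r = u[:, :rank], s[:rank], vt[:rank]   # keep the top-`rank` components
        w1 = vt_r.reshape(rank, c_in, k, k)                 # first conv: k x k, rank outputs
        w2 = (u_r * s_r).reshape(c_out, rank, 1, 1)         # second conv: 1 x 1, C_out outputs
        return w1, w2

    # Example: rank-16 approximation of a 64 -> 128 channel 3x3 layer
    w = np.random.randn(128, 64, 3, 3).astype(np.float32)
    w1, w2 = lowrank_factor_conv(w, rank=16)
    approx = (w2.reshape(128, 16) @ w1.reshape(16, -1)).reshape(w.shape)
    print(np.linalg.norm(w - approx) / np.linalg.norm(w))   # relative approximation error

The truncation rank is the knob that trades approximation accuracy against the amount of compute and buffering mapped onto the accelerator, which is the trade-off the co-design framework explores.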
