StreamSVD: Low-rank Approximation and Streaming Accelerator Co-design
If you have a question about this talk, please contact George A Constantinides.

Post-training compression of a Convolutional Neural Network (CNN) aims to produce Pareto-optimal designs on the accuracy-performance frontier when access to the training data is not available. Low-rank approximation is one of the methods often used in such cases. However, existing work treats the low-rank approximation of the network and the optimisation of the hardware accelerator as separate problems, which leads to systems with sub-optimal performance.

This work focuses on the efficient mapping of a CNN onto an FPGA device and presents StreamSVD, a model-accelerator co-design framework. The framework simultaneously considers the compression of the CNN model through a hardware-aware low-rank approximation scheme and the optimisation of the hardware accelerator's architecture, taking into account the compute structure of the approximation scheme. Our results show that the co-designed StreamSVD outperforms existing work that uses similar low-rank approximation schemes by providing a better accuracy-throughput trade-off. The proposed framework also achieves competitive performance compared with other post-training compression methods, and outperforms them in certain cases.

This talk is part of the CAS Talks series.
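To illustrate why low-rank approximation reduces the compute that an accelerator has to provide, the sketch below counts multiply-accumulate operations (MACs) for a convolutional layer before and after factorisation. It assumes one common decomposition, in which a k x k convolution is split into a k x k convolution with `rank` output channels followed by a 1 x 1 convolution; the abstract does not specify the scheme StreamSVD uses, so the function names, the decomposition and the layer sizes here are illustrative assumptions only.

```python
def conv_macs(c_in, c_out, k, h_out, w_out):
    """MACs for a standard k x k convolution producing an h_out x w_out map."""
    return c_in * c_out * k * k * h_out * w_out


def low_rank_conv_macs(c_in, c_out, k, h_out, w_out, rank):
    """MACs for an assumed two-stage factorisation of the same layer.

    Stage 1: k x k convolution from c_in to `rank` channels.
    Stage 2: 1 x 1 convolution from `rank` to c_out channels.
    """
    stage1 = c_in * rank * k * k * h_out * w_out
    stage2 = rank * c_out * h_out * w_out
    return stage1 + stage2


def break_even_rank(c_in, c_out, k):
    """Rank below which the factorised layer needs fewer MACs than the original."""
    return (c_in * c_out * k * k) / (c_in * k * k + c_out)


# Hypothetical layer: 64 -> 128 channels, 3 x 3 kernel, 56 x 56 output map.
dense = conv_macs(64, 128, 3, 56, 56)
factorised = low_rank_conv_macs(64, 128, 3, 56, 56, rank=32)
print(f"dense MACs:      {dense}")
print(f"factorised MACs: {factorised}")
print(f"ratio:           {factorised / dense:.3f}")
print(f"break-even rank: {break_even_rank(64, 128, 3):.1f}")
```

The chosen rank trades accuracy against compute: a smaller rank lowers the MAC count but discards more of the original weights, which is the accuracy-throughput trade-off the abstract refers to.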