
AutoWS: Automate Weights Streaming in Layer-wise Pipelined DNN Accelerators


If you have a question about this talk, please contact George A Constantinides.

With the great success of Deep Neural Networks (DNNs), the design of efficient hardware accelerators has attracted wide interest in the research community. Existing research explores two architectural strategies: sequential layer execution and layer-wise pipelining. While the former supports a wider range of models, the latter is favoured for its enhanced customization and efficiency. A challenge for the layer-wise pipelining architecture is its substantial demand for on-chip memory for weight storage, which impedes the deployment of large-scale networks on resource-constrained devices. This paper introduces AutoWS, a pioneering memory management methodology that exploits both on-chip and off-chip memory to optimize weight storage within a layer-wise pipelining architecture, taking advantage of its static schedule. Through a comprehensive investigation of both the hardware design and the Design Space Exploration, our methodology is fully automated and enables the deployment of large-scale DNN models on resource-constrained devices, which is not possible with existing works targeting layer-wise pipelining architectures. AutoWS is open-source: https://github.com/Yu-Zhewen/AutoWS
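To make the on-chip/off-chip trade-off concrete, below is a minimal Python sketch of the placement problem the abstract describes: given each layer's weight footprint and a fixed on-chip memory budget, decide which layers keep their weights resident on chip and which stream them from off-chip memory. This is an illustrative greedy heuristic with a toy cost model and hypothetical names, not the actual Design Space Exploration used by AutoWS.

    # Illustrative sketch only: a greedy on-chip/off-chip weight placement
    # heuristic. All names and the cost model are hypothetical, not AutoWS's.
    from dataclasses import dataclass

    @dataclass
    class Layer:
        name: str
        weight_bytes: int        # size of the layer's weights
        accesses_per_image: int  # times the weights are re-read per inference

    def place_weights(layers: list[Layer], on_chip_budget: int) -> dict[str, str]:
        """Greedily keep on chip the weights whose streaming would cost the
        most off-chip traffic per byte of on-chip memory spent."""
        placement = {}
        remaining = on_chip_budget
        # Off-chip traffic saved per byte of budget spent is, for weights,
        # simply the per-inference access count, so rank by that.
        ranked = sorted(layers, key=lambda l: l.accesses_per_image, reverse=True)
        for layer in ranked:
            if layer.weight_bytes <= remaining:
                placement[layer.name] = "on-chip"
                remaining -= layer.weight_bytes
            else:
                placement[layer.name] = "stream"  # fetched from off-chip on each use
        return placement

    if __name__ == "__main__":
        # Toy network with 1 MiB of on-chip weight storage.
        net = [
            Layer("conv1", weight_bytes=256 * 1024, accesses_per_image=4),
            Layer("conv2", weight_bytes=768 * 1024, accesses_per_image=2),
            Layer("fc", weight_bytes=2 * 1024 * 1024, accesses_per_image=1),
        ]
        print(place_weights(net, on_chip_budget=1024 * 1024))
        # -> {'conv1': 'on-chip', 'conv2': 'on-chip', 'fc': 'stream'}

In the methodology itself the placement must additionally respect the pipeline's static schedule and the available off-chip bandwidth; the sketch captures only the basic partitioning trade-off.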

This talk is part of the CAS Talks series.
