
From Sparsity to HBM: HPIPE's Advancements in FPGA CNN Acceleration


If you have a question about this talk, please contact George A Constantinides.

HPIPE is a state-of-the-art sparsity-aware CNN accelerator designed for FPGAs. Diverging from the conventional approach in which generic processing elements collectively handle one layer at a time, HPIPE's compiler statically allocates device resources, building custom hardware for each layer of the CNN. The compiler also exploits sparsity, allowing the accelerator to skip unnecessary multiplications with weights that are zero or near zero. Recent work has extended HPIPE to the AI-optimized Stratix 10 NX, harnessing its tensor block architecture of 30 INT8 multipliers per tensor block to achieve even higher performance. However, HPIPE requires all weights to reside on-chip, which introduces memory constraints for larger networks such as ResNets. To overcome this limitation, HPIPE has been augmented to partition CNNs across multiple FPGAs that communicate over Ethernet, multiplying the available on-chip memory and DSPs. Finally, the presentation will delve into the latest efforts to integrate High Bandwidth Memory (HBM) support into HPIPE. This enhancement will not only relax memory-related restrictions but also decouple memory capacity from compute, so that HPIPE can be deployed on smaller FPGAs while accommodating larger networks.
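For readers unfamiliar with weight-sparsity skipping, the short Python sketch below illustrates the general idea the abstract describes: because CNN weights are fixed at compile time, the positions of non-zero weights can be found once and only those multiplications issued. This is an illustrative sketch only, not HPIPE's hardware implementation; the function names are invented for this example.

    import numpy as np

    def dense_dot(weights, activations):
        # Baseline: every weight takes part in a multiply-accumulate,
        # including weights that are zero (wasted work).
        acc = 0.0
        for w, a in zip(weights, activations):
            acc += w * a
        return acc

    def sparse_dot(weights, activations):
        # Sparsity-aware: find the non-zero weight positions once
        # (offline, since weights are static) and only multiply those.
        nonzero_idx = np.flatnonzero(weights)
        acc = 0.0
        for i in nonzero_idx:
            acc += weights[i] * activations[i]
        return acc

    # A heavily pruned weight vector needs far fewer multiplies.
    w = np.array([0, 0, 0.5, 0, -1.25, 0, 0, 0.75])
    a = np.arange(8, dtype=float)
    assert dense_dot(w, a) == sparse_dot(w, a)
    print(f"multiplies: dense={len(w)}, sparse={len(np.flatnonzero(w))}")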

Online: https://utoronto.zoom.us/j/82613229697

Speaker’s Bio: Mario Doumet is an MASc student in Computer Engineering at the University of Toronto under the supervision of Prof. Vaughn Betz. His research focuses on AI acceleration and dataflow architectures. Over the course of his Master’s, Mario completed an internship at Intel Labs’ Parallel Computing Lab (PCL), working on distributed ML training using FPGAs as smart NICs. Prior to starting his Master’s, he completed his B.Eng. at the American University of Beirut, along with a year of research in collaboration with the MIT Media Lab, where he helped develop the world’s first battery-free wireless underwater camera.

This talk is part of the CAS Talks series.




