
Custom precision FPGA and GPU Multiple Precision Training (MuPT) in Caffe


If you have a question about this talk, please contact George A Constantinides.

MuPT enables low-power training by using reduced-precision matrix multiplication during the earlier stages of CNN training. This work discusses the design process of the FPGA hardware, including the successive stages of optimization for resource utilization, scalability, memory bottlenecks, and Caffe integration.
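The core idea can be sketched as a matrix multiply whose inputs are quantized to a narrow fixed-point format while products are accumulated in a wider type. The 8-bit width, scale factors, and function names below are illustrative assumptions for exposition, not the precision scheme actually used in MuPT.

```cpp
#include <cstdint>
#include <cmath>
#include <vector>

// Quantize a float to signed 8-bit fixed point with a given scale.
// (Bit-width and round-to-nearest are assumptions, not the talk's design.)
int8_t quantize(float x, float scale) {
    float q = std::round(x / scale);
    if (q > 127.0f) q = 127.0f;
    if (q < -128.0f) q = -128.0f;
    return static_cast<int8_t>(q);
}

// Reduced-precision matrix multiply: int8 inputs, int32 accumulation,
// dequantized back to float at the output. A is M x K, B is K x N.
std::vector<float> matmul_q8(const std::vector<float>& A,
                             const std::vector<float>& B,
                             int M, int K, int N, float sa, float sb) {
    std::vector<int8_t> qa(M * K), qb(K * N);
    for (int i = 0; i < M * K; ++i) qa[i] = quantize(A[i], sa);
    for (int i = 0; i < K * N; ++i) qb[i] = quantize(B[i], sb);

    std::vector<float> C(M * N, 0.0f);
    for (int i = 0; i < M; ++i)
        for (int j = 0; j < N; ++j) {
            int32_t acc = 0;  // wide accumulator prevents overflow of int8 products
            for (int k = 0; k < K; ++k)
                acc += int32_t(qa[i * K + k]) * int32_t(qb[k * N + j]);
            C[i * N + j] = acc * sa * sb;  // dequantize the result
        }
    return C;
}
```

Narrow multipliers of this kind map onto far fewer FPGA DSP and LUT resources than single-precision floating point, which is where the power saving comes from.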

Because layer sizes vary across networks, the matrix multiply is implemented with tiling, together with a shift-register-based systolic array described in Vivado HLS. All CPU-FPGA communication is implemented through the Xilinx OpenCL API.
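The tiling structure can be sketched in plain C++ as below: the outer loops walk over fixed-size tiles so each tile fits in on-chip buffers, and the innermost loops are the region that the HLS design would replace with the systolic array of multiply-accumulate units. The tile size and function name are assumptions; HLS pragmas and the shift-register plumbing are omitted.

```cpp
#include <algorithm>
#include <vector>

constexpr int TILE = 4;  // tile edge; the real value depends on FPGA resources (assumption)

// Tiled matrix multiply, C += A * B, with A of size M x K and B of size K x N,
// all row-major. The loop nest over (i0, j0, k0) iterates tile by tile so that
// each working set fits in on-chip memory; in the HLS version the inner
// (i, j, k) nest is where the shift-register systolic array would sit.
void matmul_tiled(const std::vector<float>& A, const std::vector<float>& B,
                  std::vector<float>& C, int M, int K, int N) {
    for (int i0 = 0; i0 < M; i0 += TILE)
        for (int j0 = 0; j0 < N; j0 += TILE)
            for (int k0 = 0; k0 < K; k0 += TILE)
                // inner tile computation (systolic array region in HLS)
                for (int i = i0; i < std::min(i0 + TILE, M); ++i)
                    for (int j = j0; j < std::min(j0 + TILE, N); ++j)
                        for (int k = k0; k < std::min(k0 + TILE, K); ++k)
                            C[i * N + j] += A[i * K + k] * B[k * N + j];
}
```

Tiling is what lets one fixed hardware design serve layers of arbitrary dimensions: any M, K, N are covered by iterating the same tile-sized engine, at the cost of some edge-tile padding or bounds checks.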

This talk is part of the CAS Talks series.

