Imperial College London > Talks@ee.imperial > CAS Talks > Separation Logic-Assisted Code Transformations for Efficient High-Level Synthesis
Separation Logic-Assisted Code Transformations for Efficient High-Level Synthesis
If you have a question about this talk, please contact George A Constantinides.

The capabilities of modern FPGAs permit the mapping of increasingly complex applications into reconfigurable hardware. High-level synthesis (HLS) promises a significant shortening of the FPGA design cycle by raising the abstraction level of the design entry to high-level languages such as C/C++. Applications using dynamic, pointer-based data structures and dynamic memory allocation, however, remain difficult to implement well, even though such constructs are widely used in software. Automated optimizations that aim to leverage the increased memory bandwidth of FPGAs by distributing the application data over separate banks of on-chip memory are often ineffective in the presence of dynamic data structures, due to the lack of an automated analysis of pointer-based memory accesses.

In this work, we take a step towards closing this gap. We present a static analysis for pointer-manipulating programs which automatically splits heap-allocated data structures into disjoint, independent regions. The analysis leverages recent advances in separation logic, a theoretical framework for reasoning about heap-allocated data which has been successfully applied in recent software verification tools. Our algorithm focuses on dynamic data structures accessed in loops and is accompanied by automated source-to-source transformations which enable automatic loop parallelization and memory partitioning by off-the-shelf HLS tools. We demonstrate successful loop parallelization and memory partitioning by our tool flow using three real-life applications which build, traverse, update and dispose of dynamically allocated data structures. Our automated parallelization achieves an average latency reduction by a factor of 2.5 for our benchmark applications.

This talk is part of the CAS Talks series.