
ARC2010: Optimising Memory Bandwidth Use for Matrix-Vector Multiplication in Iterative Methods


  • Speaker: David P Boland (Imperial College)
  • Date: Wednesday 03 March 2010, 14:00-14:30
  • Venue: Room 503/EEE

If you have a question about this talk, please contact George A Constantinides.

Computing the solution to a system of linear equations is a fundamental problem in scientific computing, and its acceleration has drawn wide interest in the FPGA community [1-3]. One class of algorithms to solve these systems, iterative methods, has drawn particular interest, with recent literature showing large performance improvements over general purpose processors (GPPs). In several iterative methods, this performance gain is largely a result of parallelisation of the matrix-vector multiplication, an operation that occurs in many applications and hence has also been widely studied on FPGAs [4, 5]. However, whilst the performance of matrix-vector multiplication on FPGAs is generally I/O bound [4], the nature of iterative methods allows the use of on-chip memory buffers to increase the bandwidth, providing the potential for significantly more parallelism [6]. Unfortunately, existing approaches have generally only either been capable of solving large matrices with limited improvement over GPPs [4-6], or achieve high performance for relatively small matrices [2, 3]. This paper proposes hardware designs to take advantage of symmetrical and banded matrix structure, as well as methods to optimise the RAM use, in order to both increase the performance and retain this performance for larger order matrices.
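The core idea of exploiting symmetric, banded structure can be illustrated with a short sketch (not the talk's actual hardware design, and the storage layout here is an assumption for illustration): store only the main diagonal and the subdiagonals of a symmetric banded matrix, and let each stored element contribute to two entries of the output vector. This roughly halves the matrix data that must be fetched per multiplication, which is exactly the kind of bandwidth saving the abstract describes.

```python
import numpy as np

def banded_symmetric_matvec(bands, x):
    """Compute y = A @ x for a symmetric banded matrix A of order n.

    bands[d] holds the d-th subdiagonal of A (length n - d), so only the
    lower band is stored; symmetry supplies the upper half. Each stored
    diagonal is used twice, so only about half the matrix is read.
    This storage scheme is a hypothetical illustration, not the paper's design.
    """
    n = x.shape[0]
    y = bands[0] * x                   # main diagonal contribution
    for d in range(1, len(bands)):
        diag = bands[d]                # subdiagonal d: A[i+d, i] = diag[i]
        y[d:] += diag * x[:n - d]      # lower-triangle contribution
        y[:n - d] += diag * x[d:]      # mirrored upper-triangle contribution
    return y
```

For a matrix of order n with bandwidth k, this touches roughly n(k + 1) stored values instead of the n(2k + 1) a non-symmetric banded scheme would read, while producing the same result as a dense multiply.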

This talk is part of the CAS Talks series.



