Imperial College London > Talks@ee.imperial > CAS Talks > Towards accelerating primal-dual interior point method for solving large-scale Semi-Definite Programs
Log inImperial users Other users No account?Information onFinding a talk Adding a talk Syndicating talks Who we are Everything else |
Towards accelerating primal-dual interior point method for solving large-scale Semi-Definite ProgramsAdd to your list(s) Download to your calendar using vCal
If you have a question about this talk, please contact George A Constantinides. Semi-Definite Programming (SDP) problems can take 1000s of seconds of compute time on conventional single-core architectures for even modest problem sizes. Parallel approaches using multi-cores, GPUs or FPG As must be investigated to enable scalable solutions for large-scale problems. An SDP accelerator has broad applicability in the non-linear convex optimization domain e.g. FPGA floorplanning. Presently, we focus on performance analysis of the step-length calculation phase within the SDP solve. This phase can take anywhere between 0.2%-15% of total runtime depending on problem sparsity and we must accelerate this component (along with others) to deliver balanced overall speedups. A naive/general implementation using Standard Lanczos (BLAS2) is not easily amenable to parallelization on memory-bound systems. We show how to exploit the nature of our problem structure to prefer the highly-parallel Block Lanczos (BLAS3) technique when designing our parallel implementation. We further show how to customize our implementation to exploit architecture-specific algorithmic blocking for better performance. We finally demonstrate that carefully picking Lanczos block parameter alongwith the architectural block parameter can give a 2X performance improvement on multi-cores to 10X performance improvement on GPUs and FPG As for a range of SDP benchmarks. Eventually, I’m interested in accelerating the entire SDP solver in an architecture-agnostic manner using a combination of high-level frameworks and automated parameter-selection techniques for optimized implementation. In the near future, I am interested in fully developing the FPGA mapping of the step length calculator. This talk is part of the CAS Talks series. This talk is included in these lists:
Note that ex-directory lists are not shown. |
Other listsPowertalk isn_talks@ee.imperial CAS TalksOther talks |