Imperial College London > Talks@ee.imperial > CAS Talks > Slow and Steady: Measuring and Tuning Multicore Interference
Slow and Steady: Measuring and Tuning Multicore Interference
If you have a question about this talk, please contact John Wickerson.

Practice Talk for RTAS.

Now ubiquitous, multicore processors provide replicated compute cores that allow independent programs to run in parallel. However, shared resources, such as last-level caches, can cause otherwise-independent programs to interfere with one another, leading to significant and unpredictable effects on their execution time. Indeed, prior work has shown that specially crafted enemy programs can cause software systems of interest to experience orders-of-magnitude slowdowns when both are run in parallel on a multicore processor. This undermines the suitability of these processors for tasks that have real-time constraints.

In this work, we explore the design and evaluation of techniques for empirically testing interference using enemy programs, with an eye towards reliability (how reproducible the interference results are) and portability (how interference testing can be effective across chips). We first show that different methods of measurement yield significantly different magnitudes of, and variation in, observed interference effects when applied to an enemy process that was shown to be particularly effective in prior work. We propose a method of measurement based on percentiles and confidence intervals, and show that it provides both competitive and reproducible observations.

The reliability of our measurements allows us to explore auto-tuning, where enemy programs are further specialised per architecture. We evaluate three different tuning approaches (random search, simulated annealing, and Bayesian optimisation) on five different multicore chips, spanning x86 and ARM architectures. To show that our tuned enemy programs generalise to applications, we evaluate the slowdowns caused by our approach on the AutoBench and CoreMark benchmark suites.
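The percentile-and-confidence-interval measurement idea mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the 90th percentile, and the bootstrap procedure are assumptions chosen for illustration; `run_victim` stands in for timing a real victim workload while an enemy program runs on sibling cores.

```python
import random

def measure(run_victim, n_runs=100, pct=90, n_boot=1000, seed=0):
    """Time the victim workload n_runs times and report the chosen
    percentile of execution time together with a bootstrap 95%
    confidence interval on that percentile. (Illustrative sketch,
    not the method from the talk itself.)"""
    rng = random.Random(seed)
    times = sorted(run_victim() for _ in range(n_runs))

    def percentile(xs, p):
        xs = sorted(xs)
        return xs[min(len(xs) - 1, int(round(p / 100 * (len(xs) - 1))))]

    point = percentile(times, pct)
    # Resample the observed timings with replacement to estimate how
    # much the percentile itself varies between measurement campaigns.
    boots = sorted(
        percentile([rng.choice(times) for _ in times], pct)
        for _ in range(n_boot)
    )
    lo, hi = boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot)]
    return point, (lo, hi)

# Synthetic timings (1.0-1.2 s) stand in for real measurements.
toy_rng = random.Random(42)
point, (lo, hi) = measure(lambda: 1.0 + toy_rng.random() * 0.2)
```

Reporting a high percentile with an interval, rather than a single mean or maximum, is what makes repeated measurement campaigns comparable: two runs of the harness can be judged consistent if their intervals overlap.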
We observe statistically larger slowdowns compared to those from prior work in 35 out of 105 benchmark/board combinations, and our method achieves a slowdown factor increase of 3.8x compared with prior work in the best case. Ultimately, we anticipate that our approach will be valuable for ‘first pass’ evaluation when investigating which multicore processors are suitable for real-time tasks.

This talk is part of the CAS Talks series.
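One of the three tuning approaches evaluated in the talk, simulated annealing, can be sketched generically. Everything here is an assumption for illustration: the parameter (`stride`), the neighbourhood step, and the toy cost function are hypothetical; in the real setting `candidate_cost` would build an enemy program from the parameters and measure the victim's slowdown.

```python
import math
import random

def anneal(candidate_cost, neighbour, init, iters=200, t0=1.0, seed=0):
    """Simulated-annealing maximiser: searches enemy-program parameters
    for the configuration causing the largest measured slowdown.
    (Generic sketch; not the tuner described in the talk.)"""
    rng = random.Random(seed)
    cur, cur_cost = init, candidate_cost(init)
    best, best_cost = cur, cur_cost
    for i in range(iters):
        temp = t0 * (1 - i / iters)  # linear cooling schedule
        cand = neighbour(cur, rng)
        cost = candidate_cost(cand)
        # Always accept improvements; accept regressions with a
        # probability that shrinks as the temperature falls.
        if cost > cur_cost or rng.random() < math.exp(
            (cost - cur_cost) / max(temp, 1e-9)
        ):
            cur, cur_cost = cand, cost
        if cur_cost > best_cost:
            best, best_cost = cur, cur_cost
    return best, best_cost

# Toy stand-in: pretend the slowdown peaks at an access stride of 64.
cost = lambda p: 4.0 - abs(p["stride"] - 64) / 32
step = lambda p, rng: {"stride": max(1, p["stride"] + rng.choice([-8, 8]))}
best, best_cost = anneal(cost, step, {"stride": 8})
```

Because each cost evaluation is a noisy hardware measurement, the reproducibility of the percentile-based metric is what makes a search like this meaningful: without it, the tuner would chase measurement noise rather than genuine interference.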