
Large Model Training at Wafer-Scale


If you have a question about this talk, please contact George A Constantinides.

Large Language Models (LLMs) have made significant progress over the past several years, unlocking many new capabilities and use cases for AI. However, they are extremely computationally intensive and challenging to train at scale. Recently, Cerebras published a family of open-source LLMs (Cerebras-GPT) trained on its novel Wafer-Scale Engine and the Andromeda supercomputer — the first such models trained on an AI hardware accelerator. This talk will provide an overview of various aspects of the solution that enabled this work: the Wafer-Scale Engine chip architecture, the Weight Streaming execution model, and the Andromeda Wafer-Scale Cluster.

Speaker Bio:

Kevin E. Murray is a Senior Member of Technical Staff at Cerebras Systems in Toronto. He received his PhD in Electrical and Computer Engineering from the University of Toronto. He was previously the lead developer of the Verilog to Routing (VTR) project and a visiting Research Assistant at Imperial College London, and worked on digital design flows at Advanced Micro Devices (AMD). His research interests include Computer Aided Design (CAD) algorithms, compilers, and architectures for Machine Learning accelerators and FPGAs.

This talk is part of the CAS Talks series.

