Abstract: The talk will cover the basics of accelerating both inference and training for LLM transformer architectures using a combination of a compiler and a systolic compute array.
Bio: Daniel Kroening is a Senior Principal Applied Scientist at Amazon, where he works on the correctness of the Neuron Compiler for distributed training and inference. Prior to joining Amazon, he was a Professor of Computer Science at the University of Oxford. He is a co-founder of Diffblue Ltd., a university spinout that develops AI targeting code and code-like artefacts.
He has received the Semiconductor Research Corporation (SRC) Inventor Recognition Award, an IBM Faculty Award, a Microsoft Research SEIF Award, and the Wolfson Research Merit Award. He serves on the CAV steering committee, was co-chair of FLoC 2018 and Editor-in-Chief of Springer FMSD, and is a co-author of the textbooks on Decision Procedures and Model Checking.
Host: Charith Mendis