Keynote: Powering Llama 3
Title: Powering Llama 3: A Sneak-peek into Meta’s Massive Infrastructure for Generative AI
Abstract:
We trained a model. It used a whole lot of GPUs and a giant network. We wrote a paper about it. People thought it was cool. I’ll talk about that. Oh, yeah, we open-sourced the model too!
And here I asked Llama 3 to write the abstract for me:
Llama 3 is the latest edition of Meta’s Generative AI models, and it’s a game-changer. It’s the first openly available model that rivals the top AI models in terms of state-of-the-art capabilities in general knowledge, steerability, math, tool use, and multilingual translation. But what makes it tick? In this talk, I’ll dive into the infrastructure behind Llama 3 and explore the challenges around the scale of this infrastructure, particularly those related to its compute, network, storage, and software ecosystem.
Speaker Biography:
Pavan Balaji is a Principal Research Scientist at Meta AI, where he serves as the technical lead for two areas: (1) GPU training systems (architectural design, performance analysis); and (2) AI communication libraries for Meta’s various hardware systems (GPUs, Meta internal silicon). Dr. Balaji helped build some of Meta’s largest AI supercomputing systems, such as the recent Grand Teton architecture, that power Meta’s internal AI workloads, including recommendation and ranking models and Generative AI models such as Llama.
Before joining Meta, Dr. Balaji held appointments as a Senior Computer Scientist and Group Lead at Argonne National Laboratory and as an Institute Fellow of the Northwestern-Argonne Institute of Science and Engineering at Northwestern University. He contributed to the design and software implementation of a number of projects on communication runtime systems (MPI, UCX), threading models (lightweight threads such as Argobots, OpenMP), and heterogeneous memory systems. Particularly noteworthy are the MPICH project (used by thousands of supercomputers around the world, including the three US Exascale supercomputers: Aurora, Frontier, and El Capitan), the UCX project (R&D 100 award winner in 2019), and the Argobots project (R&D 100 award finalist in 2020, and a driving piece of software for numerous supercomputers and commercial products such as Intel DAOS).
Dr. Balaji has held several other leadership roles in the community, serving on the board of directors or advisory board of numerous domestic and international projects, including UCX (US), Cilkplus (US), EPEEC (Europe), and Exascale Technologies (China). He has also served on the organizing committees and editorial boards of numerous high-profile conferences and journals, including IEEE/ACM SC (technical program chair), IEEE Cluster (general co-chair), IEEE/ACM CCGrid (general co-chair, program chair), and IEEE TPDS (associate editor-in-chief).