Training Large-scale Foundation Models on Emerging AI Accelerators

#STEM #GPT #Lehigh #CS

Foundation models such as GPT-4 have garnered significant interest from both academia and industry. A striking feature of such models is their so-called emergent capabilities, including multi-step reasoning, instruction following, and model calibration, across a wide range of application domains. Such capabilities were previously attainable only with specially designed ML models, such as those built on carefully constructed knowledge graphs, in specific domains. As the capabilities of foundation models have grown, so have their sizes, at a rate much faster than Moore's law. Training foundation models requires massive computing power. For instance, training a BERT model on a single state-of-the-art server with multiple A100 GPUs can take several days, while training a GPT-3-scale model on a large multi-instance GPU cluster can take several months to complete an estimated 3×10^23 FLOPs.
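To make these compute figures concrete, the sketch below converts an estimated training budget of 3×10^23 FLOPs into rough wall-clock time as a function of cluster size. The per-GPU peak throughput and the 40% sustained-utilization figure are illustrative assumptions, not numbers from the talk.

```python
# Back-of-envelope estimate of foundation-model training time.
# Assumed (illustrative) numbers: ~3e23 total training FLOPs for GPT-3,
# 312 TFLOP/s peak FP16/BF16 throughput per A100, ~40% sustained utilization.

TOTAL_FLOPS = 3e23            # estimated FLOPs to train GPT-3
PEAK_FLOPS_PER_GPU = 312e12   # A100 peak tensor-core throughput (FLOP/s)
UTILIZATION = 0.40            # fraction of peak sustained in practice (assumed)
SECONDS_PER_DAY = 86_400

def training_days(num_gpus: int) -> float:
    """Days to complete TOTAL_FLOPS on num_gpus accelerators."""
    sustained_flops = num_gpus * PEAK_FLOPS_PER_GPU * UTILIZATION
    return TOTAL_FLOPS / sustained_flops / SECONDS_PER_DAY

for n in (8, 256, 1024):
    print(f"{n:5d} GPUs -> ~{training_days(n):,.0f} days")
```

Under these assumptions, a single 8-GPU server would need roughly a decade, a 256-GPU cluster several months (consistent with the figure above), and a 1,024-GPU cluster about a month, which is why accelerator scale, not just model design, dominates foundation model training.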

This talk provides an overview of the latest progress in supporting foundation model training and inference with new AI accelerators. It reviews progress on the modeling side, with an emphasis on the transformer architecture, and presents the system architecture supporting training and serving foundation models.
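As a concrete anchor for the modeling discussion, the snippet below sketches scaled dot-product attention, the core operation of the transformer architecture the talk emphasizes; the shapes and names are illustrative only and are not drawn from the talk.

```python
# Minimal sketch of scaled dot-product attention, the building block of
# transformers: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute attention for query/key/value matrices of shape (seq, d_k)."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)          # (seq, seq) similarities
    weights = np.exp(scores - scores.max(-1, keepdims=True)) # numerically stable
    weights /= weights.sum(-1, keepdims=True)                # row-wise softmax
    return weights @ V                                       # weighted mix of values

# Toy usage: a sequence of 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
Q = K = V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

In a full transformer this operation runs across many heads and layers, and its quadratic cost in sequence length is part of why training these models demands so much compute.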

Explore the frontier of AI with us as we delve into the power and potential of foundation models like GPT-4. Discover how emergent capabilities are pushing the boundaries of what's possible, and learn about the groundbreaking AI accelerators making it all happen. Join our talk to uncover the future of AI training and applications!



  Date and Time

  • Date: 30 Nov 2023
  • Time: 07:00 PM to 08:00 PM
  • All times are (UTC-05:00) Eastern Time (US & Canada)

  Location

  • Virtual event (attendance info is available on the event page)

  Registration

  • Starts: 08 November 2023, 07:30 AM
  • Ends: 30 November 2023, 07:00 PM
  • No Admission Charge


  Speakers

Jun (Luke) Huan of Amazon AWS AI Labs

Topic:

Training Large-scale Foundation Models on Emerging AI Accelerators


Biography:

Dr. Jun (Luke) Huan is a Principal Scientist at AWS AI Labs, where he works on AI and data science. He has published more than 160 peer-reviewed papers in leading conferences and journals and has graduated eleven Ph.D. students. He received the NSF Faculty Early Career Development (CAREER) Award in 2009, and his group has won several best-paper awards at leading international conferences. Before joining AWS, he was a Distinguished Scientist at Baidu Research and head of the Baidu Big Data Laboratory. He founded StylingAI Inc., an AI start-up, and served as its CEO and Chief Scientist from 2019 to 2021. Before moving to industry, he was the Charles E. and Mary Jane Spahr Professor in the EECS Department at the University of Kansas. From 2015 to 2018, he served as a program director at the US NSF, in charge of its big data program.

Address: United States