BRAIN-INSPIRED LOW-POWER LANGUAGE MODEL
This talk explores the transformative potential of sub-10-watt language models (LMs), drawing inspiration from the brain’s energy efficiency. We introduce an approach to language model design built around a matrix-multiplication-free architecture that scales to billions of parameters. To validate this paradigm, we developed custom FPGA hardware and leveraged existing neuromorphic hardware (Intel Loihi 2), both optimized for the lightweight operations that traditional GPUs handle inefficiently. Our system runs billion-parameter models at throughput surpassing human reading speed while drawing just 13 watts, setting a new benchmark for energy-efficient AI. This work not only redefines what is possible for low-power LLMs but also highlights the critical operations future accelerators must prioritize to enable the next wave of sustainable AI innovation.
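The abstract does not spell out how the matrix-multiplication-free architecture works. The sketch below is a hypothetical illustration of one common ingredient of such designs: constraining weights to {-1, 0, +1} so that every dot product reduces to additions and subtractions. The function name ternary_linear and the shapes are illustrative assumptions, not the speaker's implementation.

import numpy as np

def ternary_linear(x, w_ternary):
    # Computes y = x @ W without multiplications: with weights in {-1, 0, +1},
    # each output is a signed sum of selected input features.
    out = np.zeros((x.shape[0], w_ternary.shape[1]))
    for j in range(w_ternary.shape[1]):
        plus = x[:, w_ternary[:, j] == 1].sum(axis=1)    # add inputs where w = +1
        minus = x[:, w_ternary[:, j] == -1].sum(axis=1)  # subtract where w = -1
        out[:, j] = plus - minus                         # w = 0 contributes nothing
    return out

# Sanity check against an ordinary matrix multiply
x = np.random.randn(3, 4)                         # batch of 3, 4 features
w = np.random.choice([-1, 0, 1], size=(4, 2))     # ternary weight matrix
print(np.allclose(ternary_linear(x, w), x @ w))   # True

On hardware, the additions and sign flips above map to accumulators and multiplexers rather than multiply units, which is one reason FPGAs and neuromorphic chips can execute such layers cheaply.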
Date and Time
- Date: 06 May 2025
- Time: 11:00 PM UTC to 12:00 AM UTC
Speaker
Dr. Jason Eshraghian
Biography:
Jason Eshraghian is an Assistant Professor in the Department of Electrical and Computer Engineering at the University of California, Santa Cruz. He holds dual degrees in Electrical and Electronic Engineering and Law from The University of Western Australia (2016) and earned his Ph.D. in 2019 from the same institution. From 2019 to 2022, he served as a Fulbright Research Fellow at the University of Michigan.
His research focuses on neuromorphic computing and brain-inspired machine learning and has been recognized with seven IEEE Best Paper and Live Demonstration Awards. He is the developer of snnTorch, a Python library for training spiking neural networks that has been downloaded more than 200,000 times.
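As a pointer to the library mentioned above, here is a minimal snnTorch usage sketch based on its documented leaky integrate-and-fire (Leaky) neuron; the parameter values and tensor shapes are arbitrary examples.

import torch
import snntorch as snn

lif = snn.Leaky(beta=0.9)              # leaky integrate-and-fire neuron, membrane decay 0.9
mem = lif.init_leaky()                 # initialize the membrane potential state
inputs = torch.rand(25, 1, 10)         # [time steps, batch, features] of input currents

spikes = []
for step in range(inputs.shape[0]):
    spk, mem = lif(inputs[step], mem)  # one simulation step: spikes out, membrane updated
    spikes.append(spk)

spikes = torch.stack(spikes)           # binary spike trains, shape [25, 1, 10]
print(int(spikes.sum().item()), "spikes emitted")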
He has served as the Area Chair of the Telluride Neuromorphic Cognition and Engineering Workshop (2023, 2024) and is a co-organizer of the NeuroAI workshop at NeurIPS 2024. He is an Associate Editor of APL Machine Learning, the Secretary of the IEEE Neural Systems and Applications Technical Committee, and a Scientific Advisory Board Member of BrainChip and Conscium.
Email:
Address: Electrical and Computer Engineering, Baskin School of Engineering, UC Santa Cruz, Santa Cruz, United States