Why do small language models underperform?
The IEEE ComSoc Northern Virginia chapter and the GMU Department of Computer Science invite you to attend the following Distinguished Lecture:
Title: Why do small language models underperform?
Speaker: Benoît Sagot, Director of Research at INRIA
Date: May 2, 2024
Time: 11:00am – 12:00pm
In-person location: GMU Fairfax campus, Nguyen Engineering Bldg., Conference Room 4201
Abstract:
Language models, and in particular generative and conversational language models, are at the heart of recent advances in natural language processing (NLP). Understanding how these models represent textual content and how they learn these representations still raises multiple research questions. In this talk, I will start from the observation that small models are less efficient than expected. I will show that language models relying on the Transformer architecture tend to produce vector representations that are not isotropically distributed in space. This anisotropy is linked to the way in which these models are trained, which leads to token frequency taking a preponderant place in their representations. I will show that this effect has negative consequences on the ability of small models to train satisfactorily (“performance saturation”) but does not seem to affect larger models. I will then describe a new approach for training language models intended to avoid the undesirable effects of this prevalence of frequency information. The resulting “headless” models display a number of advantages over standard models, including on downstream performance.
Bio:
Benoît Sagot is a computer scientist specializing in natural language processing (NLP). He is a Senior Researcher (Directeur de Recherches) at INRIA, where he heads the INRIA research project ALMAnaCH in Paris, France. He also holds a chair in the PRAIRIE institute dedicated to artificial intelligence, and currently holds the annual chair for computer science at the Collège de France. His research focuses on language modelling, machine translation, language resource development and computational linguistics, with a focus on French in all its forms and on less-resourced languages.
Date and Time
- Date: 02 May 2024
- Time: 03:00 PM UTC to 04:00 PM UTC
Location
- 4400 University Drive
- Fairfax, Virginia
- United States 22030
- Building: Nguyen Engineering Bldg., Conference Room 4201
Hosts
- Co-sponsored by GMU Department of Computer Science