Dimensions of Factual Reliability in LLMs: Multilinguality and the Short–Long Form Gap

#llm #ai #professional #student #croatia #STEM

As Large Language Models (LLMs) become globally deployed for information-seeking tasks,
ensuring their factual reliability across diverse contexts has become paramount. Yet most
research on LLM hallucinations remains narrowly focused—either English-centric or limited to
controlled tasks like summarization and translation. This talk presents a comprehensive
investigation into factual alignment across two critical dimensions: languages and task formats.
First, I will present a large-scale study evaluating hallucination across 30 languages in
open-domain question answering, revealing surprising patterns in how factual accuracy varies
across linguistic and scaling contexts. Second, I will explore the factual alignment gap between
short and long-form responses, demonstrating how the same model can exhibit vastly different
reliability depending on response format. Together, these works provide a unified perspective on
LLM factuality "in the wild," offering diagnostic tools and insights for practitioners deploying
LLMs internationally and researchers working to build more trustworthy AI systems.
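
To make the short- versus long-form gap concrete, here is a minimal sketch of how such a comparison might be run. Everything in it is a hypothetical stand-in rather than the speaker's actual evaluation pipeline: ask_model is a placeholder for any LLM API call, and the substring check is a deliberately naive scorer that real evaluations would replace with stronger answer matching.

    def ask_model(prompt: str) -> str:
        """Hypothetical stand-in for an LLM API call; replace with a real client."""
        raise NotImplementedError

    def contains_fact(response: str, gold: str) -> bool:
        # Naive scoring: count the response as factual if the gold answer
        # string appears in it (case-insensitive).
        return gold.lower() in response.lower()

    def factuality_gap(questions: list[tuple[str, str]]) -> float:
        """Accuracy difference between short- and long-form answers to the
        same questions (positive means the short form was more reliable)."""
        short_hits = long_hits = 0
        for question, gold in questions:
            short = ask_model(question + " Answer in one short phrase.")
            detailed = ask_model(question + " Answer in a detailed paragraph.")
            short_hits += contains_fact(short, gold)
            long_hits += contains_fact(detailed, gold)
        n = len(questions)
        return short_hits / n - long_hits / n

Run over one question set per language, a sketch like this would yield a per-language gap estimate, which is the kind of diagnostic signal the talk describes.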


  Date and Time

  Location

  • Unska 3
  • Zagreb, Grad Zagreb
  • Croatia 10000

  Hosts

  Registration


  Speakers

Saad

Topic:

Dimensions of Factual Reliability in LLMs: Multilinguality and the Short–Long Form Gap


Biography:

Saad is a second-year PhD student at the Center for AI and Data Science at the University of
Würzburg, where he is affiliated with the Equitably Fair and Trustworthy Language Technology
(EQUIFAIR) project. Before starting his PhD, he worked as a research engineer on the
NORFACE-funded EUINACTION project. Prior to this, he studied computer science and
computational linguistics as an Erasmus Mundus scholar at Charles University and Saarland
University.

Address: Germany