Dimensions of Factual Reliability in LLMs: Multilinguality and the Short–Long Form Gap

#llm #ai #professional #student #croatia #STEM

As Large Language Models (LLMs) become globally deployed for information-seeking tasks,
ensuring their factual reliability across diverse contexts has become paramount. Yet most
research on LLM hallucinations remains narrowly focused—either English-centric or limited to
controlled tasks like summarization and translation. This talk presents a comprehensive
investigation into factual alignment across two critical dimensions: languages and task formats.
First, I will present a large-scale study evaluating hallucination across 30 languages in
open-domain question answering, revealing surprising patterns in how factual accuracy varies
across linguistic and scaling contexts. Second, I will explore the factual alignment gap between
short and long-form responses, demonstrating how the same model can exhibit vastly different
reliability depending on response format. Together, these works provide a unified perspective on
LLM factuality "in the wild," offering diagnostic tools and insights for practitioners deploying
LLMs internationally and researchers working to build more trustworthy AI systems.
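
To make the short- versus long-form gap concrete, here is a minimal sketch of how such a comparison might be run. Everything in it is a hypothetical stand-in rather than the speaker's actual evaluation pipeline: ask_model is a placeholder for any LLM API call, and the substring check is a deliberately naive scorer that real evaluations would replace with stronger answer matching.

    def ask_model(prompt: str) -> str:
        """Hypothetical stand-in for an LLM API call; replace with a real client."""
        raise NotImplementedError

    def contains_fact(response: str, gold: str) -> bool:
        # Naive scoring: count the response as factual if the gold answer
        # string appears in it (case-insensitive).
        return gold.lower() in response.lower()

    def factuality_gap(questions: list[tuple[str, str]]) -> float:
        """Accuracy difference between short- and long-form answers to the
        same questions (positive means the short form was more reliable)."""
        short_hits = long_hits = 0
        for question, gold in questions:
            short = ask_model(question + " Answer in one short phrase.")
            detailed = ask_model(question + " Answer in a detailed paragraph.")
            short_hits += contains_fact(short, gold)
            long_hits += contains_fact(detailed, gold)
        n = len(questions)
        return short_hits / n - long_hits / n

Run over one question set per language, a sketch like this would yield a per-language gap estimate, which is the kind of diagnostic signal the talk describes.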


  Date and Time

  Location

  • Unska 3
  • Zagreb, Grad Zagreb
  • Croatia 10000

  Hosts

  Registration


  Speakers

Saad

Topic:

Dimensions of Factual Reliability in LLMs: Multilinguality and the Short–Long Form Gap


Biography:

Saad is a second-year PhD student at the Center for AI and Data Science at the University of
Würzburg, where he is affiliated with the Equitably Fair and Trustworthy Language Technology
(EQUIFAIR) project. Before starting his PhD, he worked as a research engineer on the
NORFACE-funded EUINACTION project. Prior to this, he studied computer science and
computational linguistics as an Erasmus Mundus scholar at Charles University and Saarland
University.

Address: Germany