R&D workshop program 2025

Lectures on Language Technology

Talks held in connection with Ahmed Ruby's defense, on April 27.

Location: Engelska parken: 22-0031

Time Name Title and abstract

09:30 - 10:00 Philippe Muller Cross-Lingual Pragmatic Competence in Language Models
Current computational models have demonstrated impressive capabilities to capture various levels of linguistic knowledge, but they seem to struggle with complex input and larger documents. Pragmatics is the domain of linguistics that focuses on contextual understanding and structural aspects beyond the sentence level in texts or simple utterances in conversations. I will have a look at recent work that tries to unify approaches to the structural information carried by so-called discourse relations, across languages and formalisms. I will also present ongoing work on trying to understand if and how language models show competence in representing these relations.

10:00 - 10:30 Christian Hardmeier Towards anthropomimetic uncertainty: Uncertainty quantification and communication in large language models
To appear trustworthy, one must be aware of the limits of one’s knowledge and convey one’s confidence effectively to interlocutors. In this talk, I present some of our research on quantifying and communicating confidence in LLMs. On the statistical side, this includes an efficient method to estimate epistemic and aleatoric uncertainty of LLM predictions based on gradient norms. On the communication side, I introduce our ongoing work on calibrating LLM uncertainty to the user’s understanding of epistemic expressions. This work pursues a broader goal of implementing more humanlike patterns of expressing uncertainty in LLM communication.

10:30 - 11:00 Coffee Break

11:00 - 11:30 Jonas Beskow Current efforts in Swedish Sign Language AI
This talk describes our ongoing research in Sign Language, including work on representation learning from video data as well as a new 4-hour motion capture dataset for Swedish Sign Language for generative modelling.

11:30 - 12:00 Meriem Beloucif The Human Labeling Bottleneck: Are Adapted LLMs the Way Out?
Adapting large language models to user-specific preferences is often constrained by the cost of human annotation, making preference optimisation impractical in low-resource settings where preferences cannot be reliably labelled by LLMs themselves, e.g., due to cultural, subjective, or personalised contexts. I will present ways of investigating how language models encode preference information in their intermediate representations, finding that activations from chosen and rejected responses form distinct clusters across layers, even in pretrained models.

Ahmed Ruby's defense of his thesis Modeling Implicit Discourse Relations Across Modalities and Languages will take place at 14.00 in Humanistiska teatern.