Dear NLPers and friends,
In our next NLP group meeting on June 12, 12:00-13:00 in room M20, there will be an invited talk by Maria Skeppstedt and Magnus Ahltorp.
The agenda for the NLP meeting is:
1. Round-the-table news (accepted papers, conference trips, grant applications, etc.)
2. Invited talk by Maria Skeppstedt (Research Engineer, Uppsala University) and Magnus Ahltorp (Computational Linguist, Nakajima Koen Research Institute)
Multilingual Text Visualisation with Word Rain and Topic Timelines
===================================================
The text visualisation methods Word Rain and Topic Timelines aim to provide the user with an overview of large text collections. We will describe the theory behind the methods, and why we developed them. We will also provide examples of how they are used for visualising and exploring corpora belonging to different genres, e.g. parliamentary corpora, patient organisation periodicals and texts about climate change.
===================================================
We also plan to arrange some kind of social event in the evening - more details to come!
Bring your lunch and we will arrange some “fika”. Welcome!
Aron & Hercules
——
Aron Henriksson
Associate Professor (Docent)
Department of Computer and Systems Sciences (DSV)
Stockholm University
P.O. Box 1073, SE-164 25 Kista, Sweden
Visiting address: Borgarfjordsgatan 12, Kista
Phone: +46-8-164985
Welcome to the half-time seminar of Korbinian Randl, entitled "Towards Trustworthy Classification with Large Language Models".
External reviewer: Professor Steffen Egger, University of Technology Nuremberg (UTN), Germany.
Internal reviewer: Dr. Peter Idestam-Almquist, DSV, Stockholm University
Main supervisor: Associate professor Tony Lindgren, DSV, Stockholm University
Supervisors:
Associate professor Aron Henriksson, DSV, Stockholm University
Assistant professor John Pavlopoulos, Department of Informatics, Athens University of Economics and Business, Greece
Time: Tuesday 10th of June, 13:00-16:00 (CET)
Place: L30, NOD-huset, DSV/Stockholms universitet, Borgarfjordsgatan 8, Kista.
Zoom: https://stockholmuniversity.zoom.us/j/65276614265?from=addon
Abstract
Large Language Models (LLMs) have become central to modern digital life, underpinning applications such as conversational AI, content generation, and software debugging. At the core of these systems lie transformer-based architectures, which excel at modeling context and semantics. This makes them strong candidates for the future of text classification. However, despite their capabilities, LLMs remain largely opaque "black boxes" with limited explainability. They are also prone to hallucinations - the generation of plausible-sounding but factually incorrect outputs - particularly when faced with input scenarios not encountered during training (Perković et al., 2024; Reddy et al., 2024). This unreliability poses serious challenges in safety-critical domains such as healthcare and food regulation.
This dissertation half-time report addresses the challenge of untrustworthy LLM behavior in classification settings by pursuing four core objectives: (a) First, it focuses on the development and refinement of local explainability methods that can shed light on individual LLM decisions and help make their behavior more interpretable. Such methods could, for example, reveal that an LLM relies on a spurious correlation in the data rather than on the actual, causally linked information for its classification. This objective is addressed in PAPER II and PAPER IV, which specifically evaluate the usefulness of LLM-generated self-explanations and find that counterfactual self-explanations can be a fast, valid, and plausible candidate. (b) Second, it evaluates these methods not only from the perspective of end-user understanding but also as diagnostic tools to identify flaws in data, model training, or architecture, thereby enabling trustworthiness-by-design. For example, knowing that an LLM relies on spurious correlations for its classification, one can curate the fine-tuning data to eliminate such correlations. While improving the model itself will be addressed in the second half of the PhD, the thesis explores using explanations for diagnostic purposes in PAPER V. (c) Third, the work shifts away from post-hoc explanations toward inherently interpretable prompting backends that guide LLM behavior during classification. As retraining an entire LLM is infeasible for most ML practitioners due to the enormous data and hardware requirements, such backends are a more accessible and actionable part of prompting pipelines. Furthermore, techniques like Conformal Prediction (CP; Vovk et al., 2005) or Retrieval-Augmented Generation (RAG; Lewis et al., 2020) can be used to guide LLM content generation and reduce hallucinations. PAPER I started exploring such backend methods based on CP, and future work will also target the application of RAG for this purpose. (d) Finally, the research supports its empirical contributions through the curation of a publicly available, privacy-compliant dataset that enables reproducible experimentation (PAPER I, PAPER III). Together, these objectives contribute toward safer, more transparent, and more trustworthy LLM deployment in sensitive contexts.
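For readers unfamiliar with the conformal prediction backends the abstract mentions, the core idea can be sketched in a few lines. The snippet below is a minimal, illustrative split-conformal classification sketch, not the method of PAPER I: all calibration data is synthetic, and in practice the class scores would come from the LLM classifier itself.

```python
import math
import random

random.seed(0)
n_cal, n_classes = 200, 3

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical calibration set: random class scores and true labels.
cal = []
for _ in range(n_cal):
    scores = softmax([random.gauss(0, 1) for _ in range(n_classes)])
    label = random.randrange(n_classes)
    cal.append((scores, label))

# Nonconformity score: 1 - probability assigned to the true class.
nonconf = sorted(1.0 - scores[label] for scores, label in cal)

# Conformal quantile: with miscoverage level alpha, prediction sets
# contain the true label with roughly (1 - alpha) probability.
alpha = 0.1
k = math.ceil((n_cal + 1) * (1 - alpha))
q = nonconf[min(k, n_cal) - 1]

def prediction_set(scores):
    """All labels whose nonconformity stays below the threshold q."""
    return [c for c in range(n_classes) if 1.0 - scores[c] <= q]
```

Rather than forcing a single (possibly hallucinated) label, the backend returns a set of labels; an empty or large set signals low confidence, which is exactly the kind of behavior one wants in safety-critical classification.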
Best regards,
Tony Lindgren
Ph. D., Docent, Head of the Systems Analysis and Security Unit
Department of Computer and Systems Sciences
Stockholm University
Postbox 7003, 164 07 Kista, Sweden
Visiting address: Borgarfjordsgatan 12, Kista
Phone: +46-8-16 17 01
Mobile: +46-70-190 68 28
http://dsv.su.se
Tony Lindgren is inviting you to a scheduled Zoom meeting.
Join Zoom Meeting
https://stockholmuniversity.zoom.us/j/65276614265?from=addon
Meeting ID: 652 7661 4265
---
One tap mobile
+46850163827,,65276614265# Sweden
+46850500828,,65276614265# Sweden
---
Dial by your location
* +46 8 5016 3827 Sweden
* +46 8 5050 0828 Sweden
* +46 8 5050 0829 Sweden
* +46 8 5052 0017 Sweden
* +46 850 539 728 Sweden
* +46 8 4468 2488 Sweden
Meeting ID: 652 7661 4265
Find your local number: https://stockholmuniversity.zoom.us/u/cebEmSJCeE
---
Join by SIP
* 65276614265@109.105.112.236
* 65276614265@109.105.112.235
---
Join by H.323
* 109.105.112.236
* 109.105.112.235
Meeting ID: 652 7661 4265
Dear NLP group
Predoc seminar by Thomas Vakili, Wednesday June 4, 2025, 13:00-15:00 in room L30,
with the title "Preserving the Privacy of Clinical Language Models".
External reviewer: Elena Volodina, Gothenburg University
Main supervisor: Hercules Dalianis, DSV
Supervisor: Aron Henriksson, DSV
Professor closest to the subject: Panagiotis Papapetrou, DSV
Read Abstract
https://internt.dsv.su.se/sv/node/1833
Warm welcome
Hercules
_________________________________________________________________________
Dr. Hercules Dalianis, Professor
Department of Computer and Systems Sciences
DSV/Stockholm University
P.O. Box 7003, 164 07 Kista, Stockholm, Sweden
ph: +46 8 16 16 16
mobile ph: +46 70 568 13 59
email: hercules@dsv.su.se
www: http://www.dsv.su.se/hercules/
_________________________________________________________________________