
Department of Computer Science and Technology


Typological Diversity in NLP: What, Why and a Way Forward

Friday, 7 March, 2025 - 12:00 to 13:00

To justify the generalisability of multilingual NLP research, multilingual language technology is frequently evaluated on ‘typologically diverse’ language selections. Yet, what this means often remains vague. In this talk, I first discuss what typological diversity means in NLP, and why it matters. Then, I introduce a...



Assessing language-specific capabilities of LLMs: Lessons from Swedish NLP

Friday, 21 February, 2025 - 11:00 to 12:00

Abstract: In this talk, I discuss benchmarking and interpreting large language models in the context of Swedish. I present a selection of work from my PhD thesis, which analyzes LLMs' Swedish-specific capabilities in different areas: English-Swedish language transfer, multi-task benchmarking on Swedish NLU and targeted...



Formal syntactic theory in the current NLP landscape

Friday, 14 March, 2025 - 12:00 to 13:00

Natural language processing used to rely on formal methods in its early days, and this included formal theories of syntax where sentence structure was of relevance. In the statistical era, the focus shifted to annotation schemes such as Penn Treebank and Universal Dependencies, which still rely on formal theory in their...



Preference Alignment, with Reference Mismatch, and without Reference Models

Friday, 31 January, 2025 - 12:00 to 13:00

Abstract: In this talk, I'll cover two recent papers for preference alignment: Odds-Ratio Preference Optimisation (ORPO, EMNLP 2024), discussing the role of the reference model for preference alignment (e.g. DPO, RLHF), and Margin-aware Preference Optimization (under review @ CVPR), thinking about the risks of reference...
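
For orientation only (standard formulations from the DPO and ORPO papers, not a description of the speaker's new results): reference-based objectives such as DPO score a preferred/dispreferred pair (y_w, y_l) against a frozen reference model, whereas ORPO drops the reference model and instead contrasts the odds of the two responses under the policy itself:

\[
\mathcal{L}_{\mathrm{DPO}} = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}\left[ \log \sigma\!\left( \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)} \right) \right]
\]

\[
\mathcal{L}_{\mathrm{ORPO}} = \mathcal{L}_{\mathrm{SFT}} + \lambda\,\mathbb{E}\left[ -\log \sigma\!\left( \log \frac{\mathrm{odds}_\theta(y_w \mid x)}{\mathrm{odds}_\theta(y_l \mid x)} \right) \right],
\qquad
\mathrm{odds}_\theta(y \mid x) = \frac{P_\theta(y \mid x)}{1 - P_\theta(y \mid x)}
\]

The presence of \(\pi_{\mathrm{ref}}\) in the first objective, and its absence from the second, is the "with" versus "without reference models" contrast the title alludes to.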



Natural Language meets Control Theory

Wednesday, 12 March, 2025 - 16:00 to 17:00

Note: this seminar has been rescheduled from its original date and will take place at 4 pm. Control theory is fundamental to the design and understanding of many natural and engineered systems, from cars and robots to power networks and bacterial metabolism. It studies dynamical systems—systems whose properties evolve...
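
As a generic, textbook illustration of what a dynamical system with control looks like (not material drawn from the talk itself), consider a discrete-time linear system with state feedback:

\[
x_{t+1} = A x_t + B u_t, \qquad y_t = C x_t, \qquad u_t = -K x_t,
\]

where the state \(x_t\) evolves over time, \(u_t\) is the control input, and the feedback gain \(K\) is chosen so that the closed-loop dynamics \(x_{t+1} = (A - BK)\,x_t\) behave as desired, for example remaining stable.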



Metrized Deep Learning: Fast & Scalable Training

Friday, 14 February, 2025 - 12:00 to 13:00

We build neural networks in a modular and programmatic way using software libraries like PyTorch and JAX. But optimization theory has not caught up to the flexibility of this paradigm, and practical advances in neural net optimization are largely heuristics driven. In this talk we argue that, if we are to treat deep...
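
As a minimal sketch of the "modular and programmatic" construction the abstract refers to, here is current practice in PyTorch (an illustration of the status quo, not of the metrized approach advocated in the talk):

    import torch
    import torch.nn as nn

    # Networks are composed from reusable modules...
    model = nn.Sequential(
        nn.Linear(784, 256),
        nn.ReLU(),
        nn.Linear(256, 10),
    )

    # ...while the optimiser and its hyperparameters are picked separately,
    # largely by heuristic (optimizer family, learning rate, schedule).
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

    x = torch.randn(32, 784)            # dummy batch, purely for illustration
    loss = model(x).square().mean()     # placeholder loss
    loss.backward()
    optimizer.step()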



Scansion-based Lyric Generation

Friday, 22 November, 2024 - 12:00 to 13:00

Abstract: Yiwen Chen's study looks at generating lyrics in Mandarin that match well with both the melody and the tonal contour of the language. The approach uses mBART and treats lyrics generation as a sequence-to-sequence (seq2seq) task. Instead of generating lyrics directly from the melody, which is the usual way, the...
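
A minimal, hypothetical sketch of treating lyric generation as a seq2seq task with mBART via Hugging Face Transformers; the study's actual input representation (melody, scansion, tonal contour) is not reproduced here, so the source string below is a placeholder assumption:

    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    model_name = "facebook/mbart-large-50"
    tokenizer = MBart50TokenizerFast.from_pretrained(
        model_name, src_lang="zh_CN", tgt_lang="zh_CN"
    )
    model = MBartForConditionalGeneration.from_pretrained(model_name)

    # Placeholder standing in for an encoded melody/scansion template.
    source = "<melody and scansion representation goes here>"
    inputs = tokenizer(source, return_tensors="pt")

    # Decode into Mandarin, treating generation as ordinary seq2seq decoding.
    outputs = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.lang_code_to_id["zh_CN"],
        max_length=64,
        num_beams=5,
    )
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))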



The Past, Present and Future of Tokenization

Friday, 29 November, 2024 - 12:00 to 13:00

Abstract: Current large language models (LLMs) predominantly use subword tokenization. They see text as chunks (called "tokens") made up of individual words or parts of words. This has a number of consequences. For example, LLMs often struggle with seemingly simple tasks involving character-level knowledge, like counting...
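
A minimal sketch of subword tokenization, using the GPT-2 BPE tokenizer from Hugging Face Transformers purely as an example (not necessarily any tokenizer discussed in the talk):

    from transformers import AutoTokenizer

    # Load a standard BPE subword tokenizer.
    tokenizer = AutoTokenizer.from_pretrained("gpt2")

    text = "unbelievable strawberries"
    print(tokenizer.tokenize(text))
    # The words come out as a handful of subword chunks rather than characters,
    # so the model never directly "sees" individual letters -- one reason
    # character-level tasks such as counting letters can be surprisingly hard.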



Linguistics in the Age of Large Language Models

Friday, 15 November, 2024 - 12:00 to 13:00

Recent chatbots have amazed everyone with their human-like language output. However, their relationship to research in linguistics is opaque; even their inventors do not fully understand why they are so successful. Further, when probed in depth, some of their outputs are less human-like than first impressions would...



10 Slides on Human Feedback

Friday, 8 November, 2024 - 12:00 to 13:00

In this talk, Max Bartolo will share a brief overview of the critical role human feedback plays in enhancing Large Language Model (LLM) performance and aligning model behaviours to human expectations. We will delve into key aspects of human feedback, examining some of its requirements, benefits, and challenges. We will...