In natural language, there are usually many ways to say the same thing: the answer to a question can be phrased in multiple ways, and there are many good translations of the same sentence. As a result, language models (LMs) trained on large corpora often spread probability mass across a vast number of generations that differ only in minor ways. This raises problems for LM applications. For prediction, probability is only loosely correlated with quality, so various heuristics must be added to beam search to achieve adequate results. For uncertainty quantification, commonly used measures such as Shannon entropy can overestimate uncertainty when probability is spread across functionally equivalent texts. In this talk, I will present my PhD thesis work, which addresses these shortcomings with methods that incorporate measurements of semantic similarity. For prediction, returning a "prototypical" prediction according to semantic similarity outperforms high-probability predictions. For uncertainty quantification, generalizing the classic Shannon entropy with semantic similarity leads to a more trustworthy measure. Lastly, we apply Bayesian optimization to translation reranking, using kernel similarity to efficiently search for high-quality translations.
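To give a rough sense of the entropy generalization mentioned above, the sketch below contrasts plain Shannon entropy over sampled generations with one standard similarity-sensitive variant, in which each generation's surprisal is computed from the total probability of semantically similar generations. This is a minimal illustration only: the exact formula, the toy answers, their probabilities, and the similarity matrix are assumptions made up for the example, not necessarily the formulation presented in the talk.

```python
# Illustrative sketch (not the speaker's exact method): Shannon entropy vs. a
# similarity-weighted entropy that discounts uncertainty spread across
# near-equivalent texts. All inputs below are hypothetical.
import math

def shannon_entropy(probs):
    """Standard Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def similarity_weighted_entropy(probs, sim):
    """One similarity-sensitive generalization: the surprisal of generation i
    uses the total probability of everything similar to it, where
    sim[i][j] is in [0, 1] and sim[i][i] == 1."""
    return -sum(
        p_i * math.log(sum(s_ij * p_j for s_ij, p_j in zip(row, probs)))
        for p_i, row in zip(probs, sim)
        if p_i > 0
    )

# Toy example: three sampled answers, the first two mean the same thing.
answers = ["The capital is Paris.", "Paris is the capital.", "It is Lyon."]
probs = [0.45, 0.45, 0.10]
sim = [  # hypothetical pairwise semantic similarities
    [1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

print(shannon_entropy(probs))                   # ~0.95 nats: looks uncertain
print(similarity_weighted_entropy(probs, sim))  # ~0.33 nats: mostly one meaning
```

In this toy case, Shannon entropy treats the two paraphrases of "Paris" as distinct outcomes and reports high uncertainty, while the similarity-weighted measure recognizes that most of the probability mass sits on a single meaning.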
"You can also join us on Zoom":https://cam-ac-uk.zoom.us/j/83400335522?pwd=LkjYvMOvVpMbabOV1MVTm8QU6DrGN7.1