Conventional AI benchmarks typically focus on the content of responses, for example checking factual correctness (e.g. MMLU) or mathematical correctness (e.g. GSM8k). However, for many language model applications, the manner (or "personality") of a model's responses also matters to users, for example how friendly or confident responses are. Recent issues with model releases highlight the limited ability of existing evaluation approaches to capture such personality traits: a ChatGPT model version was rolled back over sycophancy issues, and other models' personalities have been criticised for overfitting to the Chatbot Arena leaderboard.
In this talk, I will introduce Feedback Forensics: our newly released toolkit to measure AI personality traits. Using our toolkit, I will first share results detecting the personality traits currently encouraged by popular human feedback datasets (incl. Chatbot Arena). Next, I will discuss changes and trends in personality traits exhibited across model families and versions. Finally, I will take a closer look at the personality differences between the Chatbot Arena and publicly released versions of Llama-4-Maverick.
The talk will feature a live demo of our personality visualisation tool, and attendees are invited to follow along via our online platform at https://feedbackforensics.com/ (laptops are encouraged).
"You can also join us on Zoom":https://cam-ac-uk.zoom.us/j/83400335522?pwd=LkjYvMOvVpMbabOV1MVTm8QU6DrGN7.1