Department of Computer Science and Technology

Date: Friday, 9 February 2024, 12:00 to 13:00
Speaker: Clara Meister, ETH Zurich
Venue: https://cam-ac-uk.zoom.us/j/86071371348?pwd=OVlqdDhZNHlGbzV5RUZrSzM1cUlhUT09

Standard probabilistic language generators often fall short when it comes to producing coherent and fluent text, despite the fact that the underlying models perform well under standard metrics such as perplexity. This discrepancy has puzzled the language generation community for the last few years. In this talk, we'll take a different approach to generation from probabilistic models, drawing on concepts from psycholinguistics and information theory in an attempt to explain some observed behaviors, e.g., why high-probability texts can be dull or repetitive. Humans use language as a means of communicating information, and they aim to do so in a manner that is simultaneously efficient and error-minimizing; indeed, psycholinguistic research suggests that humans choose each word in a string with this subconscious goal in mind. We propose that decoding from probabilistic models of language should mimic these behaviors. To motivate this notion, we'll examine common characteristics of several successful decoding strategies, showing how their design allows them to implicitly adhere to attributes of efficient and robust communication. We will then propose a new decoding strategy that explicitly encodes these human-like properties of natural language use into generated text.
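
One decoding strategy in this information-theoretic spirit is locally typical sampling (Meister et al.), which restricts sampling to tokens whose surprisal is close to the conditional entropy of the next-token distribution. The abstract does not name the proposed strategy, so the following is only a minimal PyTorch sketch of that filtering step for illustration; the function name and the tau parameter are assumptions, not part of the talk.

```python
import torch

def locally_typical_filter(logits: torch.Tensor, tau: float = 0.95) -> torch.Tensor:
    """Mask a next-token distribution so sampling favors tokens whose
    surprisal (-log p) lies near the distribution's conditional entropy.

    logits: unnormalized scores over the vocabulary, shape (vocab_size,).
    tau: probability mass of the near-typical set to retain (illustrative).
    Returns logits with atypical tokens masked to -inf.
    """
    log_probs = torch.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    # Conditional entropy H = -sum_w p(w) log p(w) of the next-token distribution.
    entropy = -(probs * log_probs).sum(dim=-1, keepdim=True)
    # Distance of each token's surprisal from the entropy: | -log p(w) - H |.
    deviation = (entropy + log_probs).abs()
    # Rank tokens from most to least "typical" (smallest deviation first).
    _, sorted_idx = torch.sort(deviation)
    sorted_probs = probs.gather(-1, sorted_idx)
    cum_probs = sorted_probs.cumsum(dim=-1)
    # Keep the smallest set of near-typical tokens with cumulative mass >= tau.
    cutoff = (cum_probs < tau).sum().item() + 1
    keep = sorted_idx[:cutoff]
    masked = torch.full_like(logits, float("-inf"))
    masked[keep] = logits[keep]
    return masked
```

At generation time one would sample each token from the filtered distribution, e.g. `torch.distributions.Categorical(logits=locally_typical_filter(logits)).sample()`; unlike pure greedy or top-probability decoding, this deliberately avoids the highest-probability (lowest-surprisal) tokens when they fall far below the distribution's entropy, matching the abstract's observation that high-probability text can be dull or repetitive.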

Join Zoom Meeting

https://cam-ac-uk.zoom.us/j/86071371348?pwd=OVlqdDhZNHlGbzV5RUZrSzM1cUlhUT09

Meeting ID: 860 7137 1348

Passcode: 387918

Seminar series: NLIP Seminar Series
