
Department of Computer Science and Technology

Date: Tuesday, 16 July 2024, 15:00–16:00
Speaker: Peter Kairouz (Google)
Venue: West 2, West Hub (https://www.westcambridgehub.uk/visit)

The emergence of large language models (LLMs) presents significant opportunities in content generation, question answering, and information retrieval. Nonetheless, training, fine-tuning, and deploying these models entail privacy risks. This talk will address these risks, outlining privacy principles motivated by known LLM vulnerabilities when handling user data. We demonstrate how techniques like federated learning and user-level differential privacy (DP) can systematically mitigate many of these risks, at the cost of increased computation. In scenarios where only moderate-to-weak user-level DP is achievable, we propose a strong (task- and model-agnostic) membership inference attack that lets us quantify risk by accurately estimating the actual privacy leakage (the empirical epsilon) in a single training run. The talk will conclude with a few projections and compelling research directions.
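As a rough illustration of the mitigation the abstract mentions, the sketch below shows one round of federated averaging with user-level DP: each user's whole model update is clipped to a fixed L2 norm, and Gaussian noise calibrated to that norm is added to the aggregate, so no single user can move the released model by much. This is a generic DP-FedAvg-style sketch under assumed defaults, not the speaker's system; the function name `dp_fedavg_round` and all constants are hypothetical.

```python
import numpy as np

def dp_fedavg_round(user_updates, clip_norm=1.0, noise_multiplier=1.0):
    """One round of federated averaging with user-level DP.

    Each element of user_updates is one user's full model delta, so
    clipping bounds the contribution of an entire user (user-level,
    rather than example-level, privacy).
    """
    clipped = []
    for delta in user_updates:
        norm = max(np.linalg.norm(delta), 1e-12)            # avoid divide-by-zero
        clipped.append(delta * min(1.0, clip_norm / norm))  # project into L2 ball
    total = np.sum(clipped, axis=0)
    # Gaussian noise calibrated to the per-user sensitivity (clip_norm).
    noise = np.random.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(user_updates)

# Hypothetical toy run: 100 users, each contributing a 10-dimensional delta.
updates = [np.random.randn(10) for _ in range(100)]
print(dp_fedavg_round(updates))
```

Clipping the whole per-user delta (rather than per-example gradients) is what makes the guarantee user-level; the extra rounds needed to recover accuracy are one source of the increased computation noted above.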
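For the empirical-epsilon idea, one common way to turn a membership inference attack into a privacy estimate is to invert the (eps, delta)-DP hypothesis-testing bound TPR <= e^eps * FPR + delta. The sketch below applies that inversion to hypothetical attack scores; it is background for the abstract, not the single-run method proposed in the talk, and it ignores sampling error (a careful estimate would add confidence intervals).

```python
import numpy as np

def empirical_epsilon(tpr, fpr, delta=1e-5):
    """Epsilon lower bound implied by an attack's error rates.

    Any (eps, delta)-DP mechanism obeys TPR <= e^eps * FPR + delta and,
    symmetrically, 1 - FPR <= e^eps * (1 - TPR) + delta, so observed
    attack performance implies a lower bound on the true epsilon.
    """
    eps = 0.0
    if fpr > 0 and tpr > delta:
        eps = max(eps, np.log((tpr - delta) / fpr))
    if tpr < 1 and (1 - fpr) > delta:
        eps = max(eps, np.log((1 - fpr - delta) / (1 - tpr)))
    return eps

# Hypothetical attack scores: higher means the attacker guesses "member".
members = np.random.normal(1.0, 1.0, 1000)       # scores on training users
non_members = np.random.normal(0.0, 1.0, 1000)   # scores on held-out users
best = 0.0
for t in np.linspace(-3.0, 4.0, 200):            # sweep decision thresholds
    tpr = float((members > t).mean())
    fpr = float((non_members > t).mean())
    best = max(best, empirical_epsilon(tpr, fpr))
print(f"empirical epsilon lower bound: {best:.2f}")
```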

Seminar series: Cambridge ML Systems Seminar Series
