Text-and-audio methods

Date: Tuesday, 30 January 2024, 13:00 to 14:00

Speaker: Catalina Cangea (Google DeepMind)

Venue: Lecture Theatre 2, Computer Laboratory, William Gates Building

This talk supports the R255 Advanced Topics in Machine Learning course module on Multimodal Learning and provides a bird’s eye view of the rapidly evolving text-audio landscape, with a focus on music as a primary example of audio data. I will first present the types of tasks in this space, then discuss data curation challenges, and follow with an overview of existing retrieval and generation methods, including a quick primer on diffusion models. Finally, I will describe current evaluation metrics and their limitations.

You can also join us on Zoom: https://cam-ac-uk.zoom.us/j/92041617729

Seminar series: Artificial Intelligence Research Group Talks
