Department of Computer Science and Technology

Date: Wednesday, 18 June, 2025 - 15:00 to 16:00
Speaker: Max Ryabinin
Venue: Computer Lab, LT1

Recently, the field of Machine Learning has seen renewed interest in communication-efficient training over slow, unreliable, and heterogeneous networks. While the latest results and their applications to LLMs are highly promising, their underlying ideas have surprisingly many connections to well-established approaches to distributed ML. In this talk, I will provide an overview of recent developments in decentralized training within the broader context of areas such as volunteer computing, communication-efficient optimization, and federated learning. In addition, I will present our research in this field, ranging from Learning@home/DMoE to Petals, and share some lessons learned about ML research in general during the development of these methods.

Max Ryabinin is VP of Research & Development at Together AI, working on large-scale deep learning. Previously, he was a Senior Research Scientist at Yandex, studying a wide range of topics in natural language processing and efficient machine learning. During his PhD, he developed methods for distributed training and inference over slow and unstable networks, such as DeDLOC, SWARM Parallelism, and Petals. He is also the creator and maintainer of Hivemind, a highly popular open-source framework for decentralized training in PyTorch.

Seminar series: Cambridge ML Systems Seminar Series
