A selective and biased choice of techniques for building a distributed data store

Date:

Tuesday, 14 May, 2024 - 12:00 to 13:00

Speaker:

Paweł T. Wojciechowski, Poznan University of Technology

Venue:

SS03, Computer Laboratory

Single-machine data stores cannot support the scale and ubiquity of data
today. The Internet applications and services must process a huge number
of concurrent requests and events per second. So, they use distributed (or
replicated) data stores which store and process data on multiple machines,
offering key advantages in performance, scalability, and reliability. The
purpose of the talk is to present a selective and biased choice of
techniques and results which can be used for building an efficient
distributed data store. Biased, because I only present solutions and
results developed within a research project that I did with my PhD
students. Selective, because an exhaustive description would be too
exhausting to fit into a single talk. Therefore I will be discussing just
the design of our novel database index for key-value data store systems,
and only skim our other contributions that are directly related to
distributed systems. The index, called Jiffy, has been designed with
performance and scalability in mind. Therefore it has been designed as a
lock-free concurrent data structure, which can dynamically adapt to the
changing workload. It achieves superior performance despite built-in
atomic operations (batch updates, snapshots, and range scans). During the
talk I will be presenting Jiffy's architecture, the algorithms for
inserting and looking up the key-value pairs, and the operations used for
resizing the data structure dynamically. The other contributions of our
project include: efficient support for replica state recovery after
failures, either by extending the classic Paxos consensus algorithm, or
through the use of persistent memory, and a bit surprising theoretical
results which are applicable to distributed data store systems that
compromise consistency in favour of high availability and speed, but also
support operations ensuring strong consistency (which requires consensus
among replicas). (Based on a keynote talk at DEBS '23)

Seminar series:

Logic and Semantics Seminar

View on talks.cam

Calendar

Upcoming seminars

15Jul

"Please Verify": How Human Behavior Undermines Blockchain Security

Taro Tsuchiya, Carnegie Mellon University

Security Seminar
17Jul

SolarFit: A Successor Refinement Approach for Sizing of PV and Storage Systems in EV-Enabled Homes

Julia Gschwind ETH Zurich, University of Cambridge

Energy and Environment Group
23Jul

Google DeepMind’s Gemini models and the Rise of Long-Context LLMs

Dr Nikolay Savinov (Google DeepMind)

Foundation AI
30Jul

Title to be confirmed

Stephen Xia, Northwestern University

Centre for Mobile, Wearable Systems and Augmented Intelligence Seminar Series
04Aug

Learning Under Constraints: From Federated Collaboration to Black-Box LLMs

Salma Kharrat, Kaust

Cambridge ML Systems Seminar Series

View all seminars

Upcoming seminars

About the department

Social media

Study at Cambridge

About the University

Research at Cambridge