Department of Computer Science and Technology

Agent of Things

Client: Matthew Postgate, Infometa

Many people are concerned that their phone provider (whether Apple, Samsung or Google) effectively controls their whole life, tying them in to an ecosystem of other products and services. This is very different from an earlier age, when the Internet of Things was expected to provide “information appliances” that offered self-contained functionality rather than surveillance capitalism. Your task is to prototype a privacy-preserving digital twin architecture that allows customers to interface in an intelligent way with specific devices (examples in Cambridge might be a smart bicycle with embedded GoPro, or the door lock of your college room), while strictly constraining the way this functionality gets connected to other parts of their digital life.

Atmospheric Metaverse

Client: Ian Lewis

Lecture Theatre 1 in the Gates Building is packed with environmental sensors that collect all sorts of historical information about air quality. Imagine if the Metaverse gave you superpowers to go back in time, and drill into sensors to find out what data they had collected. Your task is to create this demonstrator, where VR interaction allows you to dive into any sensor and find out what it knows, not simply flying around in 3D, but entering the unexplored fourth Metaverse dimension: time itself!

BS-meter

Client: Christopher Newfield, Independent Social Research Foundation

Recent research has used machine learning methods to apply the language philosophy of Wittgenstein, in a way that can quantify the likelihood of any particular text being bulls**t. These results have extraordinarily exciting implications for political discussion, journalism, corporate press releases, even the content of Facebook or eX-Twitter. Your task is to create a BS-meter that uses these methods to produce an intuitive test device accessible to anyone, perhaps with an international authentication body that can apply validated BS stamps to any text that deserves it.

Care Phone

Client: Stephen Devlin, Sensors CDT

Many older people find the full functionality of a system like Android, with its multiple apps, too confusing or difficult to use. Your task is to design a customisable replacement that can be configured remotely by a trusted person such as a child or sibling, and displays only a very small range of options to the user (e.g. “Call Stephen”, “Listen to Radio 4”, “Watch last episode of EastEnders”). There will be difficult design decisions as you trade off generality against security.

Chat-twin

Client: Matthew Postgate, Infometa

LLMs struggle to remember their history of interaction with you, and never update their training weights with real knowledge about your life. However, recent research shows how it’s possible to get them to act like an intelligent agent, by maintaining your own description of a simple game world (including other human or LLM players). A highly compact version of this world state is fed back with the prompt for each new round of play. You will use the same strategy to turn an LLM into a digital twin that keeps the most important records of your life, and helps you to prioritise and complete tasks through natural conversation.
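
The state-feedback loop described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `TwinState` class, its fields and the prompt layout are hypothetical, and the actual call to an LLM is left out.

```python
from dataclasses import dataclass, field


@dataclass
class TwinState:
    """Hypothetical compact record of what the twin knows about the user."""
    facts: dict = field(default_factory=dict)  # long-lived facts about the user
    tasks: list = field(default_factory=list)  # open tasks, most urgent first

    def summary(self, max_tasks: int = 3) -> str:
        """Render the state as a few short lines so the prompt stays small."""
        lines = [f"{key}: {value}" for key, value in sorted(self.facts.items())]
        lines += [f"TODO: {task}" for task in self.tasks[:max_tasks]]
        return "\n".join(lines)


def build_prompt(state: TwinState, user_message: str) -> str:
    """Prepend the compact state to the user's message for this round."""
    return (
        "Known state:\n"
        f"{state.summary()}\n\n"
        f"User: {user_message}\n"
        "Assistant:"
    )


state = TwinState()
state.facts["college"] = "Example College"   # made-up example data
state.tasks.append("submit project brief")
prompt = build_prompt(state, "What should I do first today?")
print(prompt)
```

After each reply, the application (not the model) would update `state` and rebuild the prompt, so the model sees a fresh summary every round rather than the full conversation history.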

Checkpoint Alternatives

Client: James Golding, Epic Games

In videogame worlds, checkpoints are a welcome way to avoid hours of button-mashing. But the usual convention in a game is that time simply rewinds to the checkpoint, like the eternal repeats in the movie Groundhog Day. What if a checkpoint worked like a functional programming continuation, where state could be directly manipulated before continuing? With the Verse functional programming language integrated into Epic Games infrastructure, this project will be an opportunity to invent new kinds of gameplay. You'll need to think about what parts of the state can be rendered in the game, and how the player can make the game modify itself at this meta-level.
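
To make the idea concrete, here is a minimal sketch in Python rather than Verse. It only approximates a continuation with a closure, and `make_checkpoint`, `boss_fight` and the state fields are hypothetical names chosen for illustration.

```python
import copy


def make_checkpoint(state, resume):
    """Capture `state` and return a function that resumes play from it.

    `resume` stands for the rest of the game, expressed as a function of state.
    """
    snapshot = copy.deepcopy(state)

    def continue_with(edit=None):
        restored = copy.deepcopy(snapshot)  # every retry starts from the same snapshot
        if edit is not None:
            edit(restored)                  # meta-level change before play continues
        return resume(restored)

    return continue_with


def boss_fight(state):
    """Hypothetical stand-in for the gameplay that follows the checkpoint."""
    return "win" if state["health"] > 50 else "lose"


checkpoint = make_checkpoint({"health": 30, "keys": 1}, boss_fight)
plain_retry = checkpoint()                                  # ordinary rewind
edited_retry = checkpoint(lambda s: s.update(health=80))    # edit state, then continue
print(plain_retry, edited_retry)
```

The ordinary rewind replays the same situation, whereas the edited retry continues from a state the player has changed, which is the distinction the brief draws.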

ChessPuzzy

Client: Murad Abdulla, IMC

This challenge is to create a tool that generates chess puzzles (e.g., checkmates in 2 or 3 moves), either by analysing real game positions from publicly available datasets or by constructing "fake" positions that are nonetheless legal in chess. Existing chess engines like Leela and Stockfish provide an evaluation score on any move, helping to identify when someone has missed a checkmate or didn't see the best move from a position. These positions can then be used as puzzles. Your objective is to use AI methods to generate puzzles, based on patterns learned from real games. The puzzles should be classified as belonging to recognised categories (e.g. pin/skewer puzzles, queen + knight combination puzzles, rook endgame puzzles, etc). You should also assign a rating or difficulty level to each puzzle.
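
As a rough illustration of mining missed mates, the sketch below assumes the engine analysis has already been run and reduced to one record per move. The `PlyRecord` layout and the position labels are hypothetical, and no real engine API is called.

```python
from typing import NamedTuple, Optional


class PlyRecord(NamedTuple):
    position_id: str               # hypothetical label for the position before the move
    mate_in_before: Optional[int]  # forced mate available to the mover, in moves
    mate_in_after: Optional[int]   # forced mate the mover still has after the move played


def missed_mates(records, max_moves=3):
    """Return (position_id, mate_in) for short forced mates the player let slip."""
    candidates = []
    for rec in records:
        had_short_mate = (
            rec.mate_in_before is not None and rec.mate_in_before <= max_moves
        )
        lost_it = rec.mate_in_after is None
        if had_short_mate and lost_it:
            candidates.append((rec.position_id, rec.mate_in_before))
    return candidates


records = [
    PlyRecord("game1-ply30", None, None),  # no mate on the board
    PlyRecord("game1-ply31", 2, None),     # mate in 2 was there, player missed it
    PlyRecord("game1-ply33", 3, 2),        # player found the mating line
    PlyRecord("game2-ply18", 5, None),     # missed, but too long for this puzzle set
]
puzzles = missed_mates(records)
print(puzzles)  # [('game1-ply31', 2)]
```

Each candidate position would still need classifying and rating, which this sketch does not attempt.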

Code Explain AI Assistant

Client: Anastasia Stulova, Nvidia

Large-scale software development projects are often developed by multiple contributors. In certain open-source projects, such as compilers, operating systems or many HPC workloads, developers are frequently located in different parts of the world and are affiliated with various institutions. As a result, these developers may have limited opportunities for direct interaction and must understand code written by others without being able to easily discuss or ask questions to the original authors. Your task is to utilize modern AI technology, particularly large language models (LLMs), to develop a Code Explain AI Assistant. This tool should be capable of providing explanations of specific portions of code, generating helpful code comments, and even answering more advanced queries to help developers navigate and understand the codebase. The tool's applicability can be demonstrated with popular open-source weather or climate codes, such as FV3 or CLOUDSC.

Links: https://github.com/NOAA-EMC/fv3atm https://github.com/ecmwf-ifs/dwarf-p-cloudsc

Computing Physical Calculus

Client: Richard Pawson

In the days when Cambridge innovators like Turing were still inventing the foundations of Computer Science, our department was known as the Mathematical Laboratory, offering a wide range of computation facilities implemented using mechanical devices. One impressive machine, the Differential Analyzer, was able to solve calculus problems. The goal of this project is to create a digital working replica of the Differential Analyzer, which can be used both to illustrate the elegant operation of this device, and to help students gain an intuitive understanding of mathematical principles through the visualised behaviour.
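
One way to picture how such a machine solves a differential equation is as integrators wired in a feedback loop. The sketch below is an assumption-laden illustration, not a model of the Cambridge machine: it uses two idealised integrators and small discrete steps to solve y'' = -y with y(0) = 1 and y'(0) = 0, whose exact solution is cos(t). The `Integrator` class and `solve_shm` function are hypothetical names.

```python
import math


class Integrator:
    """Idealised integrator unit: accumulates integrand * dx on each step."""

    def __init__(self, initial):
        self.value = initial

    def step(self, integrand, dx):
        self.value += integrand * dx
        return self.value


def solve_shm(t_end, dt=0.001):
    """Solve y'' = -y by chaining two integrators with feedback."""
    velocity = Integrator(0.0)   # integrates y'' to give y'
    position = Integrator(1.0)   # integrates y' to give y
    for _ in range(round(t_end / dt)):
        acceleration = -position.value            # feedback connection: y'' = -y
        current_velocity = velocity.step(acceleration, dt)
        position.step(current_velocity, dt)
    return position.value


y_at_pi = solve_shm(math.pi)
print(round(y_at_pi, 2))  # close to cos(pi) = -1
```

A replica would replace the numerical step with a visualised mechanism, but the wiring of outputs back to inputs is the part students most need to see.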

Conversational Patient History

Client: Conor Peacock, Alder Hey Children's Hospital

Waiting lists for Autism Spectrum Disorder (ASD) and Attention Deficit Hyperactivity Disorder (ADHD) assessments are growing, partly due to the extensive time specialists spend on developmental history-taking, often hours per patient. This project aims to create a conversational AI tool that allows children and their parents to provide comprehensive patient history in a natural, dialogue-based manner. By engaging users in a conversational interface rather than requiring them to fill out lengthy forms, the system can extract necessary clinical information and automatically format it into a report. This innovation seeks to streamline the initial assessment process, saving specialists' time and potentially reducing waiting lists.

CUDA Support for ClangIR

Client: Bruno Cardoso Lopes, Meta

The Clang C++ compiler is currently getting support for a new SSA-based high-level intermediate representation (IR), offering extra frontend knowledge to the compiler for better analysis and optimizations. This new ClangIR is implemented on top of MLIR and is currently in heavy development. It is mainly focused on CPU support (ARM64, x86_64) but still lacks GPU code generation support. Recently, OpenCL was added to the set of C/C++ compiler extensions supported by ClangIR, but many extensions (HLSL, CUDA, etc.) are still missing. In this project, you will add CUDA support to ClangIR, enabling GPU support for highly sought-after application workloads such as AI and computational science.

Delivery Radar

Client: David Russell, The Fusion Works

Are you ever curious just how fast delivery riders and hacked e-scooters travel down cycle paths? GPS-enabled apps like Strava can tell you how fast you are going yourself, but not how fast somebody else is. In principle, you could automate that. First record some precise reference points on a cycle path of your choice. If planning an official complaint, you might think how to secure that data for later confirmation. Then you'll need some video with accurate timestamps to verify the velocity as the rider's front wheel passes each point. Finally, what is the most effective way to use the aggregated data to campaign for better regulation, as an alternative to motorist campaigns against cycle lanes?
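
The core calculation is distance between reference points divided by the time between video timestamps. A minimal sketch, assuming the reference points are given as distances along the path in metres and the timestamps in seconds (the function name and the sample numbers are made up):

```python
def speeds_kmh(positions_m, timestamps_s):
    """Average speed in km/h over each segment between consecutive reference points."""
    if len(positions_m) != len(timestamps_s):
        raise ValueError("need one timestamp per reference point")
    speeds = []
    for i in range(1, len(positions_m)):
        distance = positions_m[i] - positions_m[i - 1]
        elapsed = timestamps_s[i] - timestamps_s[i - 1]
        if elapsed <= 0:
            raise ValueError("timestamps must be strictly increasing")
        speeds.append(distance / elapsed * 3.6)  # m/s to km/h
    return speeds


# Front wheel passes markers at 0 m, 10 m and 20 m along the path.
segment_speeds = speeds_kmh([0.0, 10.0, 20.0], [0.0, 1.0, 2.5])
print([round(s, 1) for s in segment_speeds])
```

The accuracy depends on the frame rate: at 30 frames per second each timestamp can be off by up to one frame, which matters more the closer together the reference points are.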

Driverless Humans

Client: Daniel Clarke, Cambridgeshire County Council

Cambridge is at the leading edge of many tech developments, including imminent trials of driverless buses, as well as the data infrastructure driving large status displays like the one by the WGB front door. Your task is to use the data infrastructure to solve one of the biggest problems with driverless buses: who will help wheelchair users, visually or cognitively impaired passengers, or those with other mobility problems to get on and off the bus? Your design solution should integrate location services with realtime data from every bus in Cambridge, to help connect those who need assistance to those able to provide it.

Grasping Concept Spaces

Client: Sam Henshall-West, JAID

Today's most powerful machine learning models, including generative AI LLMs like ChatGPT, encode their knowledge as high-dimensional latent spaces. Vector-based concepts in that space can be clustered as regions on a 2D screen, for example using the t-SNE visualisation algorithm. Your task is to create a 3D version of t-SNE that is interactive, so users with VR headsets can literally “grasp” a concept, and manipulate it to explore, fine-tune, or adjust the machine learning model.

Navigating Evolving Music

Client: Ewan Campbell, Music Faculty

Cambridge composer Ewan Campbell creates cartographic scores where the musical notes are curved and rendered over maps or other pictorial representations of nature. His next project is to use the evolution of musical ideas to reflect the processes of zoological development and mass extinctions. The arboreal subdivision of evolutionary trees very quickly renders traditional notational formats impractical for displaying the array of musical path options available to the performers. Your task will be to create a real-time, continuously animating program that can be run simultaneously by several different aligned performers, offering them live musical options, whilst allowing them to control their pathway. Your software will have an initial array of musical fragments to work with, but will also need to be able to accommodate new pieces written to use the software, and subsequently presented in live performances at the Cambridge Festival, where a new nature-based work by Ewan is being commissioned.

Paper Simulator

Client: Tom Frith-Powell, Paper Foundation

Artisanal paper-making is an enjoyable hobby and a popular craft skill, with products ranging from high-end art paper to unique personal notebooks. Cambridge has some of the world’s leading paper researchers, who study the microstructure of the paper in historical books and documents. The goal of this project is to create a physical simulator that models the fibres, fluids and forms of a traditional paper-making process. The result will be not only a valuable scientific tool, but potentially a popular plug-in component to the many painting and drawing apps that offer a rather trivial range of simulated paper textures and background images.

Personalised EULA Visualisation

Client: Moinul Zaber, Data and Design Lab, Dhaka

Nobody takes the time to read the End User License Agreements (EULAs) that we all agree to whenever we create an account, install an app, or upgrade an operating system. Different people have different priorities, and although some clauses buried in the EULA might relate to your own concerns about privacy, cost, warranty or whatever, it would take hours to read through and find them. This project will allow users to personalise their own priorities in advance, then use NLP to instantly visualise and navigate the parts of any new EULA that they most want to check - perhaps raising red flags to highlight all the most worrying phrases in a single screen view.
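
As a very simple starting point, clauses can be flagged by matching them against terms a user has chosen in advance. The sketch below uses plain keyword matching as a stand-in for real NLP, and the priority topics, terms and sample text are all hypothetical.

```python
import re

# Hypothetical priorities a user might set up in advance.
PRIORITY_TERMS = {
    "privacy": ["personal data", "third parties", "tracking"],
    "cost": ["fee", "charge", "renew"],
}


def flag_clauses(eula_text, priority_terms):
    """Return (clause index, matched topics, clause text) for clauses worth checking."""
    # Naive clause splitting: break after a full stop or semicolon.
    clauses = [c.strip() for c in re.split(r"(?<=[.;])\s+", eula_text) if c.strip()]
    flagged = []
    for index, clause in enumerate(clauses):
        lowered = clause.lower()
        topics = sorted(
            topic
            for topic, terms in priority_terms.items()
            if any(term in lowered for term in terms)
        )
        if topics:
            flagged.append((index, topics, clause))
    return flagged


sample = (
    "We may share personal data with third parties. "
    "The licence is non-transferable. "
    "Your subscription will renew automatically and a fee applies."
)
flags = flag_clauses(sample, PRIORITY_TERMS)
for index, topics, clause in flags:
    print(index, topics, clause)
```

Substring matching will produce false positives (for example "fee" inside "feedback"), so a real system would need tokenisation or a trained classifier. The flagged indices are what a visualisation could use to jump to the relevant clause.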

PromptPatrol

Client: Mick Vermeulen, IMC

Language models like ChatGPT and Claude.ai are becoming increasingly popular. These models affect every line of work, including student assignments. For some assignments, professors require students to hand in authentic work for their own personal development, not for the sake of handing in an assignment. In this project, you will design a system that is able to reliably detect text generated by LLMs in student assignments. User experience is a top priority; the system should be fast and simple to use. How will you test, measure or prove this? It should integrate seamlessly with the learning environment of Cambridge. The detection framework could use machine learning, but could an MVP use simpler approaches? What heuristics might determine if text is written by a human? Reports should offer statistics line-by-line, highlighting which parts are human-written or machine-written.

Robot Science Ambassador

Client: Oscar Klock, Furhat Robotics

Imagine a robot science ambassador at the Cambridge Festival that engages visitors in lively discussions about science and their experiences. This project will develop an embodied agent based on the world's most advanced social robot from Furhat Robotics, and using the LEXI framework for speech-based interaction. The system should allow customisation, so that users of all technical abilities can create and deploy their own AI agents able to interact naturally in spoken conversation in a variety of social settings. In the setting of the Cambridge Festival, the humanoid robot would be able to gather feedback, as well as enhancing visitor engagement.

Soft Music Notation

Client: Arild Stenberg, Score Designs Group

Anyone who plays an instrument is familiar with receiving sheet music in a PDF file, but the technical facilities for enhancing that experience are terrible. This project will involve extracting the actual musical semantics (notes and their lengths) from PDF files, encoding these as MusicXML, and allowing the music to be adjusted, personalised, and optimised for performance using the advanced typographic tools created by a prize-winning group project team last year.

Speech Error Detection and Correction

Client: Conor Peacock, Alder Hey Children's Hospital

Children with cleft palate often experience specific speech errors that require targeted feedback to correct. Traditional speech therapy sessions are time-consuming and may not provide immediate, specific guidance during at-home practice. This project aims to develop an interactive tool that detects cleft-related speech errors and offers real-time, personalised feedback. The tool must establish a baseline and then track how the child improves as they practise. The goal is to support speech therapists by providing patients with an engaging way to practise and improve their speech outside of clinical sessions, enhancing the overall effectiveness of treatment.

Talk Interactive

Client: Robert Shepperson, Global Media

Talk radio is incredibly popular, but doesn’t (yet) have the same capability for discussion and annotation that text formats do. Your task is to create a software service that enhances the experience of listening to talk radio. Your application will listen to the audio, extract key words, and offer services such as background information on an item, programme or presenter, linked visual media, comparison to other news sources and authorities, or even AI-enabled interactive fact-checking about the claims being made.

Training Investigative Interviewers

Client: Ching-Yu Soar Huang, Cambridge Alliance of Legal Psychology

Investigative interviewing is a critical step in gathering evidence for any investigation. The types of questions posed by investigators can make or break the accuracy of a witness's testimony. Investigators and legal professionals therefore need specialised training and continuous feedback to keep up with best interview practice and secure the best evidence. However, legal professionals and government organisations typically work with limited resources. Your task is to build an easily accessible app that lets them practise their questioning skills and get instant feedback (e.g., using large language models to classify question types and detect leading or suggestive questions), helping them to keep improving beyond their training sessions. The data set you will be working with consists of fully anonymised investigative interview and court transcripts from real-life child abuse cases, which are extremely sensitive materials. So your contribution will really be making a difference for better justice!

Visual Analytics for Hardware Design

Client: Martin Erhart, SiFive

CIRCT recently introduced an open software stack for hardware design, which SiFive has used to build and ship production RISC-V chips. Understanding how hardware designs are optimised and translated to a physical layout is difficult due to the overwhelming amount of data. In this project, you will build a visual data analytics tool that helps hardware compiler engineers analyse the lowering process in detail. We propose to connect the open-source CIRCT (circt.llvm.org) EDA stack with the recently released Google Model Explorer (https://github.com/google-ai-edge/model-explorer) and augment the product with CIRCT-specific interactive analyses, as well as bi-directional analysis-transformations that allow users to visualise and influence the compilation process. If successful, your tool will make a complex compilation process visually accessible to a broad community.

Zeitgeist Map

Client: Hrvoje Abramović, IMC

You land in a new country and suddenly you hear new songs on the radio, new songs being played in the club, and you hear people referencing artists you’ve never heard of. People also dress differently and follow trends you never knew existed. Because of the internet and social media, you would expect most of the current trends to be present everywhere, but every country is still different in its own way. The goal of this project is to create a web-based world map with which one can easily visualize how popular different music trends are in certain countries. The user can select either a genre or an artist, and the map lights up with a gradient showing where they are most listened to. Or, conversely, a user selects a specific country (let’s say their next study exchange destination) and can see all the current music trends: top songs, artists, genres. It would be great if the user could also play short snippets of the displayed songs and genres, and even visualize their popularity over time, to see how trends come and go. The basic idea can then also be expanded to include movies, series, and books, if there's time.