Department of Computer Science and Technology

A PhD student who would like to encourage more researchers into his – currently under-explored – area of AI research is delighted to have won the support of Google.

Xiangjian Jiang, who is developing foundational AI models to analyse and understand tabular data (i.e. data stored in tables and spreadsheets), has just been named a 2025 Google PhD Fellow.

This prestigious Fellowship will support his studies for the next two years, providing him not only with funding and computational resources, but also connections with other researchers.

Receiving the two-year Fellowship is a very significant accolade. As Google says, the Fellowships are awarded to "exceptional graduate students pioneering research in computer science and related fields." The goal, it adds, is "supporting the next generation of scientists focused on critical foundational science".

In Xiangjian's case, the foundational work that he is conducting with his supervisors – Mateja Jamnik, Professor of Artificial Intelligence here, and Nikola Simidjievski, Associate Professor of AI for healthcare at Télécom Paris, Institut Polytechnique de Paris – is in developing entirely new AI models that can understand and reason about data derived not from text or images, but from tables.

"This work is both timely and important," Prof Jamnik says. "Tabular data is highly under-explored compared to text and images, yet it's central to many real-world applications. For example, we have medical collaborators who are collecting data that goes beyond text and images, including highly dimensional tables derived from patients' genomic profiles, covering around 20,000 genes. Xiangjian is developing solutions to tackle such complex tabular data, including a tabular foundation model that can handle these challenges effectively."

But it's a demanding challenge and one that requires a step change in the way the AI models are developed.

"A tabular foundation model is not an easy thing to develop," Xiangjian explains. "You can't simply take an existing Large Language Model and adapt it to read a spreadsheet – or apply strategies from analysing textual or image data to analysing tabular data – because there are fundamental differences between the way the information is expressed in these different forms."

He hopes the Fellowship will help him make progress in his research and foster broader interest in this field. He is well aware that, as an area of research, it is dwarfed by the research community around Large Language Models (LLMs).

"It used to be the case that I would reach out to other researchers about our work and ask them to join our research community," he says. "But hopefully now with the impact of the Google Fellowship, it could be the other way around and there may be more people approaching us." He sees this as a highly positive step. "This area of research is a bit unloved at the moment, but we hope one day that Large Tabular Models will be as important as Large Language Models."

Health and financial data
Currently, huge amounts of important data are stored in the form of tables, ranging from hospital patients' test results to stock prices on the world’s financial markets. So having AI tools that could analyse and understand the tables and then reason about the data and make predictions would be highly valuable. "In the case of health information, for example," Xiangjian points out, "the model could analyse historic health data about diseases and their treatments and then predict which therapies would lead to a good outcome for a patient."

Early on in his research, Xiangjian realised "that tabular data is important, and healthcare information is primarily held as tabular data". His father is a doctor, specialising in diabetes, and "diabetes is a good example of a medical condition that is quite hard to diagnose solely from images or textual descriptions. Lots of tests are needed in order to diagnose it and those results are held in the form of tabular data."

But tabular data is much harder for humans to understand than text or images. "So this is an area where we really need assistance from computers," Xiangjian says. However, AI models also currently struggle to process data from tables, not least because the data is highly heterogeneous.

As Xiangjian puts it, "the columns in a table might have different types of information in them, or be labelled with different names, and you might be working across a whole range of datasets." So a completely new model is needed.
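To make the heterogeneity concrete, here is a minimal sketch in Python. The two toy tables, their column names, and the helper functions are all hypothetical and invented for illustration; they are not drawn from Xiangjian's datasets or models. The sketch shows two sources recording similar clinical information under different column names, with a mix of numeric and categorical columns and a missing value.

```python
def infer_column_type(values):
    """Classify a column as 'numeric', 'categorical' or 'empty' from raw strings."""
    present = [v for v in values if v not in ("", None)]  # drop missing entries
    if not present:
        return "empty"
    try:
        for v in present:
            float(v)  # raises ValueError if the entry is not a number
        return "numeric"
    except (TypeError, ValueError):
        return "categorical"


def describe(table):
    """Map each column name in a list-of-dicts table to its inferred type."""
    columns = table[0].keys()
    return {c: infer_column_type([row.get(c) for row in table]) for c in columns}


# Hypothetical toy tables from two sources (illustrative values only).
hospital_a = [
    {"age": "54", "hba1c": "7.2", "smoker": "yes"},
    {"age": "61", "hba1c": "", "smoker": "no"},  # one missing test result
]
hospital_b = [
    {"patient_age": "47", "glucose_mmol_l": "6.1", "sex": "F"},
    {"patient_age": "70", "glucose_mmol_l": "9.4", "sex": "M"},
]

schema_a = describe(hospital_a)
schema_b = describe(hospital_b)
shared_names = set(schema_a) & set(schema_b)

print(schema_a)
print(schema_b)
print("Columns with identical names:", shared_names)
```

Even in this tiny case the two tables share no column names, so a model cannot rely on a fixed input layout in the way a text model relies on a shared vocabulary.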

Avoiding misleading predictions
It turns out that a Large Language Model trained to read textual data and then predict the next sentence cannot simply be asked to read a row of data from a table and accurately predict what's on the next row. "Well, it might come up with a prediction," Xiangjian laughs, "but it is very likely to provide misleading guidance for practitioners and researchers."

So how can such a tabular model be developed? Xiangjian is focusing on the causal relationships between the data in a table. "That's what really matters," he explains. "The model needs to understand the underlying relationships between the data. If it can uncover the rules behind the way a world operates, and understand them, then it can understand the world itself."

The approach he is taking to uncovering these causal relationships is statistical. "We're looking at the real values of a table and trying to discover the causal relationships across different variables through using statistical methods and techniques. Doing this, we have shown that it can help unlock a fundamental learning capability of tabular models.

Causal relationships
"And once we've uncovered the causal relations, the model can do a lot of things including reasoning about the data and making predictions. And this could lead to a tabular foundation model."
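The article does not describe Xiangjian's statistical method in detail, so the following is only an illustrative sketch of one textbook idea from constraint-based causal discovery: testing whether two variables remain correlated once a third is accounted for. The synthetic data, variable names and coefficients are assumptions made for this example. In a chain x → y → z, x and z are strongly correlated, but their partial correlation given y is close to zero, which hints that x influences z only through y.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, given):
    """Correlation between a and b after removing the linear effect of `given`."""
    r_ab = pearson(a, b)
    r_ag = pearson(a, given)
    r_bg = pearson(b, given)
    return (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))


# Synthetic table with a known causal chain x -> y -> z (assumed for illustration).
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [2 * xi + rng.gauss(0, 1) for xi in x]   # y depends on x
z = [-yi + rng.gauss(0, 1) for yi in y]      # z depends only on y

r_xz = pearson(x, z)
r_xz_given_y = partial_corr(x, z, y)

print(f"corr(x, z)     = {r_xz:.3f}")          # strongly negative
print(f"corr(x, z | y) = {r_xz_given_y:.3f}")  # close to zero
```

Real tables are far messier than this three-column example, with mixed data types, missing values and non-linear relationships, which is part of why the problem is hard.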

Receiving the Google Fellowship is a great boost to Xiangjian's confidence in his research. "Undoubtedly, this Fellowship will help make more people aware of this under-explored domain of work. And for me personally, that feels like great moral support. It's good to know that this field of research matters to other people and not just me."

His co-supervisor Mateja Jamnik agrees. "I'm very proud he has been awarded the prestigious Google Fellowship," she says. "He's ambitious, creative and technically outstanding in his work and the connections offered through the Google Fellows programme will help broaden his collaborations, his visibility, and ultimately the impact of his research." 


Published by Rachel Gardner on Monday 3rd November 2025