
Department of Computer Science and Technology

Work on using AI to help computers solve mathematical problems is being presented at a major international conference this week – including the finding that AI systems can now learn mathematics in English (in addition to formal logic), which is improving their problem-solving powers.

Autoformalisation with Large Language Models is being presented at the NeurIPS ('Neural Information Processing Systems') 2022 conference in New Orleans this week. The paper’s findings offer a step forward towards the much sought-after goal of autoformalisation, says co-author Professor Mateja Jamnik, Professor of Artificial Intelligence here.

Autoformalisation refers to the task of automatically translating problems from natural language mathematics to a formal language that a computer can understand so it can then prove (or verify) them.

Using machines to do this is much desired as the task of translation is costly and arduous for humans. (When philosophers Alfred North Whitehead and Bertrand Russell produced their three-volume work Principia Mathematica a century ago, it took them the best part of 400 pages to prove that "1+1=2".)
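In a modern proof assistant, the same fact is a one-line proof. A minimal illustration, written here in Lean syntax for readability (the systems discussed in the paper target Isabelle):

```lean
-- In Lean 4, 1 + 1 = 2 holds by definition for natural numbers,
-- so the proof is simply reflexivity.
theorem one_plus_one : 1 + 1 = 2 := rfl
```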

Over the last 70 years, there have been many attempts to develop a successful system for formalising and proving mathematical problems using computers. Because computers do not understand the mix of written text and mathematical notation that mathematicians use, problems have had to be translated into formal language by hand before a computer could check them logically.

However, "despite decades of research on automation and proof assistants," says Albert Qiaochu Jiang, a PhD student working with Mateja Jamnik and the second author on the paper, "writing formal proofs remains arduous." Hence this work tries to find new solutions to the problem that leverage the power of AI.

"In new research from this Department, we leveraged state-of-the-art machine learning models, enabling AI systems to do mathematics in English as well as in logic," says Mateja Jamnik.

The researchers, including collaborators from Google and Stanford, focused on using Codex, a large language model that had been trained on billions of words to accurately predict text.

It was used to take high school maths problems and translate them from natural language into formal code that computers can use.

The approach proved unexpectedly effective. "We made the surprising observation," says Yuhuai Wu from the research team, "that large language models can correctly translate a significant portion (25.3%) of the problems into a format that is compatible with Isabelle." Isabelle is a formal mathematical proof assistant.
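To make the translation task concrete, here is a hypothetical example of the kind of informal-to-formal pairing involved, shown in Lean syntax for readability (the paper's actual target language is Isabelle):

```lean
-- Informal statement: "Show that for any natural number n,
-- the sum of n and 0 equals n."
-- A formal counterpart an autoformalisation system might produce:
theorem n_add_zero (n : Nat) : n + 0 = n := rfl
```

Once a statement is in this machine-readable form, a proof assistant can mechanically verify any proposed proof of it.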

And what’s exciting, says Wenda Li, a research associate in this department and the third author on the paper, is that when the theorems that had been translated in this way were subsequently proved with machine learning models, it improved the models’ success rates.

They found that the autoformalised theorems helped the machine learning models raise the state-of-the-art success rate on the MiniF2F theorem-proving benchmark from 29.6% to 35.2%.

And in their paper, the researchers argue that this is important because a successful autoformalisation system could unlock the potential of AI for the fields of formal verification, program synthesis, and mathematics.

"You can verify whether a piece of software is doing exactly what you asked it to do," Jiang recently told the New Scientist, "or you can verify hardware chips. So it has applications in financial trading algorithms and hardware design."

A machine capable of mathematical reasoning might be able to do other things as well – including finding new mathematics that humans haven’t yet thought of. 

His interest in this has led Jiang to his current PhD research here. And this week, three papers co-authored by him are being presented at NeurIPS.  

As well as the autoformalisation paper, he is the first author on another paper – Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs – which demonstrates that writing mathematical proofs in English and translating them to a computer language drastically increases the number of problems computers can solve.

And he is the first author on Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. This paper investigates integrating state-of-the-art machine learning models with decades-old symbolic tools to develop better mathematical-reasoning tools.

"Building a machine that could find new mathematics is the ultimate aim for all this," Jiang says. "In order to find new mathematics, you have to prove and verify them. And making a better mathematical-reasoning machine is a way of helping us get to that goal in the end."


Published by Rachel Gardner on Thursday 1st December 2022