Department of Computer Science and Technology

A natural language processing PhD student created a TikTok video on how to calculate the best first word in Wordle – the online game of the moment – and it went viral, attracting more than one million views. So now he's followed it up with another video to explain the maths behind his thinking. 

Wordle is a highly addictive game where players must guess a mystery five-letter word in a maximum of six attempts. To make things harder, they are offered no hints (such as an initial letter) on where to start, but must simply dive in with an opening guess. 

Only once they start guessing do they receive guidance on whether they have picked any correct letters, or put them in the correct places in the word. Wordle has been a phenomenon, attracting hundreds of thousands of players and recently being bought for a seven-figure sum by the New York Times.

One of the many people attracted to Wordle was Zébulon Goriely. Zeb earned both his Bachelor’s and Master’s degrees in computer science here, with a First in each, and has stayed on to do a PhD in natural language processing. His research focuses on using computers to model language in the brain, so when Wordle players took to social media to discuss how to tackle the puzzle, "the whole discussion was absolutely fascinating to me," he says.

One of those players was a linguist who posts on TikTok as @linguisticdiscovery. He shared a video suggesting that to calculate the best starting word, players "need to take into account the frequency of letters in English". He recommended using 'irate' as a starting word because 'e', 'a' and 'i' are the most common vowels in English, and 't' and 'r' are the most common consonants.

It's been amazing to see people get excited about how computer science and linguistics can be applied to solving this puzzle.

Zeb, however, disagreed with this approach. "Calculating letter frequencies, or setting scores based on the position of words, get really close – but not quite there, because they’re all just using heuristics." (Heuristics are mental shortcuts used in decision-making such as trial and error, a rule of thumb, or an educated guess.)

Zeb wrote a program to demonstrate that another approach could work better and posted a TikTok video about it. "If you want to know what the best first word is," he says in the video, "have an existential crisis and waste a whole day of your PhD course writing a program that finds the word that minimises the average number of possible words left after guessing that word."

And the answer is…? "The best possible first word in Wordle is 'roate'. If you guess that, you go from 2,000 possible answers down to 60."

His video took off and has now been viewed over one million times. And so as a follow-up, Zeb created another video explaining the mathematics behind his program.

Weighted averages
He explains that "to find the average number of words left after each possible guess, I multiplied the probability of each pattern that the game can give (the green, gold and black squares) for each guess by the number of words left after getting that pattern, and added those all together to give a weighted average.

"Doing this calculation, I found 'roate' on average narrows down the possible answers from 2,000 to 60."
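The weighted average Zeb describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not his actual program: the function names (feedback, expected_remaining, best_first_guess) and the tiny word lists are hypothetical, and the real calculation would run over the game's full lists. Patterns are written as strings of 'g' (green), 'y' (gold) and 'b' (black).

```python
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Return the colour pattern for a guess: 'g' green, 'y' gold, 'b' black."""
    pattern = ["b"] * len(guess)
    # Answer letters that were not matched exactly can still earn gold squares.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
    for i, g in enumerate(guess):
        if pattern[i] == "b" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)


def expected_remaining(guess: str, answers: list[str]) -> float:
    """Weighted average of the number of answers left after this guess."""
    counts = Counter(feedback(guess, a) for a in answers)
    total = len(answers)
    # A pattern seen for n answers has probability n / total and leaves n words.
    return sum((n / total) * n for n in counts.values())


def best_first_guess(guesses: list[str], answers: list[str]) -> str:
    """Pick the guess with the smallest weighted average."""
    return min(guesses, key=lambda g: expected_remaining(g, answers))


# Hypothetical toy lists, far smaller than the game's real ones.
toy_answers = ["crane", "crate", "trace", "slate", "stale", "least"]
toy_guesses = toy_answers + ["roate"]
print(best_first_guess(toy_guesses, toy_answers))
```

Assuming every answer is equally likely, the probability of a pattern is just the share of answers that produce it, which is why each pattern's count appears twice in the sum.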

His video has sparked lots of responses and ideas on how Zeb could improve his program. And he agrees there is room for improvement. As he admits, "by using the list of the 2,000 possible answers from the game’s code, I am 'cheating' a little bit. A better approach would be to weigh each answer by its occurrence in a huge corpus such as Wikipedia or Google Books.

"Another suggestion is that this is a 'greedy algorithm' – i.e. that I’m assuming at each step that the best attempt is to minimise the number of words remaining. In reality, the 60 words left behind by 'roate' might not be as easy to narrow down further compared to another word. A better attempt would be to search deeper, finding the best pair of words."
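One way to read "searching deeper" is to score a first guess by how few answers are left after the best available second guess, chosen separately for each pattern the first guess might produce. The sketch below takes that reading as an assumption; the helper names (partition, two_step_score) and the toy word lists are hypothetical, and it is not the method used in any of the videos mentioned here.

```python
from collections import Counter, defaultdict


def feedback(guess: str, answer: str) -> str:
    """Return the colour pattern for a guess: 'g' green, 'y' gold, 'b' black."""
    pattern = ["b"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
    for i, g in enumerate(guess):
        if pattern[i] == "b" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)


def partition(guess: str, words: list[str]) -> dict[str, list[str]]:
    """Group the candidate answers by the pattern the guess would produce."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for word in words:
        buckets[feedback(guess, word)].append(word)
    return buckets


def expected_remaining(guess: str, words: list[str]) -> float:
    """One-step (greedy) score: weighted average of answers left."""
    buckets = partition(guess, words)
    return sum(len(b) ** 2 for b in buckets.values()) / len(words)


def two_step_score(first: str, guesses: list[str], answers: list[str]) -> float:
    """Weighted average of answers left after the best follow-up guess."""
    total = len(answers)
    score = 0.0
    for bucket in partition(first, answers).values():
        # For each pattern, pick the second guess that splits its bucket best.
        best_second = min(expected_remaining(g, bucket) for g in guesses)
        score += (len(bucket) / total) * best_second
    return score


# Hypothetical toy lists, far smaller than the game's real ones.
toy_answers = ["crane", "crate", "trace", "slate", "stale", "least"]
toy_guesses = toy_answers + ["roate"]
for first in toy_guesses:
    print(first, expected_remaining(first, toy_answers),
          two_step_score(first, toy_guesses, toy_answers))
```

Comparing the two columns shows how a first guess that looks weaker after one step can catch up once the follow-up guess is taken into account. The cost grows quickly with the size of the word lists.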

It was also noted that Zeb could use the 'entropy' of each guess, rather than the average number of answers left. "If I did that calculation, 'soare' is actually the best first guess (still greedy), with 'roate' in a close second place," Zeb says.
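The entropy alternative swaps the weighted average for the Shannon entropy of the pattern distribution, so the best guess becomes the one with the highest score rather than the lowest. A minimal sketch, assuming all answers are equally likely and using hypothetical function names and toy word lists:

```python
import math
from collections import Counter


def feedback(guess: str, answer: str) -> str:
    """Return the colour pattern for a guess: 'g' green, 'y' gold, 'b' black."""
    pattern = ["b"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "g"
    for i, g in enumerate(guess):
        if pattern[i] == "b" and unmatched[g] > 0:
            pattern[i] = "y"
            unmatched[g] -= 1
    return "".join(pattern)


def pattern_entropy(guess: str, answers: list[str]) -> float:
    """Shannon entropy, in bits, of the patterns this guess can produce."""
    counts = Counter(feedback(guess, a) for a in answers)
    total = len(answers)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def best_by_entropy(guesses: list[str], answers: list[str]) -> str:
    """Pick the guess whose pattern distribution carries the most information."""
    return max(guesses, key=lambda g: pattern_entropy(g, answers))


# Hypothetical toy lists, far smaller than the game's real ones.
toy_answers = ["crane", "crate", "trace", "slate", "stale", "least"]
toy_guesses = toy_answers + ["roate"]
print(best_by_entropy(toy_guesses, toy_answers))
```

A guess that splits the answers into many evenly sized groups scores highest. The two measures usually rank guesses similarly but not identically, which is consistent with 'soare' and 'roate' swapping places.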

He has enjoyed the discussion of all these approaches and the responses of others to his program.

Solving Wordle using information theory
"There’s a fantastic video by the YouTube channel 3Blue1Brown that addresses all three of these critiques with fantastic visualisations of information theory," Zeb says. "He found 'soare' to maximise entropy for the first step – but looked two steps in to find 'slane' to be better, while 'salet' performed even better when actually running the game on all possible answers. I definitely recommend his two videos for anyone interested in looking deeper into the information theory behind solving the game: Solving Wordle using information theory and Oh, wait, actually the best Wordle opener is not 'crane'…"

And what about @linguisticdiscovery, the linguist who first sparked Zeb’s interest and inspired him to write his program? 

"He recently made a fantastic video," Zeb says, "where he discusses how knowing the best first word computationally isn’t necessarily that helpful for humans as we don’t think like computers! We operate with a huge variety of heuristics and biases – such as being better at finding words when the consonants are known, rather than when the vowels are known. He recommended 'stern' – as it contains the most common consonants."


Published by Rachel Gardner on Monday 21st February 2022