
Department of Computer Science and Technology

  • PhD Student

I am a PhD student at the University of Cambridge (Gonville & Caius College). I previously completed my BA and MEng in Computer Science & Linguistics at Gonville & Caius College, obtaining a “starred First” and a Distinction respectively. I specialise in Machine Learning and Natural Language Processing, exploring alternatives to Transformer-based Large Language Models (LLMs). My academic work spans Machine Learning and Cognitive Science, with a focus on Explainable and Interpretable Machine Learning, and fundamental questions about the human capacity for natural language. 

My research is primarily concerned with engineering scalable, data-efficient Small Language Models (specifically hybrid state-space/Transformer architectures) and cognitively-inspired AI. This emerging research paradigm aims to enhance the cognitive capabilities of state-of-the-art computational systems under cognitively plausible training conditions.

In my PhD, supervised by Professor Paula Buttery, I am working toward cognitively-inspired computational systems: leveraging insights from human cognition to benchmark and interpret state-of-the-art Language Models, and to build more adaptive Language Models for small-scale data regimes.

Biography

Before my PhD, I completed the Linguistics and Computer Science Triposes at the University of Cambridge, where I had the opportunity to work on a funded internship in the ALTA Institute with Prof Paula Buttery, Dr Andrew Caines, Dr Russell Moore and Dr Thiemo Wambsganss, as a Research Assistant on a code-switching project with Dr Li Nguyen, and as a research student with Prof Nigel Collier. My past experience includes work on Multimodal Vision-Language Models in the Language Technology Lab with Prof Nigel Collier and Fangyu Liu (now at Google DeepMind). I have probed vision-language models, such as CLIP, investigating their semantic representations, and explored Nearest Neighbour Algorithms for Offline Imitation Learning (IL). I have also researched Explainable AI, Argumentation Mining, and Shortcut Learning in Natural Language Inference.

Outside of academia, I lead Per Capita Media, Cambridge University's newest independent publication, supported by a team of students and academics from Cambridge and other institutions nationwide, including the University of Oxford and the University of the Arts London. I founded the publication in 2024 with the generous support of Lady Stothard and Dr Ruth Scurr FRSL. My journalistic work has included collaborating with The One Show and liaising with journalists from The Sunday Times and BBC Radio 5 Live. I am also involved in student policy think tanks as Head of Policy at The Wilberforce Society, the UK's oldest student think tank, based at the University of Cambridge, and I organise several speaker events throughout the University. In the past, I have helped organise policy events with the Editor of the BBC Russian Service and the Foreign Minister of Sri Lanka.

Research

Small Language Models: The viability of 'Small LMs' as a coherent research programme depends on addressing questions of efficiency, acceleration, and architecture. There is growing recognition that the quadratic computational complexity of self-attention in Transformers is suboptimal in various respects. In particular, I am interested in leveraging small LMs for multilingual NLP and domain-specific applications.
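
As a rough illustration of the complexity contrast that motivates interest in state-space-style alternatives (a self-contained sketch written for this page, not an excerpt from any particular model of mine; function names and parameters are illustrative only), the snippet below compares naive single-head self-attention, whose score matrix grows quadratically with sequence length, with a toy linear recurrent scan that keeps a fixed-size state per step.

import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Naive single-head self-attention: the (n x n) score matrix makes
    time and memory grow quadratically in the sequence length n."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])           # (n, n) pairwise scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v

def linear_recurrent_scan(x, A, B, C):
    """Toy state-space-style recurrence: a single fixed-size hidden state
    is updated per step, so total time grows linearly in n."""
    h = np.zeros(A.shape[0])
    out = []
    for x_t in x:                                     # n steps, O(n) overall
        h = A @ h + B @ x_t
        out.append(C @ h)
    return np.stack(out)

# Tiny usage example with hypothetical dimensions.
n, d, s = 8, 4, 6
x = np.random.randn(n, d)
Wq = Wk = Wv = np.eye(d)
A, B, C = 0.9 * np.eye(s), np.random.randn(s, d), np.random.randn(d, s)
print(self_attention(x, Wq, Wk, Wv).shape, linear_recurrent_scan(x, A, B, C).shape)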
 

Cognitively-Inspired AI: The emergent capabilities of Transformers are the subject of a great deal of interpretability work; however, there is a clear mismatch between human language acquisition (which is data-efficient in many regards) and the data hunger of Transformers. I am particularly invested in research questions that draw on insights from language acquisition to guide architectural alternatives to 'vanilla' Transformers. Currently, my focus is on building small-scale Transformer-based language models and developing curriculum learning (CL) strategies inspired by Language Acquisition frameworks.
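
As a loose illustration of what a curriculum learning strategy can look like in practice (a deliberately generic sketch, not the specific acquisition-inspired strategies I am developing; the names are hypothetical), the snippet below orders training sentences from 'easier' to 'harder' using surface length as a crude difficulty proxy before forming mini-batches.

def curriculum_order(sentences, difficulty=len):
    """Sort training examples by a difficulty proxy (here: raw length),
    so the model sees simpler inputs before more complex ones."""
    return sorted(sentences, key=difficulty)

def curriculum_batches(sentences, batch_size):
    """Yield mini-batches in curriculum order."""
    ordered = curriculum_order(sentences)
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]

# Usage example with a toy corpus.
corpus = ["the cat sat", "dogs bark", "a remarkably long and winding sentence about syntax"]
for batch in curriculum_batches(corpus, batch_size=2):
    print(batch)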

I also have interests in Reinforcement Learning, the Theory of Deep Learning, and Geometric Deep Learning. Within Linguistics, I have particular interests in Typology (and typological applications in multilingual NLP), Syntactic Theory (especially Neo-Emergentism and Biolinguistics), and Morphological and Phonological Theory.

Teaching

Guest Lecturer and Teaching Assistant

Guest Lecturer and Teaching Assistant for L95 (ACS/Part III) Introduction to Natural Language Syntax and Parsing (Prof Paula Buttery, Dr Fermin Moscoso del Prado Martin).

Teaching Assistant for Machine Learning & Real World Data (Part IA, Computer Science Tripos).

Delivered a lecture on Language Model Evaluation and Mechanistic Interpretability (Nov 2024). 

Thesis & Research Supervision

Supervisor for MPhil Dissertation on Small Language Models (Vision-Language Models) and Learning Dynamics. 

Other

Co-organised a Phonological Theory Discussion Group with Prof Bert Vaux, 2022-23. 

 

Supervisions

Machine Learning and Bayesian Inference (Part II, Computer Science Tripos)

Formal Models of Language (Part IB, Computer Science Tripos)

Artificial Intelligence (Part IB, Computer Science Tripos)

Probability (Part IA, Computer Science Tripos)

Professional Activities

Co-organiser of the Natural Language & Information Processing (NLIP) Seminars 2024. 

Reviewer for the CoNLL BabyLM Shared Task (in EMNLP 2024).

Collaborating with the Kinds of Intelligence Programme at the Leverhulme Centre for the Future of Intelligence (CFI) on cognitively-inspired benchmarking and interpretability.

Publications

Key publications: 

Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Paula Buttery
In CoNLL BabyLM Challenge (Paper Track), 2024 (Accepted, Poster)

On the Potential for Maximising Minimal Means in Transformer Language Models: A Dynamical Systems Perspective.
Suchir Salhan
In Cambridge Occasional Papers in Linguistics, Department of Theoretical & Applied Linguistics, 2023

Other publications: 

LLMs “off-the-shelf” or Pretrain-from-Scratch? Recalibrating Biases and Improving Transparency using Small-Scale Language Models.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Andrew Caines, Paula Buttery
Learning & Human Intelligence Group, Department of Computer Science & Technology, 2024

Contact Details

Room: GS08
Office address: Gonville & Caius College, Trinity St, Cambridge CB2 1TA
Email: sas245@cam.ac.uk