PhD Student
I am a PhD student at the University of Cambridge (Gonville & Caius College), where I previously completed my BA and MEng in Computer Science & Linguistics, obtaining a “starred First” and a Distinction respectively. I specialise in Machine Learning and Natural Language Processing, exploring alternatives to Transformer-based Large Language Models (LLMs). My academic work spans Machine Learning and Cognitive Science, with a focus on Explainable and Interpretable Machine Learning and on fundamental questions about the human capacity for natural language.
My research is primarily concerned with engineering more cognitively plausible Foundation Models: an emerging research paradigm that aims to enhance the cognitive capabilities of cutting-edge computational systems by training them in cognitively plausible environments. In my PhD, supervised by Professor Paula Buttery, I am working toward cognitively inspired computational systems, including general-purpose Small-Scale Language Models (SSLMs) that can outperform larger models on several NLP tasks, and techniques for adapting SSLMs to domain-specific applications.
Biography
Before my PhD, I completed the Linguistics and Computer Science Triposes at the University of Cambridge. During that time, I worked on a funded internship in the ALTA Institute with Prof Paula Buttery, Dr Andrew Caines, Dr Russell Moore and Dr Thiemo Wambsganss; as a Research Assistant on a code-switching project with Dr Li Nguyen; and as a research student with Prof Nigel Collier. My past experience includes work on Multimodal Vision-Language Models in the Language Technology Lab with Prof Nigel Collier and Fangyu Liu (now at Google DeepMind). I have probed vision-language models such as CLIP, investigating their semantic representations, and explored Nearest Neighbour algorithms for Offline Imitation Learning. I have also researched Explainable AI, Argumentation Mining, and Shortcut Learning in Natural Language Inference.
Outside of academia, I lead Per Capita Media, Cambridge University's newest independent publication, supported by a team of students and academics from Cambridge and other institutions nationwide, including the University of Oxford and the University of the Arts London. I founded the publication in 2024 with the generous support of Lady Stothard and Dr Ruth Scurr FRSL. My journalistic output has seen me work with The One Show and liaise with journalists from The Sunday Times and BBC Radio 5 Live. I am also involved in student policy think tanks as Head of Policy at The Wilberforce Society, the UK's oldest student think tank, based at the University of Cambridge, and I organise several speaker events throughout the University. In the past, I have helped organise policy events with the Editor of the BBC Russian Service and the Foreign Minister of Sri Lanka.
Research
Currently, my focus is on building small-scale Transformer-based language models and developing curriculum learning (CL) strategies inspired by frameworks from Language Acquisition; a minimal sketch of the intuition behind such curricula appears below. I am working to build scalable neural architectures that combine deep learning systems with theoretical formalisms from Cognitive Science. I am interested in Modular Deep Learning, Explainable AI and Multilingual NLP. Within Linguistics, I have interests in Typology (and typological applications in multilingual NLP), Syntactic Theory (especially Neo-Emergentism and Biolinguistics), and Morphological and Phonological Theory.
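To illustrate the general idea only (not the specific strategies developed in my research), the sketch below orders a toy corpus by a simple difficulty proxy and releases it to a learner in stages. The `difficulty` function and the stage count are illustrative assumptions; acquisition-inspired curricula typically use richer signals such as morphological complexity or surprisal.

```python
# Minimal "easy-first" curriculum learning sketch (illustrative only).
# Assumption: difficulty is approximated by sentence length in tokens.
from typing import Iterator, List

def difficulty(sentence: str) -> int:
    """Toy difficulty proxy: number of whitespace-separated tokens."""
    return len(sentence.split())

def curriculum_stages(corpus: List[str], stages: int = 3) -> Iterator[List[str]]:
    """Yield progressively larger, harder slices of the corpus.

    Each stage re-exposes everything seen so far plus harder material,
    mirroring the gradual widening of a learner's input.
    """
    ordered = sorted(corpus, key=difficulty)
    stage_size = max(1, len(ordered) // stages)
    for s in range(stages):
        yield ordered[: stage_size * (s + 1)]

if __name__ == "__main__":
    corpus = [
        "The cat sat.",
        "Dogs bark loudly at night.",
        "Because the weather improved, the expedition resumed its climb.",
    ]
    for stage, data in enumerate(curriculum_stages(corpus), start=1):
        print(f"Stage {stage}: {len(data)} sentence(s)")
```

In an actual pre-training run, each stage's slice would feed a language model's training loop; the staging schedule, rather than the model architecture, is what the curriculum controls.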
Teaching
Guest Lecturer and Teaching Assistant for L95 (ACS/Part III) Introduction to Natural Language Syntax and Parsing (Prof Paula Buttery, Dr Fermin Moscoso del Prado Martin).
Teaching Assistant for Machine Learning & Real World Data (Part IA, Computer Science Tripos).
Delivered a lecture on Language Model Evaluation and Mechanistic Interpretability (Nov 2024).
Co-organised a Phonological Theory Discussion Group with Prof Bert Vaux, 2022-23.
Part III Project Supervisor (2024-25): Natural Language Processing (BabyLM Shared Task).
Undergraduate supervisions: Machine Learning & Real World Data, Formal Models of Language.
Professional Activities
Co-organiser of the Natural Language & Information Processing (NLIP) Seminar Series, 2024.
Reviewer for the BabyLM Shared Task (CoNLL 2024, co-located with EMNLP 2024).
Collaborating with the Kinds of Intelligence Programme at the Leverhulme Centre for the Future of Intelligence (CFI) on cognitively-inspired benchmarking and interpretability.
Publications
Key publications:
Less is More: Pre-Training Cross-Lingual Small-Scale Language Models with Cognitively-Plausible Curriculum Learning Strategies.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Paula Buttery
In the BabyLM Challenge at CoNLL 2024, paper track (accepted; poster presentation).
On the Potential for Maximising Minimal Means in Transformer Language Models: A Dynamical Systems Perspective.
Suchir Salhan
In Cambridge Occasional Papers in Linguistics, Department of Theoretical & Applied Linguistics, 2023.
Other publications:
LLMs “off-the-shelf” or Pretrain-from-Scratch? Recalibrating Biases and Improving Transparency using Small-Scale Language Models.
Suchir Salhan, Richard Diehl-Martinez, Zebulon Goriely, Andrew Caines, Paula Buttery
Learning & Human Intelligence Group, Department of Computer Science & Technology, 2024