Date:
Tuesday, 3 March, 2026 - 15:00 to 16:00
Speaker:
Dr. Michail Mamalakis (University of Cambridge)
Venue:
Computer Laboratory, William Gates Building, Room LT1
In this seminar, we will briefly discuss basic terminology in mechanistic interpretability, examine different sparse autoencoder (SAE) architectures and techniques beyond SAEs, and explore examples of LLMs applied in neuroscience. We will also look at how mechanistic interpretability and attribution methods can be combined to identify new patterns.
Seminar series:
Foundation AI
