
Department of Computer Science and Technology

My current research interests lie at the intersection of machine learning, representation learning, and AI applications in healthcare. More specifically, I am interested in (1) the design of learning algorithms that are able to construct explanations for their predictions in terms of high-level concepts, and (2) the broad applications that such algorithms may have in scenarios where transparency is not optional (such as in healthcare) and on-site human feedback may be available.

Research

I am broadly interested in the following four core research subfields within AI and ML:

  1. Explainable Artificial Intelligence (XAI)
  2. Interpretable Deep Neural Architectures
  3. Concept-based Explainability
  4. Representation Learning

Teaching

  • Lent 2023: Explainable Artificial Intelligence (MPhil Submodule for 255: Advanced Topics in Machine Learning)
  • Michaelmas and Lent 2021 - Present: Discrete Mathematics

MPhil/Part III Project Proposals

Below are some projects I would be delighted to supervise during the 2023-2024 academic year! If any of these sound interesting to you, or you would like to discuss some variations of these ideas, please feel free to contact me.

Project 1 -- Ain’t Nobody Got Time For That: Budget-aware Concept Intervention Policies

This project will be co-supervised by Prof Mateja Jamnik and Dr Zohreh Shams

As Deep Neural Networks (DNNs) continue to outperform competing methods in a growing number of fields, there has been growing concern regarding the ethical and legal use of DNNs in sensitive tasks (e.g., healthcare). These concerns have inspired the development of interpretable-by-construction neural architectures that explain their predictions using “high-level concepts” (e.g., “has paws”, “has whiskers”, etc.). A crucial framework for constructing these architectures is that of Concept Bottleneck Models (CBMs) [1]. CBMs are neural architectures which can be decomposed into two sequential subcomponents: (i) a concept encoder network which maps input features to a vector (i.e., a bottleneck) in which each activation represents whether a specific concept is “off” or “on” (e.g., one of the neurons could represent “has whiskers” while another represents “has tail”), and (ii) a label predictor network which maps the concept bottleneck to an output label for a task of interest (e.g., whether the image is that of a “dog” or a “cat”). The utility of these models arises from the fact that, when they predict a sample’s task label, they must always first generate a bottleneck of concept activations which serves as an explanation for the CBM’s output prediction (e.g., if the CBM predicted that an image has a “cat” in it, a possible concept explanation is that it found the concepts “whiskers”, “long ears”, and “paws” to be active in the input image).
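
For concreteness, a minimal sketch of this two-stage decomposition in PyTorch is shown below. The backbone, layer sizes, and module names are illustrative assumptions rather than the exact architecture used in [1].

    import torch
    import torch.nn as nn

    class ConceptBottleneckModel(nn.Module):
        def __init__(self, n_concepts: int, n_tasks: int):
            super().__init__()
            # (i) Concept encoder: maps input features to one logit per
            #     concept (e.g., "has whiskers", "has tail").
            self.concept_encoder = nn.Sequential(
                nn.Flatten(),
                nn.Linear(3 * 64 * 64, 256),  # assumes 64x64 RGB inputs
                nn.ReLU(),
                nn.Linear(256, n_concepts),
            )
            # (ii) Label predictor: maps the concept bottleneck to task
            #      logits (e.g., "cat" vs "dog").
            self.label_predictor = nn.Sequential(
                nn.Linear(n_concepts, 128),
                nn.ReLU(),
                nn.Linear(128, n_tasks),
            )

        def forward(self, x):
            # The bottleneck holds one probability per concept ("off"/"on").
            concept_probs = torch.sigmoid(self.concept_encoder(x))
            task_logits = self.label_predictor(concept_probs)
            return concept_probs, task_logits

In practice, both subcomponents are supervised during training: a concept loss is applied to the bottleneck and a task loss to the final prediction.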

A key property of CBMs, and the core property we will explore in this project, is that they enable expert “concept interventions” at test time: during inference, an expert interacting with the CBM can analyse the concept explanation it generates for a prediction and can then correct one or more mispredicted concepts before passing the updated bottleneck to the CBM’s label predictor. This enables the CBM to potentially update its original prediction in light of the expert knowledge provided at test time, leading to significant improvements in performance when deployed in conjunction with an expert [1, 2]. Further work [3], however, has shown that the order in which concepts are intervened on at test time can significantly affect how effective those interventions are. Therefore, recent research [4] has begun to consider designing intervention policies that indicate which concepts one should request from a user to maximise their potential impact on the output prediction. Nevertheless, these frameworks are still preliminary and have only explored greedy policies, which have been shown to lag behind known optimal policies (even optimal greedy ones).
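
In code, an intervention only touches the bottleneck and the label predictor; the concept encoder’s output is left as-is. A minimal sketch, building on the hypothetical CBM class above (the function name and call pattern are assumptions for illustration):

    import torch

    def intervene(model, x, corrections):
        # `corrections` maps a concept index to the expert-provided
        # ground-truth value (0.0 or 1.0) for that concept.
        concept_probs, _ = model(x)
        updated = concept_probs.clone()
        for concept_idx, true_value in corrections.items():
            updated[:, concept_idx] = true_value
        # Only the label predictor is re-run; the concept encoder is untouched.
        return model.label_predictor(updated)

    # Example: an expert marks concept 3 (say, "has whiskers") as active.
    # new_task_logits = intervene(cbm, image_batch, {3: 1.0})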

In this project, you will explore how to design and learn such intervention policies so that you can (1) close the gap between the resulting policy and a known optimal policy, and (2) incorporate real-world constraints into the intervention process, such as cost budgets and uncertainty (real-world experts don’t have infinite patience, certainty, or time!). An initial research direction may involve exploring non-greedy intervention policies (say, via deep reinforcement learning) that can take into account predefined “intervention budgets” (the number of interventions an expert may be able to afford) to produce a list of concepts whose values may help reduce the CBM’s predictive uncertainty the most. For inspiration, you can start by looking at the references provided below to read about some recent developments in this area as well as some related work in active feature acquisition [5].
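
As a rough point of reference for what a learnt policy has to beat, the sketch below implements a naive greedy, budget-aware baseline that repeatedly intervenes on whichever remaining concept most reduces the entropy of the task prediction. It reuses the hypothetical CBM sketch above; the function and variable names are assumptions, and ground-truth concept values stand in for real expert answers.

    import torch

    @torch.no_grad()
    def greedy_uncertainty_policy(model, x, budget, concept_truths):
        # Greedy, budget-aware baseline: at every step, intervene on the
        # remaining concept whose correction most reduces the entropy of the
        # task prediction, until the intervention budget runs out.
        def entropy(logits):
            probs = torch.softmax(logits, dim=-1)
            return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1).mean()

        bottleneck, _ = model(x)
        bottleneck = bottleneck.clone()
        intervened = []
        for _ in range(min(budget, bottleneck.shape[1])):
            best_concept, best_entropy = None, float("inf")
            for c in range(bottleneck.shape[1]):
                if c in intervened:
                    continue
                candidate = bottleneck.clone()
                candidate[:, c] = concept_truths[:, c]
                h = entropy(model.label_predictor(candidate))
                if h < best_entropy:
                    best_concept, best_entropy = c, h
            bottleneck[:, best_concept] = concept_truths[:, best_concept]
            intervened.append(best_concept)
        return intervened, model.label_predictor(bottleneck)

A non-greedy, learnt policy (e.g., one trained with deep reinforcement learning) would instead plan over the whole budget rather than picking one concept at a time.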

Through this project we hope a candidate will (i) develop their knowledge within the fields of explainable AI (XAI), reinforcement learning, and representation learning, all highly relevant and growing fields of study; (ii) get hands-on practice with code development as well as deep learning architecture design, training, and deployment (highly desirable in both academia and industry); and (iii) get comfortable with analysing and designing interpretable architectures, which can be of practical use in critical, sensitive tasks where interpretability is paramount.

The ideal candidate for this project will have a strong background in deep learning and mathematics (or a strong drive and the mathematical maturity to pick up these concepts quickly). Some familiarity with traditional XAI (feature importance methods, saliency maps, prototype explainability, etc), deep reinforcement learning, or concept learning/representation learning is a big plus.

References

[1] Koh, Pang Wei, et al. "Concept bottleneck models." International Conference on Machine Learning. PMLR, 2020.

[2] Espinosa Zarlenga, Mateo, et al. "Concept embedding models: Beyond the accuracy-explainability trade-off." Advances in Neural Information Processing Systems 35 (2022): 21400-21413.

[3] Shin, Sungbin, et al. "A closer look at the intervention procedure of concept bottleneck models." International Conference on Machine Learning. PMLR, 2023.

[4] Chauhan, Kushal, et al. "Interactive concept bottleneck models." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 37. No. 5. 2023.

[5] Li, Yang, and Junier Oliva. "Active feature acquisition with generative surrogate models." International Conference on Machine Learning. PMLR, 2021.


Project 2 -- Train it, Extend it, Retrain it: Iterative Concept Discovery

This project will be co-supervised by Prof Mateja Jamnik

As Deep Neural Networks (DNNs) continue to outperform competing methods in a growing number of fields, there has been growing concern regarding the ethical and legal use of DNNs in sensitive tasks (e.g., healthcare). These concerns have inspired the development of interpretable-by-construction neural architectures that explain their predictions using “high-level concepts” (e.g., “has paws”, “has whiskers”, etc.). Concept Embedding Models (CEMs) [1] are a recent example of such neural architectures, in which a DNN first learns a set of “concept embeddings” that represent the activation or inactivation of known concepts and then uses those embeddings to predict a task of interest. The utility of these models arises from the fact that at inference time they first predict a set of concept embeddings whose semantic alignment can be used to explain the CEM’s output prediction (e.g., if the CEM predicted that an image has a “cat” in it, a possible concept explanation is that it first predicted embeddings indicating that the concepts “whiskers”, “long ears”, and “paws” are active in the input image).
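
To make the architecture concrete, the sketch below shows one plausible CEM-style forward pass: each concept gets an “active” and an “inactive” embedding, a shared scoring function predicts the concept’s probability, and the two embeddings are mixed by that probability before being fed to the label predictor. Layer sizes, the backbone, and names are illustrative assumptions, not the exact published architecture of [1].

    import torch
    import torch.nn as nn

    class ConceptEmbeddingModel(nn.Module):
        def __init__(self, n_concepts: int, n_tasks: int, emb_size: int = 16):
            super().__init__()
            # Shared backbone producing a latent code (assumes 64x64 RGB inputs).
            self.backbone = nn.Sequential(
                nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU(),
            )
            # One "concept active" and one "concept inactive" embedding per concept.
            self.pos_embedders = nn.ModuleList(
                nn.Linear(256, emb_size) for _ in range(n_concepts))
            self.neg_embedders = nn.ModuleList(
                nn.Linear(256, emb_size) for _ in range(n_concepts))
            # Shared scorer turning a pair of embeddings into a concept probability.
            self.concept_scorer = nn.Linear(2 * emb_size, 1)
            self.label_predictor = nn.Linear(n_concepts * emb_size, n_tasks)

        def forward(self, x):
            h = self.backbone(x)
            concept_probs, mixed_embeddings = [], []
            for pos, neg in zip(self.pos_embedders, self.neg_embedders):
                c_pos, c_neg = pos(h), neg(h)
                p = torch.sigmoid(self.concept_scorer(torch.cat([c_pos, c_neg], dim=-1)))
                # Mix the two embeddings according to the predicted probability.
                mixed_embeddings.append(p * c_pos + (1 - p) * c_neg)
                concept_probs.append(p)
            task_logits = self.label_predictor(torch.cat(mixed_embeddings, dim=-1))
            return torch.cat(concept_probs, dim=-1), task_logits

Whether the learnt embedding space also encodes concepts never annotated at train time is exactly the question this project asks.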

A crucial limitation of CEMs is that one requires a set of concept annotations at train time to learn/align concept embeddings correctly. Such annotations, however, are often costly to obtain or, in some circumstances, even impossible, as the complete set of relevant concepts for a given task is sometimes ill-defined. Nevertheless, it has been hypothesized that CEMs may be capturing concepts not provided at train time as part of their learnt embedding spaces [1]. If true, this would allow the construction of more complete explanations using concepts not included during training, provided such concepts can be extracted from the learnt concept embeddings and assigned semantics via some post-hoc expert analysis. In this project, you will explore this precise question by trying to understand whether unseen concepts are indeed encoded as part of the concept embeddings generated by a CEM and, if so, how such concepts may be extracted and used to construct more complete explanations for a CEM’s predictions. In particular, this project will explore whether these discovered concepts can be iteratively reintroduced into a CEM’s training process as training-time concepts after they have been discovered through some post-hoc analysis (e.g., via clustering or dimensionality reduction). If successful, this will enable the creation of more interpretable models and the ability to discover valuable concepts beyond those provided as training annotations. For inspiration, you can start by looking at the references provided below to read about recent developments in this area and related work in which models similar to CEMs were shown to automatically discover some concepts in tabular domains [2].
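
As a rough starting point, the post-hoc analysis could be as simple as clustering a trained CEM’s learnt embeddings for one concept over a probe dataset and checking whether the resulting clusters line up with concepts unseen at train time. The sketch below only illustrates that idea; the helper name and shapes are hypothetical, not an existing API.

    import numpy as np
    from sklearn.cluster import KMeans

    def discover_candidate_concepts(concept_embeddings, n_candidates=5):
        # `concept_embeddings` is an (n_samples, emb_size) array holding one
        # learnt concept's embedding for every sample in a probe dataset.
        # Each returned cluster is a candidate "discovered" concept, to be
        # inspected (and named) by an expert before being reintroduced as a
        # training-time annotation.
        return KMeans(n_clusters=n_candidates, n_init=10).fit_predict(concept_embeddings)

    # Example with hypothetical shapes (embeddings collected from a trained CEM):
    # embeddings = np.random.randn(1000, 16)
    # candidate_labels = discover_candidate_concepts(embeddings)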

Through this project we hope a candidate will (i) develop their knowledge within the fields of explainable AI (XAI) and representation learning, both highly relevant and growing fields of study; (ii) get hands-on practice with code development as well as deep learning architecture design, training, and deployment (highly desirable in both academia and industry); and (iii) get comfortable with analysing and designing interpretable architectures, which can be of practical use in critical, sensitive tasks where interpretability is paramount.

The ideal candidate for this project will have a strong background in deep learning and mathematics (or a strong drive and the mathematical maturity to pick up these concepts quickly). Some familiarity with traditional XAI (feature importance methods, saliency maps, prototype explainability, etc) or concept learning/representation learning is a big plus.

References

[1] Espinosa Zarlenga, Mateo, et al. "Concept embedding models: Beyond the accuracy-explainability trade-off." Advances in Neural Information Processing Systems 35 (2022): 21400-21413.

[2] Espinosa Zarlenga, Mateo, et al. "TabCBM: Concept-based Interpretable Neural Networks for Tabular Data." Transactions on Machine Learning Research (2023).

[3] Oikarinen, Tuomas, et al. "Label-Free Concept Bottleneck Models." ICLR (2023).

[4] Kim, Eunji, et al. "Probabilistic Concept Bottleneck Models." ICML (2023).

Publications

  • Mateo Espinosa Zarlenga, Katherine M. Collins, Krishnamurthy Dj Dvijotham, Adrian Weller, Zohreh Shams, Mateja Jamnik. (2023). "Learning to Receive Help: Intervention-Aware Concept Embedding Models." To appear in Advances in Neural Information Processing Systems 36 (2023).
  • Mateo Espinosa Zarlenga, Zohreh Shams, Michael Edward Nelson, Been Kim, and Mateja Jamnik. (2023). "TabCBM: Concept-based Interpretable Neural Networks for Tabular Data." Transactions on Machine Learning Research.
  • Pietro Barbiero, Gabriele Ciravegna, Francesco Giannini, Mateo Espinosa Zarlenga, Charlotte Lucie Magister, Alberto Tonda, Pietro Lio, Frederic Precioso, Mateja Jamnik, and Giuseppe Marra. (2023). "Interpretable Neural-Symbolic Concept Reasoning." Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:1801-1825.
  • Katherine M Collins, Matthew Barker*, Mateo Espinosa Zarlenga*, Naveen Raman*, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, and Krishnamurthy Dvijotham. (2023). "Human Uncertainty in Concept-Based AI Systems." Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society.
  • Mateo Espinosa Zarlenga*, Pietro Barbiero*, Zohreh Shams*, Dmitry Kazhdan, Umang Bhatt, Adrian Weller, and Mateja Jamnik. (2023). “Towards Robust Metrics for Concept Representation Evaluation”. Proceedings of the AAAI Conference on Artificial Intelligence 37 (10):11791-99.
  • Mateo Espinosa Zarlenga*, Pietro Barbiero*, Gabriele Ciravegna, Giuseppe Marra, Francesco Giannini, Michelangelo Diligenti, Zohreh Shams, Frederic Precioso, Stefano Melacci, Adrian Weller, Pietro Lio, and Mateja Jamnik. (2022). "Concept embedding models: Beyond the accuracy-explainability trade-off." Advances in Neural Information Processing Systems 35 (2022): 21400-21413.
  • Mateo Espinosa Zarlenga, Zohreh Shams, and Mateja Jamnik. (2021). "Efficient decompositional rule extraction for deep neural networks." NeurIPS 2021 Workshop on eXplainable AI approaches for debugging and diagnosis (XAI4Debugging).

Contact Details

Room: FE14
Office address: Computer Laboratory, 15 JJ Thomson Ave, Cambridge CB3 0FD
Office phone: (01223) 7-63533
Email: me466@cam.ac.uk