Part II Project suggestions from Pietro Lio
Title: Using Graph Kernels in GNNs and Knowledge Graphs
Integrating data from several modalities is extremely relevant in computational biology, as it allows system-level insights to emerge. Several approaches exist for integrating multi-modal data and learning from it. Graph representation methods stand out, as graphs are a natural way to represent biological systems, which are usually composed of complex networks of interacting elements. Graph kernels provide principled ways of injecting domain knowledge into predictions; graph neural networks, however, are more flexible and can handle more complex inputs. A GNN-based model that learns to execute the high-level operations defined by a kernel could therefore leverage domain knowledge to achieve even better performance in tasks such as recommendation networks, drug discovery and the detection of gene-disease correlations, while operating on a more diverse set of data.
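As an illustration of the kind of high-level operation a graph kernel defines, the sketch below implements a minimal Weisfeiler-Lehman subtree kernel in plain Python. The specific kernel, toy graphs and node labels are illustrative choices for this sketch, not part of the project specification:

```python
from collections import Counter

def wl_relabel(adj, labels, iterations=2):
    """Weisfeiler-Lehman label refinement, a common graph-kernel primitive.
    adj: dict node -> list of neighbour nodes; labels: dict node -> label."""
    history = [Counter(labels.values())]
    for _ in range(iterations):
        new_labels = {}
        for node in adj:
            # Combine the node's label with its sorted neighbour labels
            # into a signature that acts as the refined (compressed) label.
            new_labels[node] = (labels[node],
                                tuple(sorted(labels[n] for n in adj[node])))
        labels = new_labels
        history.append(Counter(labels.values()))
    return history

def wl_kernel(adj1, labels1, adj2, labels2, iterations=2):
    """WL subtree kernel: dot product of label-count histograms per iteration."""
    h1 = wl_relabel(adj1, labels1, iterations)
    h2 = wl_relabel(adj2, labels2, iterations)
    return sum(sum(c1[lab] * c2[lab] for lab in c1) for c1, c2 in zip(h1, h2))

# Two identical toy molecular-style graphs (path C-C-O).
g1 = {0: [1], 1: [0, 2], 2: [1]}
l1 = {0: "C", 1: "C", 2: "O"}
g2 = {0: [1], 1: [0, 2], 2: [1]}
l2 = {0: "C", 1: "C", 2: "O"}
print(wl_kernel(g1, l1, g2, l2))  # → 11 (maximal self-similarity here)
```

A GNN that reproduces this neighbourhood-aggregation step can, in principle, be trained end-to-end while retaining the kernel's inductive bias.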
Tran, Van Dinh, Alessandro Sperduti, Rolf Backofen, and Fabrizio Costa. "Heterogeneous networks integration for disease–gene prioritization with node kernels." Bioinformatics 36, no. 9 (2020): 2649–2656.
Originators: Pietro Lio (pl219@cam.ac.uk), Professor Fabrizio Costa (f.costa@exeter.ac.uk)
Improving robustness of quantitative magnetic resonance imaging with XXX in subjects with Multiple Sclerosis
Multiple Sclerosis (MS) is an acquired autoimmune demyelinating disease affecting the brain and spinal cord, leading to cognitive decline and severe disability1. Subjects with MS are diagnosed and monitored using Magnetic Resonance Imaging (MRI); however, in daily clinical practice, images are reviewed only in a qualitative fashion.
Several quantitative MRI (qMRI) techniques, capable of in vivo quantification of imaging biomarkers, have been proposed to more accurately explore brain microstructure2 and metabolism3. In MS, these techniques have the potential to reveal pre-clinical inflammatory demyelination, affording a new therapeutic window. However, to translate into clinical practice, a quantitative imaging biomarker must be reproducible, robust and accurate; at present, such a biomarker remains an unmet need.
This project will focus on Quantitative Susceptibility Mapping (QSM)4,5, a qMRI technique sensitive to differences between the magnetic responses of adjacent tissues. QSM can estimate the concentration of several substances in vivo, potentially revealing pre-clinical disease progression and enabling earlier treatment.
Preliminary results from a study led by one of the co-supervisors of this project demonstrated higher variance along some clinically relevant anatomical structures and in regions adjacent to areas prone to artifacts, undermining the potential utility of the technique6. This project will use XXX to improve the robustness of qMRI data, decreasing the variance due to noise and, in turn, allowing for a more reliable quantitative imaging biomarker. The dataset is a pre-existing cohort of 100 subjects with MS and 50 healthy control subjects with already available imaging and clinical data6,7.
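The robustness analysis described above can be illustrated with the coefficient of variation, a standard per-region reproducibility metric (standard deviation over mean). The regions and susceptibility values below are hypothetical, chosen only to show the computation:

```python
import statistics

def coefficient_of_variation(values):
    """CoV = standard deviation / |mean|; a common qMRI robustness metric."""
    return statistics.stdev(values) / abs(statistics.fmean(values))

# Hypothetical susceptibility estimates (ppb) for one ROI across repeated scans.
thalamus = [52.0, 54.0, 51.0, 53.0]        # stable deep-brain structure
near_artifact = [40.0, 65.0, 30.0, 58.0]   # region adjacent to artifact-prone areas

print(coefficient_of_variation(thalamus))       # low CoV -> robust measurement
print(coefficient_of_variation(near_artifact))  # high CoV -> unreliable measurement
```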
References
1. Reich, DS et al. Multiple Sclerosis. N. Engl. J. Med. 2018; 378(2): 169–180.
2. Sowa, P et al. Restriction spectrum imaging of white matter and its relation to neurological disability in multiple sclerosis. Mult. Scler. 2019; 25(5): 687–698.
3. Grist, JT et al. Imaging intralesional heterogeneity of sodium concentration in multiple sclerosis: Initial evidence from 23Na-MRI. J. Neurol. Sci. 2018; 387: 111–114.
4. Reichenbach, JR. The future of susceptibility contrast for assessment of anatomy and function. NeuroImage 2012; 62(2): 1311–1315.
5. Deistung, A, Schweser, F & Reichenbach, JR. Overview of quantitative susceptibility mapping. NMR Biomed. 2017; 30: e3569. doi: 10.1002/nbm.3569.
6. Fiscone, C et al. Assessing robustness of quantitative susceptibility-based MRI radiomic features in patients with multiple sclerosis. Sci. Rep. 2023; 13: 1–16.
7. Fiscone, C et al. Multiparametric MRI dataset for susceptibility-based radiomic feature extraction and analysis. Sci. Data 2024; 11: 575.
Supervisors: Prof Pietro Lio’ (pl219@cam.ac.uk), Dr Fulvio Zaccagna (fz47@cam.ac.uk)
Complex diseases bioinformatics and machine learning
This project’s main focus will be on the integration of complex disease data taken from patients with Down syndrome or Multiple Sclerosis with molecular RNA sequence data and clinical data. As an extension, this will then be used to train Graph Neural Networks (GNNs) that form a ’digital twin’ of a patient. A digital twin is defined as a set of virtual information constructs that mimics the structure, context and behaviour of an individual or unique physical asset, that is dynamically updated with data from its physical twin throughout its life cycle, and that ultimately informs decisions that realize value. As modern medicine shifts towards more precise and personalised treatment plans for patients, the importance of using digital twins to inform high-stakes decisions will only grow.
A combination of RNA sequence, methylation, metagenomic and clinical data will be used for this project; all of it has already been collected and is readily available. The supervisor has already obtained all of the ethical approvals needed to use this data. The aim of the project is to integrate all of this data using a combination of deep learning and bioinformatics techniques. The project will involve analysing transcriptome and DNA methylation data of Down syndrome or Multiple Sclerosis patients to identify novel gene regulatory mechanisms associated with comorbidities in Down syndrome/Multiple Sclerosis patients. The student will implement higher-order neural networks and dynamical causal networks.
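One minimal baseline for combining such heterogeneous modalities is late fusion: standardise each modality's features across patients, then concatenate them into one feature vector per patient. The sketch below is illustrative only; every name and value is hypothetical, and the actual project would move beyond this to learned (e.g. GNN-based) integration:

```python
import statistics

def zscore(column):
    """Standardise one feature across patients (zero mean, unit variance)."""
    mean, sd = statistics.fmean(column), statistics.pstdev(column)
    return [(x - mean) / sd for x in column]

def integrate(*modalities):
    """Late fusion: z-score each modality's features across patients,
    then concatenate per patient into one combined feature vector."""
    fused = [[] for _ in range(len(modalities[0]))]
    for modality in modalities:
        # zip(*modality) iterates over feature columns across patients.
        for column in zip(*modality):
            for patient, value in enumerate(zscore(list(column))):
                fused[patient].append(value)
    return fused

# Hypothetical data: rows are patients, columns are features.
rna_seq     = [[120.0, 5.1], [80.0, 4.9], [150.0, 6.0]]  # expression levels
methylation = [[0.82], [0.41], [0.77]]                   # beta values
clinical    = [[63.0, 1.0], [48.0, 0.0], [71.0, 1.0]]    # age, relapse flag

fused = integrate(rna_seq, methylation, clinical)
print(len(fused), len(fused[0]))  # → 3 5 (3 patients x 5 combined features)
```

Per-modality standardisation prevents high-magnitude features (e.g. raw expression counts) from dominating low-magnitude ones (e.g. methylation beta values) in the fused representation.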
Details and References from Pietro Lio’, pl219@cam.ac.uk
Proposal for Enhancing Usability of ERnet through Advanced Pre-processing
Background:
ERnet (https://pubmed.ncbi.nlm.nih.gov/36997816/), built upon the robust foundation of the Swin Transformer architecture, is a cutting-edge solution for segmentation of the endoplasmic reticulum (ER), excelling particularly on super-resolved ER structures.
Problem Statement:
ERnet achieves high performance on high-quality image inputs, such as SIM images and high-SNR widefield or confocal data. However, widefield and confocal data of poor imaging quality, such as low SNR, remain incompatible with the ERnet processing framework.
Aim:
To address this challenge, our objective is to design and implement a sophisticated pre-processor capable of transforming lower-quality images into high-definition equivalents. Leveraging denoising methodologies, such as diffusion models, will be pivotal in achieving this transformation.
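As a toy illustration of the pre-processing idea (not the diffusion-model approach the project would actually pursue), the sketch below adds Gaussian noise to a synthetic flat image and shows that even a simple local mean filter reduces the error with respect to the clean image:

```python
import random

def mean_filter(img, k=1):
    """Denoise by replacing each pixel with the mean of its (2k+1)^2 window."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[ny][nx]
                      for ny in range(max(0, y - k), min(h, y + k + 1))
                      for nx in range(max(0, x - k), min(w, x + k + 1))]
            out[y][x] = sum(window) / len(window)
    return out

def mse(a, b):
    """Sum of squared pixel differences between two images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

random.seed(0)
clean = [[1.0] * 16 for _ in range(16)]  # flat synthetic region
noisy = [[p + random.gauss(0, 0.3) for p in row] for row in clean]

print(mse(mean_filter(noisy), clean) < mse(noisy, clean))  # → True
```

A learned denoiser such as a diffusion model plays the same role as `mean_filter` here, but can restore structure rather than merely averaging it away.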
Overall Direction:
By integrating principles of super-resolution from computer vision, our intent is to upgrade widefield and confocal data inputs, thus broadening the applicability and usability of ERnet. Through this enhancement, we aim to make ERnet a universally compatible tool, ensuring researchers can harness its capabilities irrespective of their data quality.
Validation: against the current version (and other tools, such as Segment Anything) and against ground truth
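Validation against ground truth for segmentation is typically quantified with overlap metrics such as the Dice coefficient; a minimal sketch with hypothetical binary masks:

```python
def dice(pred, truth):
    """Dice coefficient between two binary masks given as flat lists of 0/1."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 2 * intersection / total if total else 1.0

# Hypothetical masks: model segmentation vs. manual ground truth.
prediction   = [1, 1, 0, 0, 1, 0]
ground_truth = [1, 0, 0, 0, 1, 1]

print(dice(prediction, ground_truth))  # → 0.6666666666666666
```

The same metric can compare the pre-processed pipeline's output against both the current ERnet version and alternative tools on shared ground-truth annotations.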
Supervisors: Pietro Lio' (pl219@), Edward Ward (ew535@), Meng Lu (ml600@)