PhD Project 13
Updating the CADD tools SwissTargetPrediction and SwissADME to extend their use to natural compounds
Angelos Kollias
Research interest
Within the Zoete Lab, I work on computational drug discovery for natural compounds, with the aim of creating a specialized workflow for their study, focused mainly on the prediction of targets and ADME properties. This will accelerate the first in silico evaluation of hard-to-collect molecules as potential modulators of targets of interest, before they are irreversibly consumed in in vivo experiments or subjected to a demanding study of their synthetic accessibility. My research interests focus on drug discovery and its integration with multi-omics data analysis. I am excited by how the two can be combined through computational tools to help patients receive innovative therapies faster.
Background
I graduated with a Diploma in Pharmacy from the National and Kapodistrian University of Athens in 2021. Joining a group working on pharmacogenetics during my undergraduate studies sparked my interest in precision medicine. In parallel, I developed my skills in pharmaceutical bioinformatics, and after a period of working as a pharmacist, I started my Ph.D. at the University of Lausanne in January 2023.
Updating the CADD tools SwissTargetPrediction and SwissADME to extend their use to natural compounds
Objectives: This research project aims to upgrade our CADD tools to support their use with naturally occurring compounds. SwissTargetPrediction and SwissADME have been trained on different sets of drug-like molecules, mainly originating from the medicinal chemistry literature, and have therefore seen few natural compounds during training. Although a large fraction of natural compounds are drug-like according to general rules such as those of Lipinski, they exhibit somewhat different molecular properties compared to synthetic compounds (PMID: 22537178, 29711552, 11350252, 12546556). Machine-learning-based approaches trained on synthetic compounds, like SwissADME and SwissTargetPrediction, are therefore expected to perform worse on natural products and may need to be retrained using libraries of natural molecules (PMID: 34584636).
SwissTargetPrediction requires a large number of small molecules for its training, typically tens to hundreds of thousands. Its retraining will therefore rely on the large numbers of natural products currently available in the numerous open-access databases of natural molecules (PMID: 33431011; http://zinc20.docking.org). Crossing these databases with the bioactivities stored in ChEMBL and PubChem will provide us with a sufficiently large dataset of bioactivities for natural products, allowing the adaptation of SwissTargetPrediction to this class of compounds. The novel data provided by WP4 will be used as an external test set to validate the predictive ability of this alternative version of SwissTargetPrediction.
SwissADME contains several machine-learning models that predict, for instance, gastrointestinal or blood-brain barrier (BBB) crossing, P-gp activity or cytochrome inhibition. Again, these models were established on several sets of synthetic small molecules for which these properties were determined and made publicly available.
Of note, these models require a smaller number of molecules for their training (typically a few hundred diverse compounds). It could therefore be possible to directly use the experimental data produced in WP6, which will always be generated with the same well-defined and thus reproducible approaches. Such high-quality, well-defined data constitute a training set of choice for such prediction models, since they are free of the usual ‘noise’ that arises when combining information from different origins and experiments.
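To illustrate the database-crossing step described above for SwissTargetPrediction, here is a minimal sketch in Python. It is not the project's actual pipeline: the structure keys, compound names, target names, activity values and the 10 µM activity cutoff are all hypothetical placeholders, and joining on a shared structure identifier (for instance an InChIKey) is an assumption about how the databases would be matched.

```python
from collections import defaultdict

# Hypothetical natural-product library, keyed by a structure identifier
# (placeholder keys standing in for e.g. InChIKeys; not real compounds).
natural_products = {
    "NP-KEY-0001": "compound A",
    "NP-KEY-0002": "compound B",
    "NP-KEY-0003": "compound C",
}

# Hypothetical bioactivity records: (structure key, target, activity in micromolar).
bioactivities = [
    ("NP-KEY-0001", "TARGET_X", 0.8),
    ("NP-KEY-0001", "TARGET_Y", 25.0),  # too weak for the assumed cutoff
    ("NP-KEY-0002", "TARGET_X", 3.2),
    ("SYN-KEY-0009", "TARGET_Z", 0.1),  # synthetic compound, absent from the NP library
]


def cross_with_bioactivities(np_records, activity_records, max_activity_um=10.0):
    """Return {structure key: sorted targets} for natural products that have
    at least one recorded activity at or below the (assumed) cutoff."""
    targets = defaultdict(set)
    for key, target, activity_um in activity_records:
        if key in np_records and activity_um <= max_activity_um:
            targets[key].add(target)
    return {key: sorted(found) for key, found in targets.items()}


training_set = cross_with_bioactivities(natural_products, bioactivities)
```

In this toy example only compounds A and B end up in the resulting set, each annotated with TARGET_X. Compound C has no recorded activity, and the synthetic compound is filtered out because it is not part of the natural-product library.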
Importantly, feedback loops will be put in place between the predictions made using the (retrained) SwissTargetPrediction and SwissADME and the experimental data obtained in WP4 and WP6, in order to optimize the predictive ability of these tools.
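One way such a comparison between predictions and experimental data could be quantified is a top-k hit rate on an external test set: the fraction of compounds for which at least one experimentally confirmed target appears among the k highest-ranked predicted targets. The sketch below is only an illustration of that metric. The compound and target names, the choice of k and the function name are hypothetical, and nothing here is taken from the SwissTargetPrediction code base.

```python
def top_k_hit_rate(predictions, known_targets, k=5):
    """Fraction of compounds with at least one confirmed target among the
    top-k predicted targets. Compounds without experimental data are skipped."""
    evaluated = 0
    hits = 0
    for compound, ranked_targets in predictions.items():
        confirmed = known_targets.get(compound)
        if not confirmed:
            continue  # no experimental result available for this compound
        evaluated += 1
        if confirmed & set(ranked_targets[:k]):
            hits += 1
    return hits / evaluated if evaluated else 0.0


# Hypothetical ranked predictions (best first) and experimentally confirmed targets.
predictions = {
    "np_1": ["T1", "T2", "T3"],
    "np_2": ["T4", "T5", "T6"],
    "np_3": ["T7", "T8", "T9"],
}
known_targets = {
    "np_1": {"T2"},
    "np_2": {"T6"},
    "np_3": {"T1"},
}

rate_top2 = top_k_hit_rate(predictions, known_targets, k=2)
rate_top3 = top_k_hit_rate(predictions, known_targets, k=3)
```

Tracking a metric of this kind before and after retraining would give a simple, comparable measure of whether the adapted model improves on natural products.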