In recent years, the debate on applying Deep Learning to Virtual Screening has focused on the use of neural embeddings, as opposed to classical descriptors, to encode both structural and physical properties of ligands and/or targets. This attention to embeddings, together with the increasing use of Graph Neural Networks, has aimed at overcoming molecular fingerprints, which are short-range embeddings of atomic neighborhoods. Here, we present EMBER, a novel molecular embedding made of seven molecular fingerprints arranged as different “spectra” that describe the same molecule, and we demonstrate its effectiveness using a deep convolutional architecture that assesses ligands’ bioactivity on a data set containing twenty protein kinases with binding sites similar to that of CDK1. The data set itself is presented, and the architecture is explained in detail along with its training procedure. We report experimental results and an explainability analysis that assesses the contribution of each fingerprint to different targets.

Mendolia I., Contino S., De Simone G., Perricone U., Pirrone R. (2022). EMBER—Embedding Multiple Molecular Fingerprints for Virtual Screening. INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 23(4) [10.3390/ijms23042156].

EMBER—Embedding Multiple Molecular Fingerprints for Virtual Screening

Mendolia I.; Contino S.; Pirrone R.
2022-02-15

Sector ING-INF/05 - Information Processing Systems
Files in this item:
ijms-23-02156-v2.pdf (open access)
Description: Main article
Type: Publisher's version
Size: 9.9 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/537660