SAnDReS 2.0: Development of machine-learning models to explore the scoring function space

Tutone M.
2024-06-02

Abstract

Classical scoring functions may exhibit low accuracy in determining ligand binding affinity for proteins. The availability of both protein–ligand structures and affinity data makes it possible to develop machine-learning models that target specific protein systems and achieve superior predictive performance. Here, we report a new methodology named SAnDReS that combines AutoDock Vina 1.2 with 54 regression methods available in Scikit-Learn to calculate binding affinity from protein–ligand structures. This approach allows exploration of the scoring function space. SAnDReS generates machine-learning models based on crystal, docked, and AlphaFold-generated structures. As a proof of concept, we examine the performance of SAnDReS-generated models in three case studies. In all three cases, our models outperformed classical scoring functions. SAnDReS-generated models also showed predictive performance close to or better than that of other machine-learning models such as KDEEP, CSM-lig, and ΔVinaRF20. SAnDReS 2.0 is available for download at https://github.com/azevedolab/sandres.
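The idea of exploring the scoring function space can be illustrated with a minimal sketch: fit several Scikit-Learn regressors on docking-derived energy terms and rank them by error on held-out complexes. This is not the SAnDReS code and does not call its API. The feature names, the synthetic data, and the choice of four regressors are assumptions made purely for illustration; in practice the features would come from AutoDock Vina output and the targets from experimental affinity data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical energy-term features (placeholder names, synthetic values).
feature_names = ["gauss1", "gauss2", "repulsion", "hydrophobic", "hbond", "torsions"]

rng = np.random.default_rng(0)
n_complexes = 200
X = rng.normal(size=(n_complexes, len(feature_names)))

# Synthetic affinity values: a made-up linear combination plus noise.
assumed_weights = np.array([-0.4, -0.2, 0.8, -0.5, -0.9, 0.3])
y = 6.0 + X @ assumed_weights + rng.normal(scale=0.3, size=n_complexes)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# A small subset of regression methods; the paper reports using 54.
candidates = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.01),
    "RandomForestRegressor": RandomForestRegressor(n_estimators=100, random_state=0),
}

results = {}
for name, model in candidates.items():
    # Scale features, then fit the regressor on the training complexes.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    predicted = pipeline.predict(X_test)
    results[name] = {
        "rmse": float(np.sqrt(mean_squared_error(y_test, predicted))),
        "r2": float(r2_score(y_test, predicted)),
    }

# Rank the candidate models by test-set RMSE (lower is better).
best_name = min(results, key=lambda key: results[key]["rmse"])
for name, scores in sorted(results.items(), key=lambda item: item[1]["rmse"]):
    print(f"{name:24s} RMSE={scores['rmse']:.3f} R2={scores['r2']:.3f}")
print("Best model on this synthetic set:", best_name)
```

Each fitted pipeline acts as one candidate scoring function for the data set at hand, so comparing many regressors on the same features amounts to sampling different points of the scoring function space.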
2 June 2024
de Azevedo W.F., Quiroga R., Villarreal M.A., da Silveira N.J.F., Bitencourt-Ferreira G., da Silva A.D., et al. (2024). SAnDReS 2.0: Development of machine-learning models to explore the scoring function space. Journal of Computational Chemistry. doi:10.1002/jcc.27449.
Files in this record:
File: Azevedo_et_al_JCC_2024.pdf
Access: open access
Type: Pre-print
Size: 844.69 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/642676