Triple Negative Breast Cancer (TNBC), defined by the absence of Estrogen Receptor (ER), Progesterone Receptor (PR), and ERythroBlastic oncogene B (ERBB2) expression, is one of the most aggressive Breast Cancer (BC) subtypes. Its resistance to conventional therapies and its high invasiveness pose clinical challenges. In the absence of a definitive standard treatment, Artificial Intelligence (AI) holds promise for advancing pharmaceutical research. Here, we present the development of the BREAST model (Figure 1), designed to predict personalized anticancer therapies for TNBC patients from their proteomic profiles. Our approach used genomic, proteomic, and clinical data from The Cancer Genome Atlas (TCGA) [2] and The Cancer Proteome Atlas (TCPA) [3] to correlate omics signatures with treatment outcomes. A key effort was devoted to cross-matching the datasets at the patient level, ensuring that each patient ID was accurately aligned across all data types. The methodology employed several Machine Learning (ML) techniques, such as Random Forest (RF), Support Vector Machine (SVM), Neural Networks (NN), and Gradient Boosting (GB), evaluated with 5-fold cross-validation using metrics such as the Area Under the Curve (AUC) and the 1% Enrichment Factor (EF1%). The models achieved AUC values of 0.980 ± 0.017 (SVM), 0.998 ± 0.002 (RF), 0.989 ± 0.009 (XGBoost), and 0.984 ± 0.011 (LR). Feature importance analysis highlighted proteins known to be involved in TNBC, supporting the biological plausibility of the model. These findings pave the way for the future application of the BREAST model in hospital settings, with the ultimate goal of supporting oncologists in selecting personalized therapies for TNBC patients.
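The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy data are hypothetical, and the enrichment factor uses the common convention (hit rate in the top-scoring fraction divided by the overall hit rate), which is assumed here because the abstract does not spell out its definition.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def enrichment_factor(
    labels: Sequence[int], scores: Sequence[float], fraction: float = 0.01
) -> float:
    """Hit rate among the top-scoring `fraction` divided by the overall hit rate."""
    n = len(labels)
    hits_all = sum(labels)
    if hits_all == 0:
        raise ValueError("need at least one positive label")
    # Size of the top-scoring subset, rounded and never smaller than one sample.
    n_top = max(1, int(round(fraction * n)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(y for _, y in ranked[:n_top])
    return (hits_top / n_top) / (hits_all / n)


# Hypothetical toy data: 200 samples, 20 positives, all ranked at the top.
labels = [1] * 20 + [0] * 180
scores = [1.0 - i / 200 for i in range(200)]

print("AUC  :", roc_auc(labels, scores))
print("EF1% :", enrichment_factor(labels, scores, fraction=0.01))
```

In a 5-fold cross-validation setting, both functions would be applied to the held-out predictions of each fold, and the per-fold values summarized as mean ± standard deviation.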
Bono, A., La Monica, G., Alamia, F., Lauria, A., Martorana, A. (2025). BREAST “Breast canceR Enhanced AI Supported Therapy”: AI-based machine learning algorithms for triple-negative breast cancer personalized treatment. In Proceedings of the Merck Young Chemist’s Symposium 2025, XXIV Ed. L. Alessandroni, M. Bonomo, C. Brondi, A. Cappitti, A. Cerrato, M. Da Pian, A. Dall'Anese, D. Del Giudice, G. Falcone, A. C. Maccari, Marotta, F. Massabò, A. Massaro, M. Mendolicchio, M. Miciaccia, E. Palazzi, L. Paterlini, L. Riva, A. Romero, E. Rossi (p. 151).
BREAST “Breast canceR Enhanced AI Supported Therapy”: AI-based machine learning algorithms for triple-negative breast cancer personalized treatment
Alessia Bono (first author); Gabriele La Monica (second author); Federica Alamia; Antonino Lauria (penultimate author); Annamaria Martorana (last author)
2025-12-02
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


