TrustBoot: A Trust Bootstrapping Framework for Semi-Supervised Malware Detection

Augello, Andrea; De Paola, Alessandra; Lo Re, Giuseppe
2026-03-01

Abstract

Malware detection is one of the most critical cybersecurity challenges today because threats evolve rapidly. Machine learning (ML) detection methods are among the most promising approaches; nevertheless, their design is not trivial because up-to-date labeled data is scarce. To keep up with emerging malware variants, ML-based detection systems must be frequently updated and retrained on recent samples. However, manual feature engineering and expert labeling and analysis are time-consuming and costly, which makes them impractical for frequent updates. This work presents TrustBoot, a semi-supervised framework for detecting malicious software that relies on exact knowledge of only a small set of trusted applications and can process a larger set of unlabeled applications. To achieve this goal, TrustBoot adopts a visual encoding of binary executables that eases the detection of anomalies related to the presence of malware. Experiments on large Android malware datasets demonstrate that the proposed pipeline achieves competitive detection performance, matching or exceeding fully supervised approaches, while substantially reducing the manual intervention needed for dataset curation and removing the reliance on labeled malicious data.
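The abstract does not say how TrustBoot builds its visual encoding. As a rough illustration of the general idea only, the sketch below maps each byte of an executable to one grayscale pixel, a common way to render binaries as images. The function name `bytes_to_grayscale`, the row width, and the zero padding are assumptions made for this example and are not taken from the paper.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 256) -> list[list[int]]:
    """Map each byte (0-255) to one grayscale pixel, filling rows left to right.

    Hypothetical helper: the row width and the zero padding of the last row
    are illustrative assumptions, not TrustBoot's documented encoding.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    # At least one row, so an empty input still yields a valid image.
    height = max(1, math.ceil(len(data) / width))
    # Pad the final row with zero bytes so every row has the same length.
    padded = data.ljust(height * width, b"\x00")
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Toy input: 16 bytes rendered as a 4x4 image.
image = bytes_to_grayscale(b"dex\n035\x00" + bytes(range(8)), width=4)
```

An image produced this way could then be passed to an anomaly detector fitted only on trusted applications, which is the kind of setting the abstract describes; the choice of detector is left open here.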
Publication date: March 2026
ISBN: 978-989-758-796-2
Augello, A., De Paola, A., Lo Re, G. (2026). TrustBoot: A Trust Bootstrapping Framework for Semi-Supervised Malware Detection. In Proceedings of the 18th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART (pp. 1936-1947) [10.5220/0014465700004052].
Files in this item:
TrustBoot- A Trust Bootstrapping Framework for Semi-Supervised Malware Detection.pdf

Access: archive managers only
Type: Publisher's version
Size: 4.63 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/702324