Automated, accurate Common Vulnerability Scoring System (CVSS) labeling is required for quick patch processing. Large Language Models (LLMs) have shown impressive capabilities in understanding and generating human language; however, their performance varies with factors such as training data and architecture. LLMs may generate biased or irrelevant responses and mis-rank critical flaws. This paper presents CIVS, a collective-intelligence framework that fuses GPT-4 with a fine-tuned GPT-3.5-Turbo via weighted aggregation and ensemble learning. CIVS can match or surpass the accuracy and cost-efficiency of a single large, expensive model while reducing risk and improving reliability. Evaluated on recent records from the National Vulnerability Database (NVD), CIVS reduces mean-squared error by 10% and raises macro-F1 to 0.76 relative to the strongest individual model. CIVS remains robust even when challenged with GPT-generated “what-if” variations of vulnerability descriptions. Because it reuses existing models without adding any new trainable parameters, the framework remains cost-efficient while still generalizing to previously unseen vulnerabilities.
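
The abstract names weighted aggregation of two model outputs and reports mean-squared error and macro-F1, but gives no formulas. The following is a minimal sketch of what such a fusion and evaluation could look like, not the authors' implementation. The function names, the default weight, and the sample scores are hypothetical. The severity bands follow the standard CVSS v3.x qualitative rating scale.

```python
from typing import Sequence


def severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


def fuse(score_a: float, score_b: float, weight_a: float = 0.5) -> float:
    """Weighted average of two predicted base scores (weight_a is an assumed,
    hypothetical parameter), clamped to [0, 10] and rounded to one decimal."""
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    fused = weight_a * score_a + (1.0 - weight_a) * score_b
    return round(min(10.0, max(0.0, fused)), 1)


def mse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean-squared error between predicted and reference scores."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)


def macro_f1(pred: Sequence[str], truth: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either input."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(pred, truth))
    f1_values = []
    for label in sorted(set(pred) | set(truth)):
        tp = sum(1 for p, t in pairs if p == label and t == label)
        fp = sum(1 for p, t in pairs if p == label and t != label)
        fn = sum(1 for p, t in pairs if p != label and t == label)
        denom = 2 * tp + fp + fn
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)


if __name__ == "__main__":
    # Hypothetical scores from two models for three vulnerabilities.
    model_a = [8.0, 5.0, 9.6]
    model_b = [6.0, 5.4, 9.9]
    reference = [7.5, 5.3, 9.8]
    fused = [fuse(a, b, weight_a=0.5) for a, b in zip(model_a, model_b)]
    print("fused scores:", fused)
    print("MSE:", mse(fused, reference))
    print("macro-F1:", macro_f1([severity(s) for s in fused],
                                [severity(s) for s in reference]))
```

In practice the weight would be tuned on held-out NVD records rather than fixed at 0.5. The sketch only illustrates that fusing outputs in this way adds no new trainable model parameters.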

Mirtaheri, S.L., Shahbazian, R., Pascucci, V., Movahedkor, N., Pugliese, A. (2025). CIVS: A Collective-Intelligence Ensemble for Automated Software Vulnerability Scoring. IEEE Access, 13. doi:10.1109/ACCESS.2025.3622663.

CIVS: A Collective-Intelligence Ensemble for Automated Software Vulnerability Scoring

Shahbazian R.
2025-10-17

Files in this item:
File: CIVS_A_Collective-Intelligence_Ensemble_for_Automated_Software_Vulnerability_Scoring.pdf
Access: open access
Type: Publisher's version
Size: 2.11 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/692277