Mirtaheri, S.L., Majd, A., Shahbazian, R., Pugliese, A. (2025). Knowledge-Driven Large Language Models for Automating CVSS Score Prediction. In CheckMATE 2025 - Proceedings of the 2025 Workshop on Research on Offensive and Defensive Techniques in the Context of Man At The End (MATE) Attacks (pp. 20-28). Association for Computing Machinery, Inc [10.1145/3733817.3762699].
Knowledge-Driven Large Language Models for Automating CVSS Score Prediction
Shahbazian R.
2025-01-01
Abstract
In response to the continuous growth of software vulnerabilities, this paper introduces a hybrid artificial intelligence (AI) approach that combines large language models (LLMs) with structured cybersecurity knowledge graphs. Our method guides LLMs through explicit, expert-driven rules to predict Common Vulnerability Scoring System (CVSS) Base metrics directly from vulnerability descriptions. Experiments on five state-of-the-art models, including GPT-4o and DeepSeek, demonstrate significant gains: our knowledge-infused prompts raise the accuracy of GPT-4o from 59% to over 82% and that of DeepSeek from 48% to 78%. Moreover, we show that this integration boosts cost-efficiency for lighter models. These results highlight the practical value of merging symbolic knowledge with LLM reasoning to achieve faster, more consistent, and interpretable vulnerability assessments.

| File | Size | Format | |
|---|---|---|---|
| 3733817.3762699.pdf (open access; type: publisher's version) | 1.32 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


