In this paper we introduce UniQA, a high-quality Question-Answering data set comprising more than 1k documents and nearly 14k QA pairs. UniQA was generated in a semi-automated manner from data retrieved from the website of the University of Palermo, covering information about the bachelor's and master's degree courses for the academic year 2024/2025. The data are in both Italian and English, making the data set suitable for QA and translation models. To assess the data, we propose a Retrieval Augmented Generation model based on Llama-3.1-instruct. UniQA can be found at https://github.com/CHILab1/UniQA.

Irene Siragusa, Roberto Pirrone (2024). UniQA: an Italian and English Question-Answering Data Set Based on Educational Documents. In G. Bonetta, C.D. Hromei, L. Siciliani, M.A. Stranisci (eds.), CEUR Workshop Proceedings. CEUR-WS.

UniQA: an Italian and English Question-Answering Data Set Based on Educational Documents

Irene Siragusa (first author); Roberto Pirrone (second author)
2024-12-01

Dec 2024
Sector IINF-05/A - Information Processing Systems
Files in this record:
UniQA.pdf (open access; type: publisher's version; format: Adobe PDF; size: 1.92 MB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/678364