Siragusa, I., Pirrone, R. (2024). Unipa-GPT: a framework to assess open-source alternatives to Chat-GPT for Italian chat-bots. In F. Dell'Orletta, A. Lenci, S. Montemagni, R. Sprugnoli (Eds.), CEUR Workshop Proceedings. CEUR-WS.
Unipa-GPT: a framework to assess open-source alternatives to Chat-GPT for Italian chat-bots
Irene Siragusa (first author); Roberto Pirrone (second author)
2024-12-01
Abstract
This paper illustrates the implementation of Open Unipa-GPT, an open-source version of the Unipa-GPT chat-bot that leverages open-source Large Language Models (LLMs) for embeddings and text generation. The system relies on a Retrieval Augmented Generation approach, thus mitigating hallucination errors in the generation phase. A detailed comparison between different models is reported to illustrate their performance in embedding generation, retrieval, and text generation. For text generation, models were tested in a simple inference setup after a fine-tuning procedure. Experiments demonstrate that open-source LLMs can be efficiently used for embedding generation, but none of them matches the performance of closed models such as gpt-3.5-turbo in generating answers. Corpora and code are available on GitHub.

| File | Access | Type | Size | Format |
|---|---|---|---|---|
| CLiC_it_unipa_gpt_open-v3.pdf | Archive managers only | Publisher's version | 1.61 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


