Institutional research archive of the Università degli Studi di Palermo

Lo Bosco, G., Pilato, G., Schicchi, D. (2018). A Neural Network model for the Evaluation of Text Complexity in Italian Language: a Representation Point of View. PROCEDIA COMPUTER SCIENCE, 145, 464-470 [10.1016/j.procs.2018.11.108].

A Neural Network model for the Evaluation of Text Complexity in Italian Language: a Representation Point of View

Lo Bosco, Giosué; Pilato, Giovanni; Schicchi, Daniele

2018-01-01

Abstract

The goal of a text simplification (TS) system is to create a new text suited to the characteristics of a reader, with the final goal of making it more understandable. Building an Automatic Text Simplification (ATS) system cannot be separated from a correct evaluation of text complexity. In fact, the ATS must be able to determine whether or not a text should be simplified for the target reader. In a previous work we presented a model capable of classifying Italian sentences based on their complexity level. Our model is a Long Short-Term Memory (LSTM) Neural Network capable of autonomously learning the features of easy-to-read and complex-to-read sentences from an annotated corpus created specifically for text simplification. In this paper we further investigate the role of the text representation, i.e. how different ways of representing the input text can affect the accuracy of the proposed system. In detail, we use our Neural Network model to evaluate sentence complexity using different kinds of representations, such as GloVe, Word2vec, FastText and a new one based on a representation learning scheme.
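The comparison described in the abstract, one fixed recurrent classifier fed by interchangeable word representations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the class `TinyLSTMClassifier`, the helper `embed`, the layer sizes, and the toy lookup tables `TABLE_A` and `TABLE_B`, which merely stand in for pretrained GloVe, Word2vec or FastText vectors. The network is untrained (random weights), so the scores it prints carry no meaning; the point is only the data flow.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_matrix(rows, cols, rng):
    # Small random weights; a real model would learn these from the corpus.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


class TinyLSTMClassifier:
    """Hypothetical, untrained single-layer LSTM with a sigmoid output unit."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # applied to the concatenation [x_t, h_{t-1}].
        self.w = {g: make_matrix(hidden_dim, input_dim + hidden_dim, rng) for g in "ifoc"}
        self.b = {g: [0.0] * hidden_dim for g in "ifoc"}
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]

    def predict(self, vectors):
        """Return a score in (0, 1) for a sequence of word vectors."""
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in vectors:
            z = list(x) + h
            pre = {g: [a + b for a, b in zip(matvec(self.w[g], z), self.b[g])] for g in "ifoc"}
            i_gate = [sigmoid(v) for v in pre["i"]]
            f_gate = [sigmoid(v) for v in pre["f"]]
            o_gate = [sigmoid(v) for v in pre["o"]]
            cand = [math.tanh(v) for v in pre["c"]]
            # Standard LSTM state update.
            c = [f * cv + i * g for f, cv, i, g in zip(f_gate, c, i_gate, cand)]
            h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
        return sigmoid(sum(w * hv for w, hv in zip(self.w_out, h)))


def embed(sentence, table, dim):
    """Map each lowercased token to its vector; unknown tokens become zero vectors."""
    return [table.get(tok, [0.0] * dim) for tok in sentence.lower().split()]


# Toy, hand-made tables standing in for two different pretrained embeddings.
TABLE_A = {"il": [0.1, 0.0, 0.2], "gatto": [0.7, 0.3, 0.1], "dorme": [0.2, 0.9, 0.4]}
TABLE_B = {"il": [0.0, 0.1, 0.0], "gatto": [0.4, 0.8, 0.6], "dorme": [0.9, 0.1, 0.3]}

model = TinyLSTMClassifier(input_dim=3, hidden_dim=4)
score_a = model.predict(embed("Il gatto dorme", TABLE_A, 3))
score_b = model.predict(embed("Il gatto dorme", TABLE_B, 3))
print(score_a, score_b)
```

Holding the classifier fixed and swapping only the lookup table isolates the effect of the representation, which is the kind of comparison the abstract describes. A realistic experiment would instead load pretrained vectors, train the network on the annotated corpus, and compare accuracy across representations.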

Date: 2018
Journal title: PROCEDIA COMPUTER SCIENCE
DOI: https://dx.doi.org/10.1016/j.procs.2018.11.108
Appears in the following types: 1.01 Journal article

Files in this record:

File	Size	Format
1-s2.0-S1877050918323962-main.pdf (open access)	347.74 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/328126
