Visually-Grounded Language Model for Human-Robot Interaction

Zambuto, Daniele; Dindo, Haris; Chella, Antonio
2010-01-01

Abstract

Visually grounded human-robot interaction is recognized as an essential ingredient of socially intelligent robots, and the integration of vision and language increasingly attracts the attention of researchers in diverse fields. However, most systems lack the capability to adapt and expand beyond a preprogrammed set of communicative behaviors, and their linguistic capabilities remain far from satisfactory, which makes them unsuitable for real-world applications. In this paper we present a system in which a robotic agent learns a grounded language model by actively interacting with a human user. The model is grounded in the sense that the meaning of words is linked to the agent's concrete sensorimotor experience, and linguistic rules are automatically extracted from the interaction data. The system has been tested on the NAO humanoid robot, where it understands and generates appropriate natural language descriptions of real objects. It can also conduct a verbal interaction with a human partner in potentially ambiguous situations.
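As a purely illustrative aside (not the model described in the paper), the following minimal Python sketch shows one simple way word meanings can be grounded in perception: each word is associated with a statistical model over hypothetical visual features collected during interaction, and the robot describes a newly perceived object by choosing the words whose models best explain what it sees. All feature names, example values, and class names here are assumptions made for the sketch.

```python
import numpy as np

# Purely illustrative word-grounding sketch: each word keeps the perceptual
# feature vectors ([hue, size], both hypothetical) observed whenever the word
# was used, and models them with a diagonal Gaussian.
class GroundedWord:
    def __init__(self, word):
        self.word = word
        self.samples = []  # feature vectors seen together with this word

    def observe(self, features):
        self.samples.append(np.asarray(features, dtype=float))

    def log_likelihood(self, features):
        # Score a new feature vector under a diagonal Gaussian fit to the samples.
        data = np.stack(self.samples)
        mean = data.mean(axis=0)
        var = data.var(axis=0) + 1e-3  # variance floor avoids division by zero
        x = np.asarray(features, dtype=float)
        return float(-0.5 * np.sum((x - mean) ** 2 / var + np.log(2 * np.pi * var)))

# "Training" from hypothetical interaction episodes: a word paired with the
# perceived [hue, size] of the object the human was talking about.
lexicon = {w: GroundedWord(w) for w in ["red", "green", "small", "big"]}
lexicon["red"].observe([0.02, 0.2]);   lexicon["red"].observe([0.05, 0.9])
lexicon["green"].observe([0.33, 0.3]); lexicon["green"].observe([0.36, 0.8])
lexicon["small"].observe([0.10, 0.2]); lexicon["small"].observe([0.90, 0.25])
lexicon["big"].observe([0.10, 0.8]);   lexicon["big"].observe([0.70, 0.95])

# Generation: describe a new object with the best-matching colour and size words.
new_object = [0.34, 0.85]  # greenish hue, large apparent size
colour = max(["red", "green"], key=lambda w: lexicon[w].log_likelihood(new_object))
size = max(["small", "big"], key=lambda w: lexicon[w].log_likelihood(new_object))
print(f"The object is {colour} and {size}")  # -> "The object is green and big"
```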
Subject area: ING-INF/05 - Information Processing Systems (Sistemi di Elaborazione delle Informazioni)
Zambuto, D., Dindo, H., Chella, A. (2010). Visually-Grounded Language Model for Human-Robot Interaction. International Journal of Computational Linguistics Research, 1(3), 105-115.
Files in this item:
  • 2010_IJCLR_paper.pdf (open access, 2.83 MB, Adobe PDF)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10447/61881
Citations
  • Scopus: 16