1:20 PM - 1:40 PM
[4H2-OS-6a-05] Effectiveness of Joint Attention in Deep Learning for Generating Language Describing Actions
Keywords: Joint attention, Word meaning acquisition
Joint attention is said to play an important role in human language learning, and recent research has explored its use for language understanding in artificial intelligence. However, previous studies have shown the effectiveness of joint attention only for mapping words to objects in still images; its use for mapping sentences to the actions of objects in image sequences (videos) has not been investigated. In this study, we designed a task that takes an image sequence depicting agents moving on a 2-D board and generates natural language sentences describing the subject and its actions. We propose a deep learning method that uses the trainer's joint attention for this task. Experimental results using synthetic joint attention show that accuracy improved significantly when joint attention was used during both training and testing, but did not improve when joint attention was used only during training.