JSAI2018

Presentation information

Oral presentation

[3C1-OS-14a] [Organized Session] OS-14

Thu. Jun 7, 2018 1:50 PM - 3:10 PM Room C (4F Orchid)

2:30 PM - 2:50 PM

[3C1-OS-14a-03] Predicting Important Utterance based on Fusing Verbal and Nonverbal Information

〇Fumio Nihei1, Yukiko Nakano2, Yutaka Takase2 (1. Graduate school of Seikei University, 2. Seikei University)

Keywords: multimodal information, face-to-face conversation, important utterance

Automatic meeting summarization would reduce the cost of producing minutes during or after a meeting. With the goal of establishing a method for extractive meeting summarization, we propose a multimodal fusion model that identifies the important utterances that should be included in meeting extracts of group discussions. The proposed model fuses audio, visual, motion, and linguistic unimodal models, each trained with a convolutional neural network. The verbal and nonverbal fusion model achieved an F-measure of 0.827. We also discuss the characteristics of the verbal and nonverbal models and demonstrate that they complement each other.
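
The abstract does not state how the unimodal models are fused, so the following is only a minimal sketch under stated assumptions: each unimodal model is assumed to output one importance score per utterance, fusion is assumed to be a simple average of those scores (late fusion), and the 0.5 threshold is arbitrary. The function names `fuse_scores` and `f_measure`, as well as all scores and labels, are hypothetical and are not taken from the paper. The sketch also shows how an F-measure like the one reported is computed from predicted and reference labels.

```python
from typing import Dict, List


def fuse_scores(unimodal_scores: Dict[str, List[float]]) -> List[float]:
    """Average per-utterance scores across modalities.

    Assumption: simple late fusion by averaging; the paper's actual
    fusion method is not described in the abstract.
    """
    per_modality = list(unimodal_scores.values())
    n_utterances = len(per_modality[0])
    if any(len(scores) != n_utterances for scores in per_modality):
        raise ValueError("every modality needs one score per utterance")
    return [sum(column) / len(per_modality) for column in zip(*per_modality)]


def f_measure(predicted: List[bool], actual: List[bool]) -> float:
    """Harmonic mean of precision and recall for the 'important' class."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical scores for four utterances (not data from the paper).
scores = {
    "audio": [0.9, 0.2, 0.6, 0.1],
    "visual": [0.7, 0.4, 0.3, 0.2],
    "motion": [0.8, 0.1, 0.4, 0.3],
    "linguistic": [0.6, 0.3, 0.9, 0.2],
}

fused = fuse_scores(scores)
predicted = [score >= 0.5 for score in fused]  # arbitrary threshold
gold = [True, False, True, True]  # hypothetical reference labels

print(predicted)
print(round(f_measure(predicted, gold), 3))
```

In this toy example the visual and motion scores for the third utterance are low, but the audio and linguistic scores lift the fused score above the threshold, which illustrates in a simplified way how verbal and nonverbal cues can complement each other.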