JSAI2019

Presentation information

Organized Session


[4F3-OS-11b] Social Signal Processing and AI (2)

Fri. Jun 7, 2019 2:00 PM - 3:00 PM Room F (302B Medium meeting room)

Shogo Okada (Japan Advanced Institute of Science and Technology), Ryo Ishii (NTT)

2:20 PM - 2:40 PM

[4F3-OS-11b-02] Identifying Discourse Boundaries in Group Discussions using Multimodal Features

Ken Tomiyama¹, 〇Fumio Nihei¹, Yutaka Takase², Yukiko Nakano² (1. Graduate School of Science and Technology, Seikei University, 2. Faculty of Science and Technology, Seikei University)

Keywords: conversation segmentation, multimodal, group discussion

This study proposes models for detecting conversation boundaries in group discussions. In the first method, we created a multimodal embedding space using an autoencoder and applied a similarity-based approach to detect discussion boundaries. In the second method, we annotated conversation boundaries and created unimodal CNN models for language, audio, and head motion information. We then created multimodal models by concatenating the outputs of the unimodal models. In the evaluation experiment, we found that language was the most useful single modality, but that combining it with the audio and head motion modalities allowed the CNN-based models to predict conversation boundaries more accurately.
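
The similarity-based approach mentioned in the abstract can be illustrated with a minimal sketch. The abstract gives no details of the method, so everything below is an assumption made for illustration: each discussion segment is taken to be already encoded as an embedding vector (for example, by the autoencoder), neighbouring windows of segments are compared by cosine similarity, and a boundary is hypothesised where the similarity falls below a threshold. The names `detect_boundaries`, `window`, and `threshold`, and the toy vectors, are hypothetical and do not come from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def detect_boundaries(embeddings, window=2, threshold=0.5):
    """Return every index i where a boundary is hypothesised before embeddings[i].

    Assumption: a boundary is flagged when the mean embedding of the `window`
    segments before position i and the mean embedding of the `window` segments
    from i onward have cosine similarity below `threshold`.
    """
    boundaries = []
    for i in range(window, len(embeddings) - window + 1):
        left = mean_vector(embeddings[i - window:i])
        right = mean_vector(embeddings[i:i + window])
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries


# Toy 2-D embeddings: three segments on one topic, then three on another.
segments = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.2],
            [0.1, 1.0], [0.0, 0.9], [0.2, 1.0]]
print(detect_boundaries(segments))  # prints [3]
```

With these toy vectors, only the gap before the fourth segment separates two clearly dissimilar windows, so a single boundary is reported at index 3. In practice the window size and threshold would need to be tuned on annotated data.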