JSAI2023

Presentation information

General Session


[4A2-GS-6] Language media processing

Fri. Jun 9, 2023 12:00 PM - 1:40 PM Room A (Main hall)

Chair: 三田 雅人 (CyberAgent) [On-site]

1:20 PM - 1:40 PM

[4A2-GS-6-05] Style-sensitive Sentence Vectors for Evaluating Similarity in Speech Style by Contrastive Learning

〇Yuki Zenimoto1, Shinzan Komata1, Takehito Utsuro1 (1. University of Tsukuba)

Keywords: Speech Style, Sentence Embedding, Contrastive Learning, Speaker Classification

Because dialogue systems are required to keep their speech style consistent, evaluating the similarity of speech styles is an important task. However, Japanese has a wide variety of speech styles, and the range of vocabulary and word usage characteristic of each style is vast, which makes speech style difficult to evaluate. We therefore propose a speech style embedding model that generates style-sensitive vectors. The model is constructed by fine-tuning a pre-trained BERT model with contrastive learning. The sentence pairs with similar and different speech styles that contrastive learning requires are collected automatically and on a large scale from sequences of sentences in web novels. We also analyze how speech styles group together, along with the characteristic vocabulary and word usage of each style, using Ward's hierarchical clustering method. Finally, we focus on how the speech style of the same person varies with the situation, and analyze the variation in the style-sensitive vectors of the same character within a novel.
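The pair collection and contrastive objective described in the abstract can be illustrated with a minimal sketch. The abstract does not give the sampling rule or the loss, so everything here is an assumption made for illustration: adjacent sentences in one novel are treated as similar in style, a sentence from a different novel is treated as different in style, and a triplet hinge loss on cosine distance stands in for the contrastive objective. The names `collect_triplets` and `triplet_loss` are hypothetical, and plain Python lists stand in for BERT sentence vectors.

```python
import math
import random


def collect_triplets(novels, seed=0):
    """Build (anchor, positive, negative) triplets from sentence sequences.

    Assumption (not stated in the abstract): adjacent sentences in the same
    novel share a speech style, and a sentence drawn from a different novel
    has a different one.
    """
    rng = random.Random(seed)
    triplets = []
    for i, sentences in enumerate(novels):
        # Candidate negatives: every sentence from every other novel.
        others = [s for j, novel in enumerate(novels) if j != i for s in novel]
        if not others:
            continue
        for anchor, positive in zip(sentences, sentences[1:]):
            triplets.append((anchor, positive, rng.choice(others)))
    return triplets


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def triplet_loss(anchor, positive, negative, margin=0.5):
    """Hinge loss on cosine distance (an assumed stand-in objective).

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin`.
    """
    d_pos = 1.0 - cosine(anchor, positive)
    d_neg = 1.0 - cosine(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In an actual fine-tuning setup the vectors would come from the encoder being trained, and the loss would be minimized over many such triplets; this sketch only shows the shape of the data and of the objective.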
