JSAI2025

Presentation information

General Session

[4N1-GS-7] Vision, speech media processing

Fri. May 30, 2025 9:00 AM - 10:40 AM Room N (Room 1009)

Chair: 早川 大智 (Toshiba)

9:00 AM - 9:20 AM

[4N1-GS-7-01] Accent detection for television speech sound

〇Marina Mikami1, Takuya Matuzaki1 (1. Tokyo University of Science)

Keywords: Accent Dictionary

This study aims to build an accent dictionary for low-frequency words by automatically detecting the accent position of words in TV speech data, where low-frequency words such as proper nouns and newly coined words appear often.
First, the fundamental frequency (F0) of each phoneme, the mora pronunciation, the mora position within the word, and the part of speech were extracted from the speech data as features, and a classifier was built that takes them as input. The model was trained on speech data from the Corpus of Spontaneous Japanese (CSJ) and then used to predict the accent positions of nouns appearing in LaboroTVSpeech. The accuracy of the resulting accent dictionary was only 77-86%.
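
The feature set named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the features are encoded or which classifier is used, so everything below is an assumption: the `Mora` record, the `word_features` helper, the per-mora mean over voiced phonemes, and the example values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Mora:
    """One mora of a word (hypothetical layout, not from the paper)."""
    text: str            # mora pronunciation, e.g. "me"
    f0_hz: List[float]   # F0 of each phoneme in Hz; 0.0 marks an unvoiced phoneme


def word_features(morae: List[Mora], pos: str) -> List[Dict[str, Union[str, int, float]]]:
    """Build one feature row per mora from the four feature types in the abstract."""
    rows = []
    for index, mora in enumerate(morae):
        # Average only over voiced phonemes; fall back to 0.0 if none are voiced.
        voiced = [value for value in mora.f0_hz if value > 0]
        mean_f0 = sum(voiced) / len(voiced) if voiced else 0.0
        rows.append({
            "mora": mora.text,        # mora pronunciation
            "position": index + 1,    # 1-based mora position within the word
            "pos": pos,               # part of speech of the word
            "mean_f0": mean_f0,       # summary of the per-phoneme F0 values
        })
    return rows


# Made-up example word with three morae; the F0 values are illustrative only.
example = [
    Mora("a", [200.0]),
    Mora("me", [210.0, 190.0]),
    Mora("ga", [0.0, 150.0]),
]
rows = word_features(example, pos="noun")
```

Rows of this kind could be passed to any standard classifier that predicts, for each word, which mora carries the accent; the choice of model is left open here because the abstract does not name one.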
