[4N1-GS-3] Knowledge utilization and sharing: social applications

Fri. Jun 17, 2022 10:00 AM - 11:40 AM Room N (Room 501)

Chair: Jun Ichikawa (Shizuoka University) [On-site]

10:40 AM - 11:00 AM

[4N1-GS-3-03] A Large Scale Web-Based Study of Japanese Vocabulary Size Estimation Test

Based on Word Familiarity Database, Reiwa edition

〇Sanae Fujita¹, Tessei Kobayashi¹ (1. NTT)


Keywords: word familiarity, vocabulary size estimation, usage analysis

We surveyed word familiarity and constructed the Word Familiarity Database, Reiwa edition, which contains about 163,000 words. By selecting test words according to their familiarity, we can estimate a person's approximate vocabulary size simply by asking whether they know each of a small number of words. We created a vocabulary-size estimation test based on this database and have made it available on the Web since June 4, 2020. In the nearly two years since its release, the total number of users has exceeded 70,000. In this paper, we introduce a method for selecting test words and propose a new method for vocabulary-size estimation. In addition, we analyze the estimation results using Web logs. In particular, we show how vocabulary size changes with age and how the three released tests differ.
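
The general idea of estimating vocabulary size from a handful of yes/no answers can be illustrated with a minimal Python sketch. This is not the estimation method proposed in the paper, which the abstract does not describe. It assumes a word list already sorted from most to least familiar, picks test words at evenly spaced ranks, and scales the fraction of known words up to the size of the list. The function names `select_test_words` and `estimate_vocabulary_size` are hypothetical.

```python
def select_test_words(ranked_words, n_items):
    """Pick n_items test words at evenly spaced familiarity ranks.

    ranked_words is assumed to be sorted from most to least familiar.
    Each chosen word sits in the middle of an equal-sized block of ranks.
    """
    if not 0 < n_items <= len(ranked_words):
        raise ValueError("n_items must be between 1 and len(ranked_words)")
    stride = len(ranked_words) / n_items
    indices = [int(stride * (i + 0.5)) for i in range(n_items)]
    return [ranked_words[i] for i in indices]


def estimate_vocabulary_size(responses, total_words):
    """Scale the fraction of known test words up to the full word list.

    responses holds one boolean per test word (True means "I know it").
    This simple proportional estimate is an illustrative assumption.
    """
    if not responses:
        raise ValueError("responses must not be empty")
    return round(sum(responses) * total_words / len(responses))


# Usage with a placeholder word list (not real database entries).
ranked = [f"w{i}" for i in range(1000)]
test_words = select_test_words(ranked, 10)
answers = [True] * 6 + [False] * 4
estimate = estimate_vocabulary_size(answers, len(ranked))
```

Because the test words are spread across the whole familiarity range, each answer stands in for a block of similarly familiar words, which is why a short test can give a rough estimate for a large list.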

