JSAI2025

Presentation information

International Session

[4K3-IS-2f] Machine learning

Fri. May 30, 2025 2:00 PM - 3:20 PM Room K (Room 1006)

Chair: Nattawut Kertkeidkachorn

3:00 PM - 3:20 PM

[4K3-IS-2f-04] Learnability of Regular Languages in Language Models

〇Masaya Taniguchi (1), Naoki Negishi (2), Yusaku Nishimiya (4, 1), Keisuke Sakaguchi (2), Kentaro Inui (3, 2, 1)
(1. RIKEN, 2. Tohoku University, 3. Mohamed bin Zayed University of Artificial Intelligence, 4. University of Illinois Springfield)

Keywords: Formal Language, Learnability, Language Acquisition

This study explores the impact of the presentation order of positive and negative data on grammar acquisition in language models. We specifically focus on a text search problem, with the target grammar represented by a regular language. To conduct the study, we prepare two types of data: positive data, where sentences conforming to the target grammar are embedded within the text, and negative data, where such sentences are absent. Our findings demonstrate that both the sampling strategy for positive and negative data and the order in which these datasets are presented influence the language model's ability to acquire grammatical structures.
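
The data setup described above can be illustrated with a minimal sketch. The abstract does not name the target grammar, the sampling strategies, or the presentation orders that were compared, so everything concrete here is an assumption made for illustration: the regular language (ab)+c, the three-letter alphabet, rejection sampling for negative texts, and the order names "positive_first", "negative_first", and "shuffled".

```python
import random
import re

# Hypothetical target grammar: the regular language (ab)+c over {a, b, c}.
# The abstract does not specify the grammar; this one is only illustrative.
TARGET = re.compile(r"(?:ab)+c")
ALPHABET = "abc"


def random_text(rng, length):
    """Draw a uniformly random string over ALPHABET."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


def negative_example(rng, length=20):
    """Text in which no substring belongs to the target language
    (rejection sampling; an assumed strategy, not the authors')."""
    while True:
        text = random_text(rng, length)
        if TARGET.search(text) is None:
            return text


def positive_example(rng, length=20):
    """Text with one sentence of the target language embedded in filler."""
    sentence = "ab" * rng.randint(1, 3) + "c"
    filler = negative_example(rng, length - len(sentence))
    pos = rng.randint(0, len(filler))
    return filler[:pos] + sentence + filler[pos:]


def build_dataset(rng, n_pos, n_neg, order="shuffled"):
    """Return (text, label) pairs in an assumed presentation order."""
    pos = [(positive_example(rng), 1) for _ in range(n_pos)]
    neg = [(negative_example(rng), 0) for _ in range(n_neg)]
    if order == "positive_first":
        return pos + neg
    if order == "negative_first":
        return neg + pos
    if order == "shuffled":
        data = pos + neg
        rng.shuffle(data)
        return data
    raise ValueError(f"unknown order: {order}")
```

Calling `build_dataset(random.Random(0), 5, 5, order="positive_first")` yields five texts that contain a target sentence followed by five that do not. Feeding the same examples to a model under different `order` values is one way to probe the ordering effect the abstract reports.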
