JSAI2025

Presentation information

General Session

[3G6-GS-6] Language media processing

Thu. May 29, 2025 5:40 PM - 7:20 PM Room G (Room 1002)

Chair: Takafumi Koshinaka (Yokohama City University)

5:40 PM - 6:00 PM

[3G6-GS-6-01] A Method of Text Generation with Grammatical Constraints for Large Language Models Considering the Maximum Number of Tokens

〇Yoshio Kato1, Shuhei Tarashima1 (1. NTT Communications)

Keywords: LLM, structured generation, context-free grammar

Large Language Models (LLMs) are increasingly integrated with other systems by having them produce output in a specific format, such as JSON or a programming language. Several methods exist for constraining LLM output so that it always follows a specified grammar. However, existing methods do not account for the limit on the number of output tokens that practical use requires, so generation can be cut off midway, leaving ungrammatical output. We propose a novel method, based on LL(1) parsers, that forces LLMs to generate grammatically correct output within the maximum number of tokens. On a benchmark of JSON generation under strict constraints on the maximum number of tokens, our method improves accuracy by 20 points over an existing method, and almost all outputs follow the JSON grammar.
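The general idea of budget-aware grammar-constrained decoding can be illustrated with a minimal sketch. This is not the authors' implementation; it only shows one plausible way to combine an LL(1) parse stack with a token budget. The toy grammar (nested lists of zeros standing in for JSON), the names `TABLE`, `advance`, `allowed`, and `generate`, and the fixed `PREFERENCE` scores that stand in for LLM logits are all hypothetical. The sketch also assumes that each model token is exactly one grammar terminal, which does not hold for real tokenizers.

```python
# Toy LL(1) grammar (hypothetical stand-in for JSON):
#   V -> "0" | "[" L      L -> "]" | V R      R -> "]" | "," V R
TERMINALS = ["[", ",", "0", "]"]
TABLE = {  # LL(1) parse table: (nonterminal, lookahead) -> production
    ("V", "0"): ["0"],
    ("V", "["): ["[", "L"],
    ("L", "]"): ["]"],
    ("L", "0"): ["V", "R"],
    ("L", "["): ["V", "R"],
    ("R", "]"): ["]"],
    ("R", ","): [",", "V", "R"],
}

def min_lengths():
    """Fewest tokens each grammar symbol can expand to (fixed-point iteration)."""
    best = {t: 1 for t in TERMINALS}
    changed = True
    while changed:
        changed = False
        for (nt, _), rhs in TABLE.items():
            if all(s in best for s in rhs):
                cost = sum(best[s] for s in rhs)
                if cost < best.get(nt, float("inf")):
                    best[nt] = cost
                    changed = True
    return best

MIN_LEN = min_lengths()

def advance(stack, tok):
    """Consume one token; return the new parse stack, or None if tok is illegal."""
    stack = list(stack)
    while stack:
        top = stack.pop()
        if top in TERMINALS:
            return stack if top == tok else None
        rhs = TABLE.get((top, tok))
        if rhs is None:
            return None
        stack.extend(reversed(rhs))  # leftmost symbol ends up on top
    return None

def allowed(stack, budget, budget_aware=True):
    """Tokens that keep the parse valid and, if requested, completable in budget."""
    result = []
    for tok in TERMINALS:
        new = advance(stack, tok)
        if new is None:
            continue
        needed = sum(MIN_LEN[s] for s in new)  # tokens still required afterwards
        if not budget_aware or needed <= budget - 1:
            result.append(tok)
    return result

# Stand-in for model scores: this "model" always wants to open more brackets.
PREFERENCE = {"[": 3, ",": 2, "0": 1, "]": 0}

def generate(max_tokens, budget_aware=True, score=PREFERENCE.get):
    stack, out = ["V"], []
    if budget_aware and sum(MIN_LEN[s] for s in stack) > max_tokens:
        raise ValueError("budget too small for any grammatical output")
    while stack and len(out) < max_tokens:
        options = allowed(stack, max_tokens - len(out), budget_aware)
        tok = max(options, key=score)  # greedy choice among permitted tokens
        out.append(tok)
        stack = advance(stack, tok)
    return out

def accepts(tokens):
    """True if the token sequence is a complete sentence of the toy grammar."""
    stack = ["V"]
    for tok in tokens:
        stack = advance(stack, tok)
        if stack is None:
            return False
    return stack == []
```

With a budget of 6 tokens, `generate(6)` yields `[[[]]]`, whereas `generate(6, budget_aware=False)` keeps opening brackets until it is truncated and returns the ungrammatical `[[[[[[`. The design choice in this sketch is to mask any token after which the shortest possible completion would no longer fit in the remaining budget, so a grammatical completion always remains reachable.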
