JSAI2025

Presentation information

International Session

[3K4-IS-2a] Machine learning

Thu. May 29, 2025 1:40 PM - 3:20 PM Room K (Room 1006)

Chair: Ziwei Xu

2:00 PM - 2:20 PM

[3K4-IS-2a-02] Cross-Lingual Finetuning in Large Language Models

〇Jude McCutcheon1 (1. Apprhythm Co., Ltd.)

Keywords:LLM, Finetuning, Cross-lingual

Large Language Models (LLMs) have set new benchmarks in various fields, achieving higher task performance with smaller training datasets. This success is largely attributed to the pretrain-finetune paradigm: models are first pretrained on extensive unlabeled corpora to develop general language understanding and then fine-tuned on smaller labeled datasets for specific tasks. While effective for many languages and tasks, this approach remains challenging for lower-resource languages, where labeled task data is scarce. Even Japanese, a higher-resource language, is held back by the relative scarcity of task-specific datasets. However, leveraging the wealth of English-language resources through cross-lingual training offers a promising solution. This study investigates the cross-lingual generalization capabilities of LLMs by fine-tuning a monolingual English model and its continually pretrained Japanese counterpart on English task datasets and evaluating them on comparable Japanese tasks. Our findings reveal that much of the task-specific knowledge imparted during fine-tuning transfers across language boundaries, suggesting that cross-lingual fine-tuning is a powerful strategy for enhancing LLM performance in lower-resource languages.
