[4Xin2-29] Multilingual Comparison of Mathematical Reasoning in Large Language Models
Keywords: LLM, MATH, multilingual comparison
Large Language Models (LLMs) perform poorly at mathematical reasoning when relying on natural language alone, and improve when code generation or external tools are integrated. Strengthening natural-language-only mathematical reasoning should therefore raise overall performance, but no established method for doing so exists. Although LLMs are trained primarily on English, they have begun to show improved performance on some tasks when used in other languages, suggesting that mathematical reasoning might also benefit, though this has not been confirmed. This study investigates how language affects the mathematical reasoning abilities of LLMs. Using GPT, a prominent LLM, we run mathematical tasks in five different languages and compare their accuracy. Surprisingly, some tasks achieved higher accuracy in languages other than English. These results offer new insight into the mathematical reasoning capabilities of LLMs.
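The abstract does not specify the evaluation procedure, so the following is only a minimal sketch of the kind of per-language accuracy comparison it describes, assuming the OpenAI Chat Completions API. The model name, prompt wording, language set, task items, and exact-match scoring are all illustrative placeholders, not the authors' actual setup.

```python
# Sketch of a multilingual math-accuracy comparison (hypothetical setup,
# not the paper's actual method). Requires OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

# Hypothetical per-language task sets: each item pairs a problem statement
# (already translated into the target language) with its expected answer.
tasks = {
    "en": [("What is 17 * 23?", "391")],
    "ja": [("17 × 23 はいくつですか？", "391")],
    # ... remaining languages of the five-language comparison
}

def accuracy(items) -> float:
    """Ask the model each problem and score by substring match on the answer."""
    correct = 0
    for question, answer in items:
        response = client.chat.completions.create(
            model="gpt-4",  # placeholder; the abstract only says "GPT"
            messages=[{"role": "user", "content": question}],
        )
        if answer in response.choices[0].message.content:
            correct += 1
    return correct / len(items)

for lang, items in tasks.items():
    print(f"{lang}: accuracy = {accuracy(items):.2f}")
```

In practice one would use parallel translations of the same benchmark problems in every language and a more robust answer extractor than substring matching, so that accuracy differences reflect the language of the prompt rather than the scoring heuristic.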