Acquiring Generalizable Reasoning Ability through a Large-Scale Logical Corpus with Inductively Diverse Examples

Terufumi Morishita; Atsuki Yamaguchi; Gaku Morio; Osamu Imaichi; Yasuhiro Sogawa

[3Xin2-64] Acquiring Generalizable Reasoning Ability through a Large-Scale Logical Corpus with Inductively Diverse Examples

〇Terufumi Morishita¹, Atsuki Yamaguchi², Gaku Morio¹, Osamu Imaichi¹, Yasuhiro Sogawa¹ (1. Research & Development Group, Hitachi, Ltd., 2. The University of Sheffield)

Keywords: language model, benchmark, logical reasoning, generative AI, corpus

We propose FLD×2 (Formal Logic Deduction Diverse), a large-scale synthetic corpus designed to enhance the generalizable logical reasoning ability of large language models (LLMs). Previous studies on synthetic corpora lacked both a foundational principle of corpus design for generalizability and comprehensive empirical validation of that generalizability. To address these issues, we first derive the corpus design from a principle that we propose, namely inductive diversity, which states that a corpus must include samples covering exhaustive patterns of reasoning for both positive and negative cases of reasoning. This enables LLMs to accurately and inductively infer the rules behind the samples. Then, on the basis of this principle, we construct FLD×2, comprising 300k inductively diverse reasoning samples. Finally, we evaluate LLMs trained on FLD×2 using sixteen benchmarks covering various tasks. The results show large performance gains on many tasks. Further, the results suggest that the LLMs can integrate their originally acquired knowledge with newly gained reasoning abilities. We release the code and corpus.

Presentation information

[3Xin2] Poster session 1
