12:00 PM - 12:20 PM
[4O2-OS-29a-01] Evaluation of Large Language Models for RAG-Enhanced Learning System of Ancient Egyptian
Keywords: Ancient Egyptian & Coptic, Retrieval-Augmented Generation, Educational Application, Large Language Model, Chatbot
This research presents the development and evaluation of an interactive system integrating Retrieval-Augmented Generation (RAG) and Large Language Models (LLMs) to facilitate the learning of Ancient Egyptian, particularly Middle Egyptian. The study evaluated the translation performance of various LLMs, including Claude 3.5 Sonnet, Gemini 2.0, DeepSeek R1, and multiple GPT models, on Middle Egyptian texts in Latin transliteration. Analysis using evaluation metrics such as BLEU, SacreBLEU, METEOR, and ROUGE revealed that Claude 3.5 Sonnet achieved the highest overall score, followed by Gemini 2.0 Pro Experimental. Based on these results, we developed "THOTH AI," a web-based interactive application with OCR functionality on the Dify platform using Claude 3.5 Sonnet's API, implementing RAG with vectorized Ancient Egyptian vocabulary data, translation pairs, and grammatical information. Through this web application development case study, we explore the potential applications of AI in Ancient Egyptian e-learning and digital humanities.
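As a rough illustration of the retrieval step in a RAG pipeline like the one described above, the following minimal sketch ranks glossary entries against a transliterated query and assembles a prompt from the top matches. It is not the authors' implementation: the abstract states that THOTH AI runs on Dify with vectorized data, whereas this sketch substitutes character-bigram cosine similarity for learned embeddings. The glossary contents, the function names (`char_ngrams`, `cosine`, `retrieve`, `build_prompt`), and the prompt wording are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy glossary in Manuel de Codage-style transliteration.
# Illustrative only; this is not the vocabulary data used by the authors.
GLOSSARY = [
    {"translit": "nfr", "gloss": "good, beautiful"},
    {"translit": "pr", "gloss": "house"},
    {"translit": "nTr", "gloss": "god"},
    {"translit": "sDm", "gloss": "to hear"},
]


def char_ngrams(text, n=2):
    """Count character n-grams of a space-padded string (a stand-in for an embedding)."""
    padded = f" {text} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, k=2):
    """Return the k glossary entries most similar to the query."""
    query_vec = char_ngrams(query)
    ranked = sorted(
        GLOSSARY,
        key=lambda entry: cosine(query_vec, char_ngrams(entry["translit"])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, k=2):
    """Prepend the retrieved entries to a translation request for the LLM."""
    context = "\n".join(
        f"- {entry['translit']}: {entry['gloss']}" for entry in retrieve(query, k)
    )
    return f"Reference entries:\n{context}\n\nTranslate into English: {query}"
```

Calling `build_prompt("nfr pr")` yields a prompt string that could then be sent to whichever LLM is being evaluated; a production system would replace the bigram vectors with a real embedding model and a vector store.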