9:40 AM - 10:00 AM
[3A1-GS-10-03] Research on Agent Evaluation Methods in LLM-based Multi-Agent Systems
Keywords: AI Agent, Large Language Models, Multi-Agent, LLM Agent, Human-AI Interaction
Multi-agent systems leveraging Large Language Models (LLMs) represent a paradigm in which multiple AI agents collaborate or compete to accomplish complex tasks. These systems have been explored for a wide range of applications, including improving question-answering accuracy, simulating real-world interactions, and enhancing the efficiency of software development. However, methods for evaluating the effectiveness of individual agents within multi-agent systems remain underexplored. In this study, we used Nomatica, an LLM-based multi-agent system we developed, to investigate methods for assessing the effectiveness of individual agents in tasks such as free discussion, ideation, and review sessions. The evaluation focused in particular on the utility of RAG (Retrieval-Augmented Generation) agents. The results demonstrated that employing RAG agents and task-specific agents improves overall system performance, providing valuable insights for the development of agent evaluation methodologies.
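To illustrate the general RAG pattern the abstract refers to, the sketch below shows a single retrieve-then-generate turn. This is a minimal illustration only, assuming a toy keyword-overlap retriever and a hypothetical document list; it is not the Nomatica implementation, whose agents and retriever are not described here.

```python
# Minimal sketch of one RAG agent turn: retrieve relevant documents,
# then assemble an augmented prompt for the LLM. The retriever and
# documents below are illustrative assumptions, not from the paper.

def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query and return the top k."""
    q = set(query.lower().split())
    return sorted(
        documents,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend the retrieved context to the user query before the LLM call."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{ctx}\n\nQuestion: {query}"

# Hypothetical knowledge base for demonstration.
docs = [
    "Agent evaluation measures each agent's contribution to task outcomes.",
    "RAG agents ground their answers in retrieved documents.",
    "Ideation sessions benefit from diverse agent roles.",
]

query = "How do RAG agents ground answers?"
prompt = build_prompt(query, retrieve(query, docs))
# The prompt would then be sent to an LLM for the generation step.
```

In a full multi-agent setting, each agent would run a loop like this over its own knowledge store, which is what makes per-agent evaluation (e.g. with and without the RAG agent) possible.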