JSAI2025

Presentation information

General Session

General Session » GS-11 AI and Society

[4I2-GS-11] AI and Society:

Fri. May 30, 2025 12:00 PM - 1:40 PM Room I (Room 1004)

座長:廣中 詩織(京都大学)

12:40 PM - 1:00 PM

[4I2-GS-11-03] Addressing Prompt Injection through Iterative Interactions among LLM Agents

〇Go Sato1, Ryohei Orihara1, Yasuyuki Tahara1, Akihiko Ohsuga1, Yuichi Sei1 (1. The University of Electro-Communications)

Keywords:Large Language Model, Multi Agent, AI Ethics

In recent years, as the demand for Large Language Models (LLMs) has increased, prompt injection attacks have become a serious security concern. Numerous studies have been conducted to resolve this problem. However, the lack of datasets and the growing variety of attack methods have led to a decline in generalizability. To address this issue, in this study, we construct two teams of multiple LLM agents: one for generating prompts that induce prompt injection and another for evaluating their harmfulness. Through iterative prompt generation and evaluation between these teams, we aim to develop countermeasures against a diverse range of attacks. As a result, our approach demonstrated higher accuracy in evaluating prompt harmfulness compared to the baseline model.

Authentication for paper PDF access
A password is required to view paper PDFs. If you are a registered participant, please log on the site from Participant Log In.
You could view the PDF with entering the PDF viewing password bellow.

Password