Application of Large Language Models in Capture The Flag

Tatsuya Satake

12:00 PM - 12:20 PM

[4N2-GS-10-01] Application of Large Language Models in Capture The Flag

〇Tatsuya Satake¹, Akira Otsuka¹ (1. Institute of Information Security)

Keywords: Large Language Model, ChatGPT, Cyber Security, Capture The Flag, Autonomous Cyber Reasoning System

In recent years, large language models (LLMs) such as GPT-3 have achieved remarkable success in natural language processing. This success is attributed to their ability to handle a wide variety of natural language processing tasks, which comes from having a huge number of parameters and being trained on large amounts of text through autoregressive learning. In addition, recent research has applied LLMs to difficult examinations such as the bar exam and the medical licensing exam, where they have achieved results exceeding the average human score, and LLMs are expected to be applied in fields that require advanced knowledge. We expect that LLMs could also be applied to the cybersecurity field. This study aims to explore the potential of autonomous cyber reasoning systems through the use of LLMs in Capture The Flag (CTF). Specifically, we used ChatGPT, developed by OpenAI, to solve the picoCTF 2022 problems, and evaluated the outcomes with three unique metrics: Zero-round, Few-rounds, and Failure. As a result, ChatGPT succeeded in obtaining 48 flags out of a total of 64 questions, ranking 575th out of 7,794 total participants.
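
To put the reported figures in proportion, the following is a minimal sketch that converts them into a solve rate and a relative ranking. The numbers are taken from the abstract; the function names `solve_rate` and `top_fraction` are hypothetical helpers introduced here for illustration and are not part of the authors' method.

```python
# Figures reported in the abstract.
FLAGS_OBTAINED = 48
TOTAL_QUESTIONS = 64
RANK = 575
TOTAL_PARTICIPANTS = 7794


def solve_rate(flags: int, questions: int) -> float:
    """Fraction of questions for which a flag was obtained (hypothetical helper)."""
    return flags / questions


def top_fraction(rank: int, participants: int) -> float:
    """Rank expressed as a fraction of all participants (hypothetical helper)."""
    return rank / participants


if __name__ == "__main__":
    print(f"Solve rate: {solve_rate(FLAGS_OBTAINED, TOTAL_QUESTIONS):.1%}")
    print(f"Rank as share of participants: {top_fraction(RANK, TOTAL_PARTICIPANTS):.1%}")
```

Under these figures the solve rate is 75.0%, and a rank of 575 out of 7,794 falls within roughly the top 7.4% of participants.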

Presentation information

[4N2-GS-10] AI application
