[2Win5-93] Exploring LLM-based methods to identify harmful content without direct viewing
Keywords: Content Moderation, LLM, Safety, Blockchain, Harmful Content
Recognizing and promptly addressing harmful content on online services is a critical mission for content moderators and cybersecurity professionals seeking to protect users. However, content that must be viewed directly to be assessed, such as images and videos, is known to place a significant psychological burden on the people who review it. The challenge, therefore, is to handle harmful content in a way that protects not only users but also those involved in content monitoring. This study focuses on the linguistic interpretation of content using Large Language Models (LLMs) and evaluates the effectiveness of a text-based judgment mechanism for assessing the harmfulness of content that is difficult to view directly. As our target we use Bitcoin, a well-known blockchain service whose data is publicly available and verifiable and is known to contain harmful content. By interpreting image data embedded on the blockchain through LLM-generated textual descriptions and comparing the results against prior research that investigated the embedded data, we demonstrate the effectiveness of the proposed mechanism.
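The mechanism described above has two stages: a vision-capable LLM converts an embedded image into a textual description, and a text-only judgment is then applied to that description. The following is a minimal sketch of such a pipeline, assuming the OpenAI Python SDK as the LLM backend; the model names, the prompts, and the helper `extract_embedded_image` (which would parse image bytes out of a Bitcoin transaction) are illustrative assumptions, not details from the paper.

```python
# Sketch of a two-stage "describe, then judge" pipeline for embedded images.
# Assumptions: OpenAI Python SDK as the backend; extract_embedded_image is a
# hypothetical placeholder for blockchain data extraction.

import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def extract_embedded_image(txid: str) -> bytes:
    """Hypothetical helper: fetch raw image bytes embedded in a Bitcoin
    transaction, e.g. by parsing OP_RETURN outputs or witness data via a
    node RPC or a public block explorer. Implementation is out of scope."""
    raise NotImplementedError


def describe_image(image_bytes: bytes) -> str:
    """Stage 1: a vision-capable LLM produces a neutral textual description,
    so no human reviewer has to view the image directly."""
    b64 = base64.b64encode(image_bytes).decode()
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumption: any vision-capable model would do
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Describe this image factually in one paragraph."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content


def judge_description(description: str) -> str:
    """Stage 2: a text-only harmfulness judgment over the description."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption
        messages=[{
            "role": "user",
            "content": ("Does the following image description indicate "
                        "harmful content? Answer 'harmful' or 'benign' "
                        f"with a brief reason.\n\n{description}"),
        }],
    )
    return resp.choices[0].message.content


def assess_transaction(txid: str) -> str:
    """End to end: embedded image -> description -> text-based verdict."""
    return judge_description(describe_image(extract_embedded_image(txid)))
```

Splitting description from judgment keeps every human-facing output purely textual: only the vision model ever processes the raw image, and the moderator works solely from the generated description and verdict.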