Creating Japanese Virtue Dataset for AI Safety

Masashi Takeshita

10:00 AM - 10:20 AM

[3G1-GS-11-04] Creating Japanese Virtue Dataset for AI Safety

〇Masashi Takeshita¹, Rafal Rzepka¹, Kenji Araki¹ (1. Graduate School of Information Science and Technology, Hokkaido University)

Keywords: Natural Language Processing, AI Safety, Virtue Ethics, AI Alignment

Some AI models, such as large language models (LLMs), are known to generate content that is harmful to humans. AI alignment research aims to ensure that AI models understand human ethics and behave appropriately. However, most of this research is conducted in English, and few studies address Japanese. This study therefore creates a Japanese dataset for AI safety based on virtue ethics, a major position in normative ethics. We construct the dataset using the same method that was used to build the existing English virtue ethics dataset. The resulting dataset contains approximately 20,000 cases. Using it, we evaluate whether an AI model can correctly classify whether a character trait term corresponds to a sentence describing an action. Experiments with existing Japanese LLMs show that these models have difficulty classifying this correspondence correctly. We also compare our dataset with the existing English virtue ethics dataset.
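
The classification task described in the abstract can be pictured with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the dataset's file format, field names, or evaluation metric, so the `VirtueExample` record, the plain per-pair accuracy, and the English toy items are hypothetical stand-ins rather than the authors' actual data or code.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class VirtueExample:
    """Hypothetical record: one (action sentence, trait term) pair."""
    scenario: str  # sentence describing an action
    trait: str     # candidate character trait term
    label: bool    # True if the trait describes the action


def accuracy(
    examples: Iterable[VirtueExample],
    predict: Callable[[str, str], bool],
) -> float:
    """Fraction of pairs for which predict(scenario, trait) equals the label."""
    items = list(examples)
    if not items:
        raise ValueError("no examples to evaluate")
    correct = sum(predict(ex.scenario, ex.trait) == ex.label for ex in items)
    return correct / len(items)


# Made-up toy items, NOT taken from the dataset described above.
toy_examples = [
    VirtueExample("She returned the lost wallet to its owner.", "honest", True),
    VirtueExample("She returned the lost wallet to its owner.", "cowardly", False),
    VirtueExample("He ran away and left his friend behind.", "cowardly", True),
    VirtueExample("He ran away and left his friend behind.", "honest", False),
]


def always_yes(scenario: str, trait: str) -> bool:
    """Trivial baseline that claims every trait matches every sentence."""
    return True


baseline_accuracy = accuracy(toy_examples, always_yes)
```

In an actual experiment, `predict` would wrap a Japanese LLM's answer for each pair; the trivial baseline here only shows how the scoring would be wired up.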


Presentation information

[3G1-GS-11] AI and Society:

