
Presentation information

Organized Session


[3K1-OS-2a] OS-2

Thu. May 30, 2024 9:00 AM - 10:40 AM Room K (Room 44)

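Organizers: Kenji Suzuki (Sony Group Corporation), Satoshi Hara (Osaka University), Hitomi Yanaka (The University of Tokyo), Saku Sugawara (National Institute of Informatics)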

10:20 AM - 10:40 AM

[3K1-OS-2a-04] Laws and regulations that are problematic when building, using, and publishing datasets in generative AI development and how to clear them

focusing on copyright law and personal information protection law

〇Taichi Kakinuma (STORIA LAW OFFICE)


Keywords: Copyright Act, Personal Data Protection Law, Law, Dataset

In this paper, we discuss the legal challenges that arise when constructing, using, and releasing datasets for generative AI development, and how to resolve them, focusing on copyright law and personal information protection law. We examine data collection by data type: text, images, and audio. Under Japanese copyright law, collecting and reproducing works for datasets is generally allowed for both academia and private companies. However, large-scale collection of text data from the web is under increasing scrutiny, so careful consideration is advised. Under Japan's Personal Information Protection Law, collecting personal data is generally permissible, provided that the data is not acquired wrongfully and that the purpose of use is specified and publicized. While there are no significant legal barriers to acquiring personal data for AI datasets, special attention is needed when releasing datasets that contain sensitive personal information.

