(OS invited talk) Towards construction and use of safe large language models

NAOAKI OKAZAKI

Presentation information

Organized Session

[3F5-OS-42b] OS-42

Thu. May 29, 2025 3:40 PM - 5:20 PM Room F (Room 1001)

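Organizers: Masahiro Kaneko (MBZUAI), Takeshi Kojima (The University of Tokyo), Masaru Isonuma (The University of Edinburgh / The University of Tokyo), Ayana Niwa (MBZUAI), Daisuke Oba (ELYZA / Institute of Science Tokyo), Akiko Murakami (AI Safety Institute), Satoshi Sekine (National Institute of Informatics), Masao Utiyama (National Institute of Information and Communications Technology), Danushka Bollegala (The University of Liverpool / Amazon)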

4:40 PM - 5:20 PM

[3F5-OS-42b-04] (OS invited talk) Towards construction and use of safe large language models

〇NAOAKI OKAZAKI¹, Masahiro Kaneko² (1. Institute of Science Tokyo, 2. MBZUAI)

Keywords: Large Language Model, safety, bias

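This talk presents work on safety in building large language models (LLMs), such as strengthening safety through instruction tuning on synthetic data. It also introduces research on safety in using LLMs: measuring bias in LLMs (including across languages), removing bias through self-improvement, membership inference attacks and measures to avoid them, and improving the robustness of LLM detection (identifying whether a text was generated by an LLM).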
