JSAI2022

Presentation information

Organized Session

[3G4-OS-15b] Data Mining and Machine Learning for Movement Sequences (2/2)

Thu. Jun 16, 2022 2:50 PM - 5:10 PM Room G

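Organizers: Keisuke Fujii (Nagoya University) [on-site], Koh Takeuchi (Kyoto University), Takuya Oki (Tokyo Institute of Technology), Ryo Nishida (Tohoku University), Yasuo Tabei (RIKEN), Takuya Maekawa (Osaka University)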

2:50 PM - 3:10 PM

[3G4-OS-15b-01] Multi-objective Deep Reinforcement Learning for Crowd Guidance Policy Optimization

〇Ryo Nishida1,2, Yuki Tanigaki2, Masaki Onishi2, Koichi Hashimoto1 (1. Tohoku University, 2. AIST)

Keywords: Deep Reinforcement Learning, Multi-objective Optimization, Crowd movement control

The objective of this study is to improve Multi-Objective Deep Reinforcement Learning (MODRL) for optimizing crowd guidance strategies. In general, MODRL methods are classified into Outer-loop methods and Inner-loop methods. In the former, multiple objective functions are transformed into a single objective using a scalarization function, and the Pareto front, which is the set of optimal solutions, is obtained by repeatedly updating the weights of the scalarization function and performing single-objective optimization. However, if the computational cost of each single-objective optimization is high, the overall computational cost of this approach increases in proportion to the number of weight updates. The latter, the Inner-loop method, is instead designed to learn the Pareto front within a learning process. In this study, we examine how well different action selection methods of Pareto-DQN, a typical Inner-loop method, approximate the Pareto solutions. In the experiments, we evaluate the proposed method on a benchmark problem and finally discuss its application to the optimization of crowd guidance strategies.
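
The Outer-loop procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a linear scalarization function and a small hypothetical set of candidate policies whose two objective values (both maximized) are already known, so that a simple `max` stands in for each costly single-objective reinforcement learning run. The names `CANDIDATES`, `dominates`, `pareto_front`, and `outer_loop` are illustrative only.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (objective 1, objective 2), both to be maximized

# Hypothetical objective values of seven candidate policies (illustrative data).
CANDIDATES: List[Point] = [
    (1.0, 9.0), (4.0, 8.0), (7.0, 5.0), (9.0, 1.0),  # non-dominated
    (2.0, 2.0), (4.0, 4.0), (6.0, 3.0),              # dominated
]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def outer_loop(candidates: List[Point], n_weights: int) -> List[Point]:
    """Sweep the scalarization weight and solve one single-objective problem per weight.

    Assumption: linear scalarization w * f1 + (1 - w) * f2. The max() call is a
    stand-in for a full single-objective optimization, so the total cost grows
    in proportion to n_weights.
    """
    found = set()
    for i in range(n_weights):
        w = i / (n_weights - 1)  # weights from 0.0 to 1.0 inclusive
        best = max(candidates, key=lambda p: w * p[0] + (1 - w) * p[1])
        found.add(best)
    return sorted(found)


if __name__ == "__main__":
    print("Pareto front:", sorted(pareto_front(CANDIDATES)))
    print("Outer-loop solutions:", outer_loop(CANDIDATES, n_weights=11))
```

In this toy setting every weight costs one call to `max`, whereas in MODRL each weight would cost one full training run, which is the cost the abstract attributes to the Outer-loop method.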
