Analysis of time discount in photonic reinforcement learning

Honoka Shiratori; Takashi Urushibara; Nicolas Chauvet; Satoshi Sunada; Kazutaka Kanno; Atsushi Uchida; Ryoichi Horisaki; Makoto Naruse

1:30 PM - 1:45 PM

[12p-S101-2] Analysis of time discount in photonic reinforcement learning

Honoka Shiratori¹, 〇Takashi Urushibara², Nicolas Chauvet¹,², Satoshi Sunada³, Kazutaka Kanno⁴, Atsushi Uchida⁴, Ryoichi Horisaki¹,², Makoto Naruse¹,² (1.Faculty of Eng., Univ. Tokyo, 2.Grad. School of Info. Sci. & Technol., Univ., 3.Kanazawa Univ., 4.Saitama Univ.)

Keywords: Laser chaos, Reinforcement learning, Time discount

In a previous study, we proposed a method for multi-state reinforcement learning that uses chaotic laser time series, and demonstrated its speed and effectiveness in comparison with conventional Q-learning on the cart-pole balancing task. In that work, we applied an exponential time discount to the penalty for failure of the cart-pole. In this study, we explore other functional forms for the time discount and find that, in terms of learning speed, the form of the function matters little; what is more critical is the time at which the discount value converges to 0.
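As a rough illustration of what "functional form" and "timing of convergence to 0" could mean here, the sketch below defines three discount shapes that all reach 0 at the same point. The abstract does not specify any of this: the function names, the `horizon` parameter (number of steps before failure at which the discount becomes 0), the truncation of the exponential at `horizon`, and the `eps` floor are assumptions made purely for illustration.

```python
def exponential_discount(k, horizon, eps=1e-3):
    """Exponential decay gamma**k, truncated to 0 from k = horizon onward.

    Assumption: gamma is chosen so the untruncated curve would equal
    eps at k = horizon; a pure exponential never reaches 0 exactly.
    """
    gamma = eps ** (1.0 / horizon)
    return gamma ** k if k < horizon else 0.0


def linear_discount(k, horizon):
    """Straight-line decay from 1 at k = 0 to 0 at k = horizon."""
    return max(0.0, 1.0 - k / horizon)


def step_discount(k, horizon):
    """Constant weight 1 until k = horizon, then 0."""
    return 1.0 if k < horizon else 0.0


def penalties(discount, horizon, n_steps, penalty=-1.0):
    """Penalty credited to the action taken k steps before the failure.

    k = 0 is the action immediately preceding the failure.
    """
    return [penalty * discount(k, horizon) for k in range(n_steps)]


if __name__ == "__main__":
    # Different shapes, same horizon: all three are 0 for k >= 4.
    for fn in (exponential_discount, linear_discount, step_discount):
        print(fn.__name__, penalties(fn, horizon=4, n_steps=6))
```

With a shared `horizon`, the three functions differ only in how the penalty is spread over the last `horizon` actions before a failure; changing `horizon` instead changes how far back the penalty reaches, which is the quantity the abstract identifies as more critical for learning speed.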

Presentation information

[12p-S101-1~15] FS.1 Focused Session "AI Electronics"
