JSAI2022

Presentation information

General Session

[2C4-GS-2] Machine learning: reinforcement learning (1)

Wed. Jun 15, 2022 1:20 PM - 3:00 PM Room C (Room C-2)

Chair: 谷本 啓 (NEC) [on-site]

2:20 PM - 2:40 PM

[2C4-GS-2-04] Dynamic estimation of optimal aspiration level in Stochastic Risk-sensitive Satisficing

〇Jun Kume1, Hiroki Suzuki1, Toshikatsu Kato2, Yu Kono1, Tatsuji Takahashi1 (1. School of Science and Engineering, Tokyo Denki University, 2. Graduate School of Tokyo Denki University)

Keywords: Reinforcement Learning, Machine Learning, Bandit Problem, Satisficing

Artificial intelligence technology has historically been developed by imitating certain neurophysiological and cognitive properties. Although humans appear irrational, they can search quickly and congruently under limited information. We believe that cognitive satisficing underlies this quick search, and we have developed an algorithm, Risk-sensitive Satisficing (RS), that can be applied to search in unknown environments such as those in reinforcement learning. Because RS is a deterministic search, it is not robust to environmental noise and is difficult to combine with algorithms that use probability distributions. To address these difficulties, Stochastic Risk-sensitive Satisficing (SRS), which expresses the search ratio inherent in RS as a probability distribution, was devised. However, it remains open whether SRS retains the favorable characteristics that RS showed in many cases. In this study, we examined the definition of congruence, one of the tasks of satisficing strategies. In short, we showed that the optimal aspiration reward level can be estimated dynamically in SRS for the bandit problem, making it possible to achieve both a quick search for congruent means and optimization.
