JSAI2018

Presentation information

Oral presentation

[1N1] [General Session] 2. Machine Learning

Tue. Jun 5, 2018 1:20 PM - 3:00 PM Room N (2F Sakurajima)

Chair: Satoshi Hara (Osaka University)

2:20 PM - 2:40 PM

[1N1-04] Analysis of cognitive satisficing value function

Guaranteed satisficing and finite regret

〇Akihiro Tamatsukuri1, Tatsuji Takahashi2 (1. Graduate School of Tokyo Denki University, 2. Tokyo Denki University)

Keywords: satisficing, bandit problems, cognitively inspired computing

As reinforcement learning domains become more complicated and realistic, standard optimization algorithms may not work well. In this paper we introduce a simple mathematical model called RS (reference satisficing), which implements a satisficing strategy that looks for actions with values above the aspiration level. We apply it to K-armed bandit problems. We show theoretically that, if there are actions with values above the aspiration level, RS is guaranteed to find them. Moreover, if the aspiration level is set to an "optimal level" at which satisficing practically amounts to optimizing, we prove that the regret (the expected loss) is upper bounded by a finite value. We confirm these results by simulations and clarify the effectiveness of RS through comparison with other algorithms.
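
To make the idea of satisficing on a K-armed bandit concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the RS value function, so the sketch assumes one of the form (n_i / N) * (mean_i - aspiration), where n_i is the number of pulls of arm i and N is the total number of pulls. The function names, the Bernoulli arms, and the initial round of one pull per arm are all hypothetical choices made for illustration.

```python
import random


def rs_values(counts, means, aspiration):
    """Assumed satisficing value per arm: (n_i / N) * (mean_i - aspiration).

    This form is an assumption for illustration; the abstract does not give it.
    """
    total = sum(counts)
    return [(n / total) * (m - aspiration) for n, m in zip(counts, means)]


def run_rs_bandit(arm_probs, aspiration, steps, seed=0):
    """Simulate a Bernoulli K-armed bandit under the assumed RS rule.

    Returns (pull counts, empirical means, cumulative expected regret).
    """
    rng = random.Random(seed)
    k = len(arm_probs)
    counts = [0] * k
    means = [0.0] * k
    best = max(arm_probs)
    regret = 0.0
    for t in range(steps):
        if t < k:
            # Hypothetical initialization: pull every arm once.
            arm = t
        else:
            values = rs_values(counts, means, aspiration)
            # Greedy choice with respect to the satisficing value.
            arm = max(range(k), key=lambda i: values[i])
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
        # Expected regret: gap between the best arm's mean and the chosen arm's mean.
        regret += best - arm_probs[arm]
    return counts, means, regret


if __name__ == "__main__":
    # Aspiration level placed between the best and second-best arm means.
    counts, means, regret = run_rs_bandit([0.2, 0.5, 0.8], aspiration=0.65, steps=2000)
    print("pull counts:", counts)
    print("cumulative expected regret:", round(regret, 2))
```

Under this assumed rule, an arm whose empirical mean exceeds the aspiration level has a positive value that grows with its share of pulls, which favors exploiting it. Arms below the aspiration level have negative values that shrink toward zero the less they are tried, which favors exploring them while no arm looks satisfactory.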