2:20 PM - 2:40 PM
[1N1-04] Analysis of cognitive satisficing value function
Guaranteed satisficing and finite regret
Keywords: satisficing, bandit problems, cognitively inspired computing
As the domains of reinforcement learning become more complicated and realistic, standard optimization algorithms may not work well. In this paper we introduce a simple mathematical model called RS (reference satisficing) that implements a satisficing strategy: it searches for actions whose values exceed an aspiration level. We apply RS to K-armed bandit problems. We prove theoretically that if actions with values above the aspiration level exist, RS is guaranteed to find them. Moreover, if the aspiration level is set to an "optimal level," so that satisficing practically amounts to optimizing, we prove that the regret (the expected loss) is bounded above by a finite value. We confirm these results through simulations and clarify the effectiveness of RS by comparison with other algorithms.
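To illustrate the idea described above, the following is a minimal sketch of a satisficing strategy for a Bernoulli K-armed bandit. The abstract does not give the RS value function itself, so the particular form used here, RS_i = (n_i / N)(E_i − ℵ), where E_i is the empirical mean of arm i, n_i its pull count, N the total pull count, and ℵ the aspiration level, is an assumption for illustration, not necessarily the authors' exact definition.

```python
import random

def rs_bandit(arm_means, aleph, steps, seed=0):
    """Satisficing bandit sketch (assumed RS form, not the paper's exact rule).

    Each step picks the arm maximizing (n_i / N) * (E_i - aleph):
    arms estimated above the aspiration level aleph get a positive value
    that grows with experience (exploitation), while among arms below
    aleph the least-tried one is least negative (exploration).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k

    def pull(i):
        # Bernoulli reward with success probability arm_means[i]
        r = 1.0 if rng.random() < arm_means[i] else 0.0
        counts[i] += 1
        sums[i] += r

    for i in range(k):  # initialize: try every arm once
        pull(i)
    for _ in range(steps - k):
        total = sum(counts)
        # assumed RS value: reliability-weighted surplus over aspiration
        rs = [(counts[i] / total) * (sums[i] / counts[i] - aleph)
              for i in range(k)]
        pull(max(range(k), key=lambda i: rs[i]))
    return counts
```

For example, with arms of means 0.2 and 0.8 and an aspiration level of 0.5, the strategy should concentrate its pulls on the second arm, whose value exceeds the aspiration level, consistent with the guaranteed-satisficing claim.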