Adaptability of Cognitive Satisficing Algorithm in Nonstationary Environments

Yuto Hanayasu

2:00 PM - 2:20 PM

[2H3-J-2-03] Adaptability of Cognitive Satisficing Algorithm in Nonstationary Environments

〇Yuto Hanayasu¹, Kenshi Saito², Yuki Yoshii¹, Yu Kono¹, Tatsuji Takahashi¹ (1. School of Science and Engineering, Tokyo Denki University, 2. Graduate School of Tokyo Denki University)

Keywords: multi-armed bandit, satisficing, nonstationary

The environment in which an agent performs trial-and-error learning is generally nonstationary because of unobservable information and various kinds of fluctuations. To make effective decisions in such an environment, the agent has to gradually or abruptly discard old information and put more weight on newer information, because some elements of the environment may have changed. As a result, the agent needs to choose a better option with a smaller amount of information. We focus on the risk-sensitive satisficing (RS) algorithm, which models the decision-making strategy of humans and animals. We compare its performance with that of other representative algorithms in stationary and nonstationary bandit problems. We propose variants of RS that incorporate existing ideas for adapting to nonstationary bandits, such as meta-bandit and discounted update.
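To make the idea of putting more weight on newer information concrete, the following is a minimal sketch of a discounted reward estimate for a single bandit arm. It is not the authors' implementation and it does not include the RS algorithm itself. The class name `DiscountedArmEstimate`, the helper `simulate`, the discount factor `gamma`, and the reward probabilities 0.8 and 0.2 are all hypothetical choices made for illustration.

```python
import random


class DiscountedArmEstimate:
    """Discounted running mean of rewards for one arm (illustrative only)."""

    def __init__(self, gamma=0.99):
        self.gamma = gamma        # assumed discount factor in (0, 1]; 1.0 means no forgetting
        self.weighted_sum = 0.0   # discounted sum of observed rewards
        self.weight = 0.0         # discounted number of observations

    def update(self, reward):
        # Shrink the old statistics by gamma, then add the new observation.
        self.weighted_sum = self.gamma * self.weighted_sum + reward
        self.weight = self.gamma * self.weight + 1.0

    @property
    def mean(self):
        # Return 0.0 before any observation to avoid division by zero.
        return self.weighted_sum / self.weight if self.weight > 0 else 0.0


def simulate(gamma, steps=2000, change_at=1000, seed=0):
    """Pull one Bernoulli arm whose reward probability drops abruptly.

    The probabilities 0.8 (before) and 0.2 (after) are assumed values.
    """
    rng = random.Random(seed)
    estimate = DiscountedArmEstimate(gamma)
    for t in range(steps):
        p = 0.8 if t < change_at else 0.2
        estimate.update(1.0 if rng.random() < p else 0.0)
    return estimate.mean


if __name__ == "__main__":
    print("no discount  :", simulate(gamma=1.0))
    print("with discount:", simulate(gamma=0.99))
```

With `gamma=1.0` the estimate is the plain sample mean over all pulls, so it stays anchored to rewards observed before the change. With `gamma` below 1 the older rewards fade geometrically and the estimate tracks the arm's current reward probability more closely, at the cost of higher variance because fewer observations effectively contribute.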

Presentation information

[2H3-J-2] Machine learning: selective preprocess
