4:10 PM - 4:30 PM
[2I5-GS-2-02] Function approximation of Cognitive Satisficing Value Function
Keywords: reinforcement learning, contextual bandit problem, decision making
Humans show a decision-making tendency called satisficing: they stop exploring once they find an option that exceeds a criterion (the aspiration level). The Risk-sensitive Satisficing (RS) model is a value function that enables efficient non-random exploration and realizes satisficing in reinforcement learning (Tamatsukuri & Takahashi, 2019). To apply RS to continuous state spaces, we extended it to Linear RS (LinRS), which uses function approximation, and tested its performance on contextual bandit problems. LinRS performed better than existing algorithms in probabilistic environments. We also found that the aspiration level needs to be corrected to account for approximation error.
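As a rough illustration of the satisficing idea described above, the following minimal sketch computes a tabular RS value per action and selects greedily on it. The formula used, RS(a) = (n_a / N) * (E_a - aspiration), is the form commonly given for RS in the literature and is an assumption here, since the abstract does not state it. The function names `rs_values` and `select_action` are hypothetical, and this is not the LinRS algorithm itself, which replaces these tabular quantities with linear approximations.

```python
import numpy as np


def rs_values(mean_rewards, counts, aspiration):
    """Tabular RS value per action: (n_a / N) * (E_a - aspiration).

    Assumed form of the RS value function; not taken from the abstract.
    """
    mean_rewards = np.asarray(mean_rewards, dtype=float)
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    # With no trials yet the trial ratio is undefined; treat all actions equally.
    if total == 0:
        return np.zeros_like(mean_rewards)
    return (counts / total) * (mean_rewards - aspiration)


def select_action(mean_rewards, counts, aspiration):
    """Pick the action with the largest RS value (greedy, no randomness)."""
    return int(np.argmax(rs_values(mean_rewards, counts, aspiration)))


# Hypothetical estimates for a two-armed bandit.
means = [0.4, 0.6]
counts = [10, 90]

# Arm 1 exceeds the aspiration level 0.5, so the agent keeps exploiting it.
satisfied_choice = select_action(means, counts, aspiration=0.5)

# No arm reaches 0.7, so the less-tried arm 0 is preferred (exploration).
unsatisfied_choice = select_action(means, counts, aspiration=0.7)
```

In this sketch the sign of `E_a - aspiration` switches the agent between exploiting a well-tried satisfactory action and exploring under-tried ones, which is why an aspiration level that is off by the approximation error can change the behavior.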