JSAI2019

Presentation information

General Session

[3K4-J-2] Machine learning: real world interaction

Thu. Jun 6, 2019 3:50 PM - 5:30 PM Room K (201A Medium meeting room)

Chair: Daiki Kimura, Reviewer: Hikaru Kajino

5:10 PM - 5:30 PM

[3K4-J-2-05] Linear function approximation of Cognitive Satisficing Function

To Cope with Contextual-bandit Problem

〇Yu Kono1,2 (1. Tokyo Denki University, 2. DeNA, Co., Ltd.)

Keywords: Reinforcement Learning, Contextual-bandit, Decision-making

Both recommendation and the foraging behavior of animals aim to maximize rewards through trial and error. Maximizing reward, however, is difficult in the real world, which is extremely complex. Decision-making agents are therefore thought to prioritize whether or not a specific goal is achieved, and to aim to reach a desired level with as little information as possible. This decision-making tendency of intelligent living beings is called "satisficing". This paper focuses on the RS algorithm, which makes choices by satisficing, and devises LinRS, an adaptation of RS to linear function approximation, so that it applies to a wider range of problems. As a result, RS can now cope with the contextual-bandit problem, which has applications such as advertisement delivery. LinRS was also compared in simulation with well-known existing selection algorithms. The linear function approximation realized by LinRS in this study is a first step toward applying a fast and efficient search algorithm, based on RS and its emphasis on goal achievement, to deep reinforcement learning.
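The abstract does not state the RS value function, so the following is only a minimal sketch of one satisficing choice rule on a non-contextual Bernoulli bandit. It assumes a value of the form (n_i / N) × (mean reward of arm i − aspiration level), a form often associated with RS in related work. The names `rs_values`, `run_bandit`, and `aspiration` are hypothetical and not taken from the paper. The contextual LinRS variant is not sketched, because the abstract gives no details of how the linear approximation is built.

```python
import random


def rs_values(counts, means, aspiration):
    """Assumed satisficing value per arm: trial ratio * (mean reward - aspiration)."""
    total = sum(counts)
    if total == 0:
        # No pulls yet: every arm is equally (un)attractive.
        return [0.0] * len(counts)
    return [(n / total) * (m - aspiration) for n, m in zip(counts, means)]


def run_bandit(probs, aspiration, steps, seed=0):
    """Simulate a Bernoulli bandit with the assumed satisficing rule.

    probs      -- true success probability of each arm (illustrative values)
    aspiration -- desired reward level the agent tries to reach
    """
    rng = random.Random(seed)
    k = len(probs)
    counts = [0] * k      # number of pulls per arm
    means = [0.0] * k     # running mean reward per arm
    for _ in range(steps):
        values = rs_values(counts, means, aspiration)
        best = max(values)
        # Break ties uniformly at random among the highest-valued arms.
        arm = rng.choice([i for i, v in enumerate(values) if v == best])
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means
```

Under this assumed rule, an untried arm has value 0, so an arm whose mean falls below the aspiration level (negative value) is abandoned in favor of untried arms. An arm whose mean exceeds the aspiration level keeps being selected, which matches the "achieve the desired level with little information" behavior described above.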