A Learning Method for Individual Device Control of Sorting Machine Using Deep Reinforcement Learning

Yoshiaki Nakamura

1:50 PM - 2:10 PM

[2J4-GS-2-01] A Learning Method for Individual Device Control of Sorting Machine Using Deep Reinforcement Learning

〇Yoshiaki Nakamura¹, Shiro Takata¹,², Satoshi Iwasada¹, Naohiro Shioji¹ (1. Bee Co., Ltd., 2. Department of Science & Engineering, Kindai Univ.)

Keywords: deep reinforcement learning, combinatorial optimization, Q-learning

This paper proposes a learning method for controlling a "sorting machine", which accurately distributes products, regardless of type, among multiple devices according to a given standard. In the deep Q-learning used here, the reward function is set so that the smaller the difference between the measured product weight and its target value (the weighing error), the greater the immediate reward. A DQN (Deep Q-Network) estimates the state-action value (Q value), and, contrary to the usual practice, the device corresponding to the smallest Q value output by the DQN is chosen in action selection. The selected device can be regarded as the one with the largest cumulative weighing error, and an operation is performed to reduce the calculation error with this device as the control target. By repeating this deep reinforcement learning, the weighing error of all devices can be reduced, and products can be sorted accurately according to the given standard. The paper presents the learning method and simulation results.
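
The two rules described in the abstract, a reward that grows as the weighing error shrinks and selection of the device with the smallest Q value, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the negative-absolute-error form of the reward, and the example Q values are all assumptions made for illustration, and the list of Q values merely stands in for the output of a trained DQN.

```python
def immediate_reward(measured_weight, target_weight):
    """Assumed reward form: negative absolute weighing error.

    The abstract only states that a smaller weighing error gives a
    larger immediate reward; this is one function with that property.
    """
    return -abs(measured_weight - target_weight)


def select_device(q_values):
    """Return the index of the device with the smallest Q value.

    This mirrors the abstract's reversed selection rule (argmin rather
    than the usual argmax over Q values).
    """
    return min(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical Q values for four devices, standing in for DQN output.
example_q_values = [-0.8, -2.5, -0.1, -1.2]
target_device = select_device(example_q_values)
print("device selected as control target:", target_device)
```

Under the assumed reward, a low Q value corresponds to a large accumulated weighing error, which is why the argmin picks out the device most in need of correction.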


Presentation information

[2J4-GS-2] Machine learning: Deep reinforcement learning
