10:00 AM - 10:20 AM
[3F1-GS-10-04] Improving rewards for vehicle routing using deep reinforcement learning
Keywords: Reinforcement Learning, Combinatorial Optimization, Routing
This paper proposes a new reward function to improve vehicle routing with deep reinforcement learning. The previous method learns the heuristics of the 2-Opt algorithm for the traveling salesman problem, using as its reward the difference between the length of the previous route and the length of the modified route. However, that method fails to learn in some crowded areas, because the difference there is small even when a modification improves the route. We propose applying a square root function to the difference, which increases the reward when the difference is small and reduces it when the difference is large. We confirmed the effectiveness of the proposed method on actual routing problems from a logistics company.
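The abstract does not give the exact form of the reward, so the following Python sketch is only one plausible reading of it, not the authors' implementation. The function names, the use of Euclidean distance, and the sign-preserving treatment of moves that lengthen the route are all assumptions made for illustration. Note that a plain square root enlarges a value only when it is below 1 in the chosen length unit, so the sketch also assumes route lengths are scaled accordingly.

```python
import math


def tour_length(points, tour):
    """Total Euclidean length of the closed tour visiting `points` in `tour` order."""
    n = len(tour)
    return sum(math.dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n))


def two_opt_move(tour, i, j):
    """Return a copy of `tour` with the segment tour[i..j] reversed (one 2-Opt move)."""
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def difference_reward(prev_len, new_len):
    """Baseline reward: how much shorter the modified route is."""
    return prev_len - new_len


def sqrt_reward(prev_len, new_len):
    """Assumed square-root reward: sign(d) * sqrt(|d|), with d = prev_len - new_len.

    Keeping the sign for moves that lengthen the route is an assumption;
    the abstract only describes the behavior for improving moves.
    """
    diff = prev_len - new_len
    return math.copysign(math.sqrt(abs(diff)), diff)


# Hypothetical example: four stops on a unit square, starting from a crossed tour.
points = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
crossed = [0, 2, 1, 3]
uncrossed = two_opt_move(crossed, 1, 2)  # reverses the middle segment

prev_len = tour_length(points, crossed)
new_len = tour_length(points, uncrossed)
print(difference_reward(prev_len, new_len), sqrt_reward(prev_len, new_len))
```

Under these assumptions, an improvement of 0.01 yields a reward of about 0.1 and an improvement of 4 yields a reward of 2, so small gains in dense areas give a stronger learning signal while large gains are damped.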