[3Win5-13] Robust Offline-to-Online Reinforcement Learning against Perturbation in Joint Torque Signals
Keywords: offline reinforcement learning, online fine-tuning, robustness evaluation, adversarial perturbation
Offline reinforcement learning (RL) enables policy learning from pre-collected datasets without environmental interaction. This approach reduces the cost of data collection and mitigates safety risks in robotic control. However, real-world deployment requires robustness to control failures, which remains challenging because offline training involves no exploration. To address this issue, we propose an offline-to-online RL method that improves robustness with minimal online fine-tuning. During fine-tuning, perturbations simulating control component failures, including both random and adversarial perturbations, are applied to the joint torque signals. We conduct experiments using legged robot models in OpenAI Gym. The results demonstrate that offline RL alone does not improve robustness and remains highly vulnerable to such perturbations, whereas our method significantly improves robustness against them.
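The abstract describes injecting random and adversarial perturbations into joint torque signals during fine-tuning. The following is a minimal sketch of how such a perturbation could be applied via a Gym action wrapper; the class name, parameters, and perturbation models are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
import gym


class TorquePerturbationWrapper(gym.ActionWrapper):
    """Perturbs joint torque actions to simulate control component failures.

    Illustrative sketch only: "random" adds uniform noise to the torque
    command, while "adversarial" steps in a worst-case direction supplied by
    an external adversary (e.g., the sign of a critic's gradient with respect
    to the action).
    """

    def __init__(self, env, mode="random", scale=0.1, adversary=None):
        super().__init__(env)
        self.mode = mode            # "random" or "adversarial"
        self.scale = scale          # magnitude relative to the torque range
        self.adversary = adversary  # callable: action -> perturbation direction

    def action(self, action):
        low, high = self.action_space.low, self.action_space.high
        span = high - low
        if self.mode == "random":
            # Random perturbation: additive uniform noise on the torque command.
            noise = np.random.uniform(-1.0, 1.0, size=action.shape) * self.scale * span
            perturbed = action + noise
        elif self.mode == "adversarial" and self.adversary is not None:
            # Adversarial perturbation: move the torque in the direction that
            # the adversary estimates to be most harmful.
            direction = self.adversary(action)
            perturbed = action + self.scale * span * np.sign(direction)
        else:
            perturbed = action
        return np.clip(perturbed, low, high)


# Usage sketch: wrap a legged-robot environment so torques are corrupted
# at every step during online fine-tuning or robustness evaluation.
env = TorquePerturbationWrapper(gym.make("HalfCheetah-v3"), mode="random", scale=0.1)
```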