Ship & Boat (船舶), 2022, Vol. 33, Issue (06): 55-62. DOI: 10.19423/j.cnki.31-1561/u.2022.06.055

• General Design and Structure •

Parameter Adaptive S-Plane Control Algorithm Based on Q-Learning

ZHANG Pei1, LU Mofan1, CHEN Cheng1,2,*, XU Hao1, MIAO Chuan1

  1. Marine Design & Research Institute of China, Shanghai 200011, China;
  2. College of Shipbuilding Engineering, Harbin Engineering University, Harbin 150001, China
  • Received:2022-03-07 Revised:2022-06-26 Online:2022-12-25 Published:2022-12-21
  • Corresponding author: CHEN Cheng (born 1994), male, PhD candidate. Research interests: naval equipment systems, overall ship design and systems engineering, intelligent ships. XU Hao (born 1992), male, PhD, engineer. Research interests: integrated design of unmanned equipment, autonomous control of unmanned equipment. MIAO Chuan (born 1983), male, bachelor's degree, engineer. Research interests: naval architecture and ocean engineering.
  • About the authors: ZHANG Pei (born 1993), female, master's degree, assistant engineer. Research interests: autonomous control of unmanned equipment, swarm control of unmanned equipment. LU Mofan (born 1989), male, bachelor's degree, engineer. Research interests: naval architecture and ocean engineering.

Abstract: The complex dynamics of the unmanned underwater vehicle (UUV) and the changeable marine environment pose great challenges to controller design. In practice, the controller parameters are fixed once manual tuning is complete, so the controller cannot adapt to environmental changes during operation. To address this problem, this paper draws on the idea of adaptive control and proposes a parameter-adaptive S-plane control method based on reinforcement learning, which optimizes and automatically tunes the controller parameters under different environments. The method is trained with the Q-learning algorithm, whose self-learning mechanism searches for the optimal mapping between input states and output actions. Simulation results show that the proposed method adjusts the controller parameters online in real time and achieves good control performance and environmental adaptability.

Key words: unmanned underwater vehicle (UUV), reinforcement learning, S-plane control, parameter adaptation
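
The abstract combines two ingredients: an S-plane controller and a Q-learning agent that retunes the controller gains online. The Python sketch below illustrates how the two could fit together. It is not the authors' implementation: the abstract does not give the state, action, or reward definitions, so the grid discretization, the gain-increment action set `ACTIONS`, the reward `-|e|`, the gain limits, and the first-order toy plant are all assumptions made for illustration. The S-plane law is written in its commonly cited sigmoid form, u = 2 / (1 + exp(-k1*e - k2*de)) - 1.

```python
import math
import random


def s_plane_control(e, de, k1, k2):
    """S-plane law in its common sigmoid form; output lies in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-k1 * e - k2 * de)) - 1.0


# Hypothetical action set: each action nudges (k1, k2) by a fixed increment.
ACTIONS = [(0.0, 0.0), (0.5, 0.0), (-0.5, 0.0), (0.0, 0.5), (0.0, -0.5)]


def discretize(e, de, n_bins=5, e_max=1.0, de_max=1.0):
    """Map error and error rate onto a coarse grid state (assumed design)."""
    def bin_of(x, x_max):
        x = max(-x_max, min(x_max, x))
        return min(n_bins - 1, int((x + x_max) / (2.0 * x_max) * n_bins))
    return bin_of(e, e_max), bin_of(de, de_max)


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Tabular Q-learning: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    best_next = max(q.get((s_next, b), 0.0) for b in range(len(ACTIONS)))
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)
    return q[(s, a)]


def choose_action(q, s, epsilon, rng):
    """Epsilon-greedy selection over the gain-adjustment actions."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    return max(range(len(ACTIONS)), key=lambda a: q.get((s, a), 0.0))


def run_episode(q, rng, target=1.0, steps=200, dt=0.1, epsilon=0.2):
    """Track `target` on a toy plant x' = -0.5*x + 2*u (not a UUV model)."""
    x, k1, k2 = 0.0, 3.0, 1.0
    e, de = target - x, 0.0
    s = discretize(e, de)
    total_abs_error = 0.0
    for _ in range(steps):
        a = choose_action(q, s, epsilon, rng)
        dk1, dk2 = ACTIONS[a]
        # Keep the gains inside assumed bounds so the controller stays sane.
        k1 = min(10.0, max(0.5, k1 + dk1))
        k2 = min(10.0, max(0.0, k2 + dk2))
        u = s_plane_control(e, de, k1, k2)
        x += dt * (-0.5 * x + 2.0 * u)          # explicit Euler step
        e_next = target - x
        de_next = (e_next - e) / dt
        s_next = discretize(e_next, de_next)
        q_update(q, s, a, -abs(e_next), s_next)  # assumed reward: -|error|
        e, de, s = e_next, de_next, s_next
        total_abs_error += abs(e)
    return total_abs_error
```

A possible usage pattern is to share one table across episodes, for example `q = {}` followed by repeated calls to `run_episode(q, random.Random(seed))`, so that the learned state-to-gain-adjustment mapping accumulates. Letting the agent act on gain increments rather than on the control output keeps the S-plane law as the inner loop, which matches the abstract's description of tuning controller parameters rather than replacing the controller.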

CLC number: