Systems Engineering and Electronics ›› 2024, Vol. 46 ›› Issue (9): 3070-3081. doi: 10.12305/j.issn.1001-506X.2024.09.19

• Systems Engineering •

Path planning for unmanned vehicle reconnaissance based on deep Q-network

Yuqi XIA, Yanyan HUANG, Qia CHEN

  1. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2023-09-01 Online: 2024-08-30 Published: 2024-09-12
  • Corresponding author: Yanyan HUANG
  • About the authors: Yuqi XIA (1997—), male, Ph.D. candidate; research interests: robot control
    Yanyan HUANG (1973—), male, professor, Ph.D.; research interests: cooperative planning and control of manned/unmanned systems
    Qia CHEN (2000—), male, master's student; research interests: system modeling and simulation
  • Funding:
    National Natural Science Foundation of China (61374186)

Abstract:

In urban battlefield environments, unmanned reconnaissance vehicles help command centers better understand the situation in target areas, improve decision-making accuracy, and reduce the threat to military operations. At present, unmanned reconnaissance vehicles mostly adopt Ackermann steering geometry, and paths planned by traditional algorithms do not conform to the kinematic model of such vehicles. To address this, the bicycle motion model is combined with a deep Q-network to generate the motion trajectory of the unmanned reconnaissance vehicle in an end-to-end manner. To overcome the slow learning speed and poor generalization of the deep Q-network, a deep Q-network based on experience classification, designed around the training characteristics of neural networks, is proposed together with a state space that has a certain generalization ability. Simulation results show that, compared with traditional path planning algorithms, the path planned by the proposed algorithm better matches the motion trajectory of the unmanned reconnaissance vehicle, and the proposed method improves the vehicle's learning efficiency and generalization ability.
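The abstract couples a kinematic bicycle model with a deep Q-network and an experience-classification replay scheme. The following is a minimal Python sketch of those two ingredients only, under stated assumptions: the class names (BicycleModel, ClassifiedReplay), the wheelbase and time step values, and the reward-sign split are illustrative choices, not the paper's actual implementation.

```python
import math
import random
from collections import deque

# Standard kinematic bicycle model (textbook form); the paper's exact state and
# action definitions are not given in the abstract, so the wheelbase and step
# size here are assumed values.
class BicycleModel:
    def __init__(self, wheelbase=2.5, dt=0.1):
        self.L = wheelbase  # front-to-rear axle distance (m), assumed
        self.dt = dt        # integration step (s), assumed

    def step(self, x, y, theta, v, delta):
        """Advance the pose (x, y, heading theta) under speed v and steering angle delta."""
        x += v * math.cos(theta) * self.dt
        y += v * math.sin(theta) * self.dt
        theta += v / self.L * math.tan(delta) * self.dt
        return x, y, theta

# One plausible reading of "experience classification": keep separate replay
# buffers for different classes of transitions (here split by reward sign) and
# draw a fixed mix of both when training the Q-network.
class ClassifiedReplay:
    def __init__(self, capacity=10000):
        self.positive = deque(maxlen=capacity)  # e.g., transitions that earned reward
        self.ordinary = deque(maxlen=capacity)

    def push(self, transition, reward):
        (self.positive if reward > 0 else self.ordinary).append(transition)

    def sample(self, batch_size, positive_ratio=0.5):
        n_pos = min(int(batch_size * positive_ratio), len(self.positive))
        n_ord = min(batch_size - n_pos, len(self.ordinary))
        return (random.sample(list(self.positive), n_pos)
                + random.sample(list(self.ordinary), n_ord))
```

In an end-to-end loop of this kind, the Q-network would pick a discrete steering/speed action, BicycleModel.step would propagate the vehicle pose, and the resulting transition would be pushed into ClassifiedReplay for training.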

Key words: deep reinforcement learning, unmanned reconnaissance vehicle, path planning, deep Q-network

CLC number: