Systems Engineering and Electronics ›› 2021, Vol. 43 ›› Issue (6): 1547-1556. doi: 10.12305/j.issn.1001-506X.2021.06.12

• Radar anti-interference technology •

Anti-interference frequency allocation based on kernel reinforcement learning

Zhiwei JIANG1, Yang HUANG1,2, Qihui WU1,*

  1. Key Laboratory of Ministry of Industry and Information Technology on Electromagnetic Spectrum Spatial Cognitive Dynamic Systems, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  2. National Mobile Communications Research Laboratory, Southeast University, Nanjing 211189, China
  • Received: 2020-12-28 Online: 2021-05-21 Published: 2021-05-28
  • Contact: Qihui WU
  • About the authors: Zhiwei JIANG (1995-), male, master's student; research interests: wireless resource allocation and reinforcement learning | Yang HUANG (1989-), male, associate professor, master's supervisor, Ph.D.; research interests: wireless communications, MIMO systems, convex optimization, machine learning, and communication signal processing | Qihui WU (1970-), male, professor, doctoral supervisor; research interests: cognitive wireless network algorithms and optimization, software-defined radio, and wireless communications
  • Supported by:
    National Natural Science Foundation of China (61827801); National Natural Science Foundation of China (61631020); National Natural Science Foundation of China (61901216); Natural Science Foundation of Jiangsu Province (BK20190400); Open Research Fund of the National Mobile Communications Research Laboratory, Southeast University (2020D08)

Abstract:

To address the problem of learning unknown, dynamic interference patterns, a cooperative anti-interference frequency allocation algorithm for radar and communication systems based on kernel reinforcement learning is proposed. In contrast to studies that require prior knowledge such as the interference pattern and its parameters, the proposed algorithm uses the occupancy of frequency points in past time slots to optimize the anti-interference frequency allocation policy. First, kernel-based reinforcement learning is used to cope with the curse of dimensionality. Second, an online kernel sparsification method based on approximate linear dependence (ALD) keeps the anti-interference frequency allocation algorithm sparse. Finally, simulation results verify the effectiveness of the proposed algorithm. Because the sparsified codewords learn the dynamic characteristics of the system, the proposed algorithm converges faster than a conventional Q-learning-based anti-interference frequency allocation algorithm and can quickly avoid interference from unknown external sources.
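As an illustration of the two ingredients named in the abstract, the following minimal Python sketch combines a kernel-based Q-function with an ALD-style sparsification test. It is not the authors' implementation: the Gaussian kernel, the class name `SparseKernelQ`, the feature vector `x` (assumed here to encode past-slot frequency usage together with the chosen frequency point), and all parameter values are hypothetical choices made only for this example, and a plain temporal-difference weight update stands in for whatever update rule the paper uses.

```python
import numpy as np


def gaussian_kernel(x, y, width=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-np.dot(diff, diff) / (2.0 * width ** 2)))


class SparseKernelQ:
    """Q(x) = sum_i w_i * k(d_i, x) over a dictionary grown with an ALD test."""

    def __init__(self, ald_threshold=0.1, width=1.0, step=0.5, discount=0.9):
        self.ald_threshold = ald_threshold  # hypothetical ALD tolerance
        self.width = width                  # hypothetical kernel width
        self.step = step                    # hypothetical learning rate
        self.discount = discount            # hypothetical discount factor
        self.dictionary = []                # retained feature vectors
        self.weights = np.zeros(0)          # one weight per dictionary entry

    def kernel_vector(self, x):
        return np.array([gaussian_kernel(d, x, self.width) for d in self.dictionary])

    def value(self, x):
        if not self.dictionary:
            return 0.0
        return float(self.weights @ self.kernel_vector(x))

    def ald_residual(self, x):
        """Squared feature-space distance from x to the span of the dictionary."""
        k_xx = gaussian_kernel(x, x, self.width)
        if not self.dictionary:
            return k_xx
        gram = np.array([[gaussian_kernel(a, b, self.width) for b in self.dictionary]
                         for a in self.dictionary])
        k_vec = self.kernel_vector(x)
        # Small diagonal jitter keeps the linear solve numerically stable.
        coeffs = np.linalg.solve(gram + 1e-10 * np.eye(len(gram)), k_vec)
        return float(k_xx - k_vec @ coeffs)

    def update(self, x, reward, next_candidates):
        """One Q-learning step; next_candidates are features of the next state's actions."""
        best_next = max((self.value(c) for c in next_candidates), default=0.0)
        td_error = reward + self.discount * best_next - self.value(x)
        # Grow the dictionary only when x is not approximately linearly dependent on it.
        if self.ald_residual(x) > self.ald_threshold:
            self.dictionary.append(np.asarray(x, dtype=float))
            self.weights = np.append(self.weights, 0.0)
        self.weights = self.weights + self.step * td_error * self.kernel_vector(x)
        return td_error


q = SparseKernelQ()
x = [1.0, 0.0, 0.0]                       # hypothetical state-action feature
q.update(x, reward=1.0, next_candidates=[])
q.update(x, reward=1.0, next_candidates=[])
print(len(q.dictionary), q.value(x))
```

In this sketch a repeated sample does not enlarge the dictionary, which is the sense in which the ALD test keeps the representation sparse while the Q-value estimate continues to be refined.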

Key words: anti-interference, reinforcement learning, kernel method, Q-learning

CLC number: