Research on a Q-Learning-Based Dynamic Decision-Making Method for Intelligent Radar Anti-Jamming

邵正途; 陈鹏; 葛代河; 许登荣; 周艳

doi:10.16592/j.cnki.1004-7859.2025268

Abstract: In modern electronic warfare, the jamming patterns that radars face are increasingly complex, diverse, and highly dynamic, and traditional anti-jamming decisions that rely on operator experience struggle to cope. To meet the need for dynamic, adaptive radar anti-jamming decision-making, this paper proposes an intelligent dynamic decision-making method based on Q-Learning, a reinforcement learning algorithm. The method aims to “reduce the jamming threat level and converge quickly to a low-threat state”. It first uses an anti-jamming benefit matrix to quantify the radar jamming threat level and construct the jamming state transition rules, then designs a reward function centered on reducing the jamming threat level, and finally learns and dynamically selects anti-jamming strategies autonomously within the Q-Learning framework. Simulation experiments show that the method converges stably in a scenario with 10 jamming states and 9 anti-jamming measures. The overall success rate across the 10 initial states exceeds 92%, and the average number of transition steps from high-threat states to the target state stays within 8. The method significantly outperforms fixed and random strategies in both transition efficiency and decision success rate, which verifies its effectiveness and provides a practical technical path toward intelligent radar anti-jamming decision-making.
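
The pipeline summarized in the abstract (benefit matrix, state transition rule, threat-reduction reward, Q-Learning) can be illustrated with a minimal tabular sketch. Only the problem size (10 jamming states, 9 anti-jamming measures) and the comparison against a random strategy come from the abstract. Everything else is an assumption made for illustration and is not the authors' configuration: the values in `BENEFIT`, the rule that an effective measure lowers the threat level by exactly one, the 0.1 per-step cost, and all hyperparameters.

```python
import random

N_STATES = 10   # jamming states ordered by threat level; state 0 is the low-threat target
N_ACTIONS = 9   # candidate anti-jamming measures

# Hypothetical benefit matrix (not the paper's): BENEFIT[s][a] is the assumed
# probability that measure a suppresses jamming state s. Each non-target state
# has one well-matched measure (0.9); every other measure is weak (0.2).
BENEFIT = [[0.9 if a == s - 1 else 0.2 for a in range(N_ACTIONS)]
           for s in range(N_STATES)]


def step(state, action, rng):
    """Assumed transition rule: an effective measure lowers the threat level
    by one; an ineffective one leaves the state unchanged."""
    effective = rng.random() < BENEFIT[state][action]
    next_state = state - 1 if effective else state
    # Reward: drop in threat level minus a small per-step cost (assumed 0.1)
    # that favours reaching the target quickly.
    reward = (state - next_state) - 0.1
    return next_state, reward


def train(episodes=5000, alpha=0.1, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(1, N_STATES)  # random non-target initial state
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_ACTIONS)            # explore
            else:
                action = max(range(N_ACTIONS), key=lambda a: q[state][a])  # exploit
            next_state, reward = step(state, action, rng)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == 0:  # target state reached, episode ends
                break
    return q


def mean_steps(policy, start=N_STATES - 1, trials=2000, max_steps=200, seed=1):
    """Average number of steps a policy needs to go from `start` to state 0."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        state, steps = start, 0
        while state != 0 and steps < max_steps:
            state, _ = step(state, policy(state, rng), rng)
            steps += 1
        total += steps
    return total / trials


q_table = train()


def greedy(state, rng):
    """Learned policy: pick the measure with the highest Q-value."""
    return max(range(N_ACTIONS), key=lambda a: q_table[state][a])


def uniform(state, rng):
    """Random-strategy baseline."""
    return rng.randrange(N_ACTIONS)


print("greedy mean steps:", mean_steps(greedy))
print("random mean steps:", mean_steps(uniform))
```

In this toy setting the learned greedy policy should need noticeably fewer steps than the random baseline to move from the highest-threat state to the target. The figures it prints depend entirely on the assumed matrix and are not comparable with the results reported in the paper.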