Abstract:
Cognitive radar countermeasure technology can be exploited by a jamming system to make intelligent decisions without prior knowledge. Existing jamming strategies based on reinforcement learning theory cannot obtain the desired benefit in radar countermeasure environments, where real-time response is required, jamming time is limited, and the radar strategy changes rapidly. Based on multi-armed bandit (MAB) theory, this paper proposes an online intelligent jamming strategy that uses the maximum expected value weighted (MEVW) estimation method and the learning-window shifting (LWS) approach: MEVW improves the accuracy with which the maximal-benefit arm is estimated, and LWS allows the jammer to adapt to a time-varying environment. Numerical experiments in typical time-varying environments show that the proposed strategy achieves higher decision benefits and better adaptability than traditional methods.