Abstract:
The reliability and accuracy of perception algorithms are particularly important in autonomous driving: the prerequisite for completing downstream tasks is the ability to accurately obtain and describe target information under varied lighting and weather conditions. Given the limitations of single-sensor data acquisition, a feature-level fusion method based on channel-attention and cross-attention mechanisms is proposed to fuse 4D millimeter-wave radar point clouds with camera images. First, the data collected by the millimeter-wave radar and the camera are spatially aligned through coordinate-system conversion. Then, the aligned point clouds are projected into pseudo-images, and the points in these pseudo-images are expanded to improve fusion matching. Finally, the proposed cross-attention feature fusion network merges the extracted camera and point-cloud features, enhancing the key features in the camera images. Experimental results demonstrate that the proposed fusion structure effectively integrates features from camera images and pseudo-images, thereby improving the network's detection performance across target categories.
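The spatial-alignment step summarized above can be sketched as a standard extrinsic-plus-intrinsic projection of radar points onto the image plane. This is a minimal illustration, not the paper's actual calibration pipeline; the rotation `R`, translation `t`, and intrinsic matrix `K` below are hypothetical placeholder values.

```python
import numpy as np

def project_radar_to_image(points_radar, R, t, K):
    """Project Nx3 radar-frame points into Nx2 pixel coordinates."""
    pts_cam = points_radar @ R.T + t      # radar frame -> camera frame (extrinsics)
    pts_cam = pts_cam[pts_cam[:, 2] > 0]  # keep only points in front of the camera
    uv = pts_cam @ K.T                    # apply pinhole intrinsics
    return uv[:, :2] / uv[:, 2:3]         # perspective divide

# Toy example: identity extrinsics and a simple pinhole intrinsic matrix.
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R, t = np.eye(3), np.zeros(3)
pts = np.array([[0.0, 0.0, 10.0]])  # one point 10 m straight ahead
print(project_radar_to_image(pts, R, t, K))  # -> [[320. 240.]] (image center)
```

A point on the optical axis projects to the principal point, as expected; in practice the projected sparse points would then be rasterized and expanded into the pseudo-image used for fusion.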