Abstract:
To address the limited view representation capability and the information loss during feature fusion in traditional multi-view 3D model retrieval, this paper proposes a 3D model retrieval method based on depth images and a dual attention mechanism. Specifically: (1) A depth view generation algorithm for 3D models is designed, which constructs geometry-aware depth images through model normalization, spatial transformation matrices, and depth channel feature mapping. (2) A dual attention feature extraction network is developed, in which a channel attention module adaptively weights the contribution of features from different views to achieve differentiated feature fusion, and a Transformer attention module establishes correlation mappings across cross-view feature subspaces to uncover latent relationships between views. Experiments on the ModelNet40 dataset show that the proposed method achieves ACC and MAP scores of 96.57% and 94.35%, respectively. Compared with MVCNN [7], GVCNN [8], and MVDAN [13], it improves ACC by 0.3% to 4.4% and MAP by 0.1% to 8.5%.
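To make the channel-attention fusion idea concrete, the sketch below shows one common way such adaptive view weighting can work: each view's feature vector is scored, the scores are normalized with a softmax, and the per-view features are combined as a weighted sum. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the function name `attention_fuse`, the scoring vector `score_w`, and the view/feature dimensions are all hypothetical.

```python
import numpy as np

def attention_fuse(view_feats, score_w):
    """Fuse per-view features with softmax attention weights.

    view_feats: (V, D) array, one D-dim feature per rendered depth view.
    score_w:    (D,) scoring vector (stands in for a learned projection).
    """
    scores = view_feats @ score_w            # (V,) one relevance score per view
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()                 # weights are nonnegative, sum to 1
    fused = weights @ view_feats             # (D,) differentiated weighted fusion
    return fused, weights

# Example: 12 depth views of one model, 256-dim features per view.
rng = np.random.default_rng(0)
feats = rng.normal(size=(12, 256))
w = rng.normal(size=256)
fused, weights = attention_fuse(feats, w)
```

In a trained network the scoring step would be a learned module rather than a fixed vector, so views that carry more discriminative geometry receive larger weights instead of every view contributing equally as in plain average pooling.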