Abstract:
To address the limited view representation capability and the information loss during feature fusion in traditional multi-view 3D model retrieval, this paper proposes a 3D model retrieval method based on depth images and a dual attention mechanism. Specifically: (1) A depth view generation algorithm for 3D models is designed, which constructs geometry-aware depth images through model normalization, spatial transformation matrices, and depth channel feature mapping. (2) A dual attention feature extraction network is developed, in which a channel attention module adaptively weights the contribution of features from different views to achieve differentiated feature fusion, and a Transformer attention module establishes correlation mappings across cross-view feature subspaces to uncover latent relationships between views. Experiments on the ModelNet40 dataset show that the proposed method achieves ACC and MAP scores of 96.57% and 94.35%, respectively. Compared with MVCNN [7], GVCNN [8], and MVDAN [13], it improves ACC by 0.3% to 4.4% and MAP by 0.1% to 8.5%.
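To make the channel-attention fusion idea concrete, the sketch below shows one common way such adaptive view weighting can work: each view's feature vector is scored, the scores are normalized with a softmax, and the per-view features are combined as a weighted sum. This is a minimal NumPy illustration of the general technique, not the authors' implementation; the function name `attention_fuse`, the scoring vector `score_w`, and the view/feature dimensions are all hypothetical.

```python
import numpy as np

def attention_fuse(view_feats, score_w):
    """Fuse per-view features with softmax attention weights.

    view_feats: (V, D) array, one D-dim feature per rendered depth view.
    score_w:    (D,) scoring vector (stands in for a learned projection).
    """
    scores = view_feats @ score_w            # (V,) one relevance score per view
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()                 # weights are nonnegative, sum to 1
    fused = weights @ view_feats             # (D,) differentiated weighted fusion
    return fused, weights

# Example: 12 depth views of one model, 256-dim features per view.
rng = np.random.default_rng(0)
feats = rng.normal(size=(12, 256))
w = rng.normal(size=256)
fused, weights = attention_fuse(feats, w)
```

In a trained network the scoring step would be a learned module rather than a fixed vector, so views that carry more discriminative geometry receive larger weights instead of every view contributing equally as in plain average pooling.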