Received:August 06, 2023 Revised:September 09, 2023
Abstract: Camouflaged object detection (COD) aims to accurately and efficiently detect objects that closely resemble their background. It can support applications such as species protection, lesion detection in medical imaging, and military surveillance, and therefore has high practical value. In recent years, deep learning-based COD has become an emerging research direction. However, most existing COD algorithms use a convolutional neural network (CNN) as the feature extraction network and, when combining multi-level features, overlook the influence of feature representation and fusion methods on detection performance. To address the weak ability of CNN-based COD models to extract global features of the target, this study proposes a Transformer-based cross-scale interactive learning method for COD. First, a dual-branch feature fusion module fuses features that have passed through iterative attention, so that high- and low-level features are combined more effectively. Second, a multi-scale global context information module exploits contextual information to enhance the features. Finally, a multi-channel pooling module focuses on the local information of the detected object to improve detection accuracy. Experimental results on the CHAMELEON, CAMO, and COD10K datasets show that, compared with current mainstream COD algorithms, the proposed method produces clearer prediction maps and achieves higher accuracy.
Keywords: deep learning; camouflage object detection (COD); visual characteristic pyramid; convolutional neural network (CNN); feature fusion
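Funding: General Program of the National Natural Science Foundation of China (42271409); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LIKMZ20220699)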
Citation:
LI Jian-Dong, WANG Yan, QU Hai-Cheng. Transformer-based Cross Scale Interactive Learning for Camouflage Object Detection. Computer Systems Applications, 2024, 33(2): 115-124