Pedestrian Detection Based on Multimodal Feature Differential Attention Fusion and YOLO
    Abstract:

    To address the discrepancy between the visible-light and thermal-infrared modalities and to make full use of multimodal information for pedestrian detection, this study proposes a multimodal feature differential attention fusion pedestrian detection method based on YOLO. The method first extracts features from each modality with the feature extraction backbone of the YOLOv3 deep neural network. A modal feature differential attention module is then embedded between the corresponding multimodal feature layers to fully mine the difference information between the modalities, and an attention mechanism strengthens the difference feature representation to improve the quality of feature fusion. The difference information is fed back into each modality's feature extraction backbone, improving the network's ability to learn and fuse complementary multimodal information. The multimodal features are then fused layer by layer to obtain fused multi-scale features, and detection is finally performed on these multi-scale feature layers to predict the probability and location of pedestrian targets. Experimental results on the public KAIST and LLVIP multimodal pedestrian detection datasets show that the proposed method effectively addresses the discrepancy between modalities, makes full use of multimodal information, and achieves high detection accuracy and speed, demonstrating practical application value.
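The differential attention feedback step described above can be sketched as follows. This is a minimal NumPy illustration of the idea, not the paper's implementation: the channel-attention form (a sigmoid over the globally pooled difference map) and the symmetric feedback into the two streams are assumptions made for the sketch.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def differential_attention_fuse(f_rgb, f_ir):
    """Sketch of a modal feature differential attention step.

    f_rgb, f_ir: feature maps of shape (C, H, W) from the visible-light
    and thermal-infrared backbones at one corresponding layer.
    Returns the two feature maps with attended difference information
    fed back into each stream.
    """
    # Difference between the two modalities at this layer
    diff = f_rgb - f_ir

    # Channel attention weights from the globally pooled difference
    # (illustrative choice; the paper's attention design may differ)
    w = sigmoid(diff.mean(axis=(1, 2)))[:, None, None]  # shape (C, 1, 1)

    # Feed the attended difference back into each backbone stream
    f_rgb_out = f_rgb + w * diff
    f_ir_out = f_ir - w * diff
    return f_rgb_out, f_ir_out
```

When the two modalities agree at a layer, the difference map is zero and both streams pass through unchanged; the feedback only injects information where the modalities actually disagree.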

Cite this article:

Wang Z, Xie WB, Wen J. Pedestrian detection based on multimodal feature differential attention fusion and YOLO. 计算机系统应用 (Computer Systems & Applications), 2023, 32(4): 329-338.

History
  • Received: 2022-08-19
  • Revised: 2022-09-22
  • Available online: 2022-12-23
Copyright: Institute of Software, Chinese Academy of Sciences