
Computer Systems Applications, 2023, 32(9): 211-220

Violence Recognition Based on View Confidence and Attention

XIA Liang-Wei, ZHU Ming

(School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China)

Received: February 20, 2023; Revised: March 20, 2023

Abstract: Violent behavior is often occluded in video, which lowers recognition accuracy. Some existing algorithms add multi-view video input to address occlusion and fuse the data from all views with equal weights. However, views differ in how useful they are for recognition, because shooting distance and the degree of occlusion vary from view to view. To address this problem, this study proposes a violence recognition method based on view confidence and attention. The input of the temporal difference module (TDM) is extended to multiple views. A channel attention mechanism is applied along the segment dimension to strengthen cross-segment feature extraction in TDM. A background suppression method highlights the texture features of moving objects and is used to compute a confidence for the image from each view. Bilinear pooling is introduced to fuse the multi-view video features, and the local features of each view are weighted according to that view's confidence. The method is validated on the public dataset CASIA-Action and on a self-collected dataset. Experiments show that the proposed view confidence method outperforms the bilinear pooling method without this improvement, and that its violence recognition accuracy is higher than that of existing action recognition methods.

Keywords: violence recognition; attention; bilinear pooling; view confidence
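
The abstract describes fusing multi-view features with bilinear pooling while weighting each view's local features by a view confidence, but it gives no formulas. The sketch below is one plausible reading, not the authors' implementation. It assumes that confidences are non-negative scalars per view, that they are normalized to sum to 1, that the views are merged by a weighted sum at each spatial location, and that the merged local features are then pooled by summing their outer products. All function names are hypothetical.

```python
from typing import List, Sequence

# One row per spatial location, one column per channel (assumed layout).
Features = List[List[float]]


def normalize(confidences: Sequence[float]) -> List[float]:
    """Scale per-view confidences so they sum to 1 (assumed normalization)."""
    total = sum(confidences)
    if total <= 0:
        # No view is trusted: fall back to the equal-weight fusion baseline.
        return [1.0 / len(confidences)] * len(confidences)
    return [c / total for c in confidences]


def fuse_views(views: Sequence[Features], confidences: Sequence[float]) -> Features:
    """Confidence-weighted sum of the views' local features at each location."""
    weights = normalize(confidences)
    n_loc, n_ch = len(views[0]), len(views[0][0])
    fused = [[0.0] * n_ch for _ in range(n_loc)]
    for view, w in zip(views, weights):
        for i in range(n_loc):
            for c in range(n_ch):
                fused[i][c] += w * view[i][c]
    return fused


def bilinear_pool(features: Features) -> List[List[float]]:
    """Sum over locations of the outer product of each local feature with itself."""
    n_ch = len(features[0])
    pooled = [[0.0] * n_ch for _ in range(n_ch)]
    for f in features:
        for a in range(n_ch):
            for b in range(n_ch):
                pooled[a][b] += f[a] * f[b]
    return pooled


def confidence_weighted_bilinear(views: Sequence[Features],
                                 confidences: Sequence[float]) -> List[List[float]]:
    """Fuse views by confidence, then apply bilinear pooling to the result."""
    return bilinear_pool(fuse_views(views, confidences))


if __name__ == "__main__":
    # Two toy views, two locations, two channels; view 0 is trusted 3x more.
    view_0 = [[1.0, 0.0], [0.0, 1.0]]
    view_1 = [[0.0, 2.0], [2.0, 0.0]]
    print(confidence_weighted_bilinear([view_0, view_1], [3.0, 1.0]))
```

Expanding the pooled matrix shows why a per-view scalar matters here: the result is a sum of pairwise bilinear terms between views, each scaled by the product of the two views' weights, so a low-confidence (for example, heavily occluded) view contributes less to every term it takes part in. Setting all confidences equal recovers the equal-weight fusion that the abstract argues against.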


Funding: Science and Technology Innovation Special Zone Program (20-163-14-LZ-001-004-01)

Author Name	Affiliation	E-mail
XIA Liang-Wei	School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China	xlw1998@mail.ustc.edu.cn
ZHU Ming	School of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China


Cite this article:
XIA Liang-Wei, ZHU Ming. Violence Recognition Based on View Confidence and Attention. Computer Systems Applications, 2023, 32(9): 211-220