
Computer Systems & Applications, 2022, 31(1): 204-211


Two-stream Action Recognition Network Based on Temporal Shift and Split Attention
(Chinese title: 基于时移和片组注意力融合的双流行为识别网络)

XIAO Zi-Fan (肖子凡)^1,2,3, LIU Yi-Qun (刘逸群)^4, LI Chu-Xi (李楚溪)^5, ZHANG Li (张力)^6, WANG Shou-Yan (王守岩)^1,2,3, XIAO Xiao (肖晓)^2,3

(1. Shanghai Engineering Research Center of AI & Robotics, Academy of Engineering and Technology, Fudan University, Shanghai 200433, China; 2. Key Laboratory of Computational Neuroscience and Brain-inspired Intelligence, Ministry of Education (Fudan University), Shanghai 200433, China; 3. Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai 200433, China; 4. Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China; 5. Micro Nano System Center, School of Information Science and Technology, Fudan University, Shanghai 200433, China; 6. School of Data Science, Fudan University, Shanghai 200433, China)

Received: March 20, 2021; Revised: April 16, 2021

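Abstract (translated from the Chinese): Deep learning-based action recognition algorithms often fail to achieve fast and accurate recognition in practical applications because of their complex network designs. To address this, a lightweight, end-to-end two-stream neural network model based on the fusion of temporal shift and split attention is proposed. In both the RGB and optical flow branch networks, the algorithm adopts a temporally sparse, grouped random sampling strategy for long-term modeling, uses a temporal shift module to shift part of the channels along the temporal dimension and thereby incorporate adjacent-frame information to strengthen temporal representation, and uses a split attention module that fuses multiple paths with feature-map attention to improve recognition performance. Experiments show that the model achieves recognition accuracies of 95.00% and 72.55% on the public action recognition datasets UCF101 and HMDB51, respectively.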
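The sparse, grouped random sampling mentioned in the abstract can be illustrated with a short sketch. This is not the authors' code: it assumes the common scheme in which a video is divided into equal temporal groups and one frame is drawn at random from each group, and the function name `sample_sparse_indices` and its parameters are hypothetical.

```python
import random


def sample_sparse_indices(num_frames, num_segments, rng=None):
    """Pick one random frame index from each of num_segments temporal groups.

    Assumption: groups are contiguous, near-equal slices of the video.
    """
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if num_frames < num_segments:
        raise ValueError("video has fewer frames than segments")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        # Group k covers frames [start, end). Integer arithmetic makes the
        # groups tile the whole video without gaps or overlaps.
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments
        indices.append(rng.randrange(start, end))
    return indices


if __name__ == "__main__":
    # Eight frames spread across a 100-frame clip, one per group.
    print(sample_sparse_indices(100, 8, random.Random(0)))
```

Because every group contributes exactly one frame, the sampled indices span the full clip regardless of its length, which is what allows a fixed, small number of frames to represent long-range structure.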

Keywords (translated from the Chinese): action recognition; two-stream; deep network; temporal shift module; split attention

Abstract: Deep learning-based action recognition algorithms often struggle to be both fast and accurate in practical applications because of their complex network designs. To address this, we modularize the existing temporal shift and split attention modules into an end-to-end trainable block that can be easily plugged into the classical two-stream action recognition pipeline. In both the RGB and optical flow branch networks, we adopt a random sampling strategy with sparse temporal grouping to achieve long-term modeling. We further use the temporal shift module to shift a portion of the channels along the temporal dimension, so that each frame incorporates information from adjacent frames and the temporal representation is strengthened. In addition, the split attention module, which combines multiple paths with feature-map attention, improves the recognition performance of the network. Experiments show that our method achieves recognition accuracies of 95.00% on UCF101 and 72.55% on HMDB51, two public benchmark datasets, demonstrating its effectiveness.
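The channel shifting along the temporal dimension described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: each frame is represented as a plain list of per-channel values, one fraction of the channels is taken from the next frame, an equal fraction from the previous frame, and the rest are left in place. The name `temporal_shift` and the default `fold_div=8` are hypothetical choices for this example.

```python
def temporal_shift(features, fold_div=8):
    """Shift a fraction of channels along the time axis.

    features: list of T frames, each a list of C channel values.
    The first C // fold_div channels of frame t are taken from frame t + 1,
    the next C // fold_div channels from frame t - 1, and the remaining
    channels are copied unchanged. Positions with no source frame are zero.
    """
    if not features:
        raise ValueError("features must contain at least one frame")
    num_frames = len(features)
    num_channels = len(features[0])
    fold = num_channels // fold_div
    shifted = [[0.0] * num_channels for _ in range(num_frames)]
    for t in range(num_frames):
        for c in range(num_channels):
            if c < fold:
                src = t + 1  # channel borrowed from the next frame
            elif c < 2 * fold:
                src = t - 1  # channel borrowed from the previous frame
            else:
                src = t      # channel left in place
            if 0 <= src < num_frames:
                shifted[t][c] = features[src][c]
    return shifted


if __name__ == "__main__":
    # Three frames with eight channels each; value 10 * t + c marks its origin.
    frames = [[float(10 * t + c) for c in range(8)] for t in range(3)]
    for row in temporal_shift(frames, fold_div=4):
        print(row)
```

The shift itself adds no learnable parameters and no multiplications; the layers that follow see a mix of current, past, and future features in each frame, which is how adjacent-frame information enters a 2D network.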

Keywords: action recognition; two-stream; deep network; temporal shift module; split attention


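Funding: National Key R&D Program of China (2019YFA0709504); Young Scientists Fund of the National Natural Science Foundation of China (31900719); Shanghai Rising-Star Program (19QA1401400); Shanghai Municipal Science and Technology Major Project (2018SHZDZX01)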

Author Name	Affiliation	E-mail
XIAO Zi-Fan	Shanghai Engineering Research Center of AI & Robotics, Academy of Engineering and Technology, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-inspired Intelligence, Ministry of Education (Fudan University), Shanghai 200433, China; Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai 200433, China
LIU Yi-Qun	Shanghai Key Laboratory of Intelligent Information Processing, School of Computer Science, Fudan University, Shanghai 200433, China
LI Chu-Xi	Micro Nano System Center, School of Information Science and Technology, Fudan University, Shanghai 200433, China
ZHANG Li	School of Data Science, Fudan University, Shanghai 200433, China
WANG Shou-Yan	Shanghai Engineering Research Center of AI & Robotics, Academy of Engineering and Technology, Fudan University, Shanghai 200433, China; Key Laboratory of Computational Neuroscience and Brain-inspired Intelligence, Ministry of Education (Fudan University), Shanghai 200433, China; Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai 200433, China
XIAO Xiao	Key Laboratory of Computational Neuroscience and Brain-inspired Intelligence, Ministry of Education (Fudan University), Shanghai 200433, China; Institute of Science and Technology for Brain-inspired Intelligence, Fudan University, Shanghai 200433, China	xiaoxiao@fudan.edu.cn

Cite this article:
XIAO Zi-Fan, LIU Yi-Qun, LI Chu-Xi, ZHANG Li, WANG Shou-Yan, XIAO Xiao. Two-stream Action Recognition Network Based on Temporal Shift and Split Attention. Computer Systems & Applications, 2022, 31(1): 204-211