Computer Systems & Applications, 2021, 30(4): 146-152
Human Action Recognition Algorithm Based on Multi-Modal Features Learning
(College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 200090, China)
Received: August 25, 2020    Revised: September 15, 2020
Abstract: Since features obtained from a single action modality fail to accurately express complex human actions, this study proposes a human action recognition algorithm based on multi-modal feature learning. First, two channels extract RGB features and 3D skeleton features from the action video. The first channel, the C3DP-LA network, consists of (1) an improved 3D CNN with Spatial Temporal Pyramid Pooling (STPP) and (2) an LSTM with a spatial-temporal attention mechanism; the second channel is a Spatial-Temporal Graph Convolutional Network (ST-GCN). The two extracted features are then fused so that their strengths complement each other, and the fused features are classified with a Softmax classifier. The proposed algorithm is validated on the public datasets UCF101 and NTU RGB+D. Experimental results show that it achieves higher recognition accuracy than existing action recognition algorithms.
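The abstract does not specify how the two feature streams are fused. A common and minimal approach, sketched below under that assumption, is to concatenate the two channel outputs and classify the fused vector with a linear layer followed by Softmax; all dimensions here are hypothetical (a 512-d RGB feature, a 256-d skeleton feature, 101 classes as in UCF101), and `fuse_and_classify` is an illustrative name, not the paper's API.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse_and_classify(rgb_feat, skel_feat, W, b):
    # Concatenate the RGB-stream and skeleton-stream features,
    # then classify the fused vector with a linear layer + Softmax.
    fused = np.concatenate([rgb_feat, skel_feat], axis=-1)
    return softmax(fused @ W + b)

# Toy example with hypothetical dimensions.
rng = np.random.default_rng(0)
rgb = rng.standard_normal(512)    # e.g., from the C3DP-LA channel
skel = rng.standard_normal(256)   # e.g., from the ST-GCN channel
W = rng.standard_normal((512 + 256, 101)) * 0.01
b = np.zeros(101)
probs = fuse_and_classify(rgb, skel, W, b)
print(probs.shape)  # (101,); entries sum to 1
```

Concatenation keeps both modalities' information intact and lets the classifier weight them; weighted-sum or attention-based fusion are alternatives the paper may use instead.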
Funding: National Natural Science Foundation of China (61672337)
Citation:
ZHOU Xue-Xue, LEI Jing-Sheng, ZHUO Jia-Ning. Human Action Recognition Algorithm Based on Multi-Modal Features Learning. Computer Systems & Applications, 2021, 30(4): 146-152