Abstract: Since features extracted from a single action modality fail to accurately express complex human actions, this study proposes a human action recognition algorithm based on multi-modal feature learning. First, two parallel channels extract RGB and 3D skeleton features from the action video. The first channel, the C3DP-LA network, consists of an improved 3D CNN with Spatial-Temporal Pyramid Pooling (STPP) and an LSTM with spatial-temporal attention. The second channel is a Spatial-Temporal Graph Convolutional Network (ST-GCN). The two extracted features are then fused and classified by Softmax. The proposed algorithm is evaluated on the public datasets UCF101 and NTU RGB+D, where it achieves higher recognition accuracy than its counterparts.
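To make the fusion-and-classification step concrete, the sketch below illustrates a late-fusion stage of the kind the abstract describes: feature vectors from an RGB channel and a skeleton channel are concatenated and passed through a Softmax classifier. The feature dimensions, the random linear projection, and all variable names here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
rgb_feat = rng.normal(size=(1, 256))    # placeholder for C3DP-LA output
skel_feat = rng.normal(size=(1, 128))   # placeholder for ST-GCN output

# Concatenate the two modality features (late fusion), then classify
fused = np.concatenate([rgb_feat, skel_feat], axis=-1)  # shape (1, 384)
W = rng.normal(size=(384, 101))         # e.g. 101 classes for UCF101
probs = softmax(fused @ W)              # class probabilities, shape (1, 101)
```

In practice the fusion weights would be learned end-to-end rather than drawn at random; this fragment only shows how the two modality streams combine into a single prediction.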