Computer Systems Applications, 2024, 33(8): 250-256
3D Human Pose Estimation Based on Multi-layer Spatial Feature Fusion
LIANG An-Yuan, XIAO Xue-Zhong
(School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China)
Received: February 26, 2024    Revised: March 28, 2024
Abstract: In 3D human pose estimation, the connections between human joints form a complex topology. Modeling this structure with a graph convolutional network (GCN) effectively captures the relationships between local, adjacent joints. Non-adjacent joints have no direct physical connection, but human motion and pose are subject to biomechanical constraints and to the coordinated action of the joints, so using a Transformer encoder to establish contextual relationships among all joints allows the pose to be inferred more accurately. In the current context of large models, reducing the parameter count while preserving model performance is also particularly important. To address these issues, this paper designs a multi-layer spatial feature fusion network (MLSFFN) based on graph convolution and Transformer, which effectively fuses local and global spatial features with a relatively small number of parameters. Experimental results show that the proposed method achieves a mean per joint position error (MPJPE) of 49.9 mm on the Human3.6M dataset with only 2.1M parameters. The model also demonstrates strong generalization ability on the MPI-INF-3DHP dataset.
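For readers unfamiliar with the metric quoted in the abstract, MPJPE is the Euclidean distance between each predicted joint and its ground-truth position, averaged over all joints. The following is a minimal sketch of that definition only, not the authors' evaluation code: the function name `mpjpe` and the three-joint example poses are hypothetical, and the sketch omits the root-joint alignment that evaluation protocols commonly apply before measuring the error.

```python
import math


def mpjpe(predicted, target):
    """Average Euclidean distance between matching joints.

    Both arguments are sequences of (x, y, z) coordinates; the result is
    in the same unit as the inputs (millimetres in the example below).
    """
    if len(predicted) != len(target) or not predicted:
        raise ValueError("poses must be non-empty and have the same number of joints")
    # math.dist returns the Euclidean distance between two points.
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical 3-joint pose in millimetres (illustrative values, not from the paper).
target = [(0.0, 0.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 0.0)]
predicted = [(3.0, 4.0, 0.0), (0.0, 100.0, 0.0), (0.0, 200.0, 12.0)]

error_mm = mpjpe(predicted, target)
print(f"MPJPE: {error_mm:.2f} mm")
```

In this toy example the per-joint distances are 5 mm, 0 mm, and 12 mm, so the mean is 17/3 mm. A dataset-level score such as the 49.9 mm reported above would additionally average this quantity over all evaluated frames.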
Cite this article:
LIANG An-Yuan, XIAO Xue-Zhong. 3D Human Pose Estimation Based on Multi-layer Spatial Feature Fusion. Computer Systems Applications, 2024, 33(8): 250-256