Abstract: To address the challenges in skeleton-based action recognition posed by complex actions and ambiguous samples, this study proposes a co-optimization framework that combines adaptive graph topology refinement (AGTR) and cross-sequence contrastive learning (CSCL). AGTR leverages multi-head attention to dynamically construct joint connectivity graphs, overcoming the limitations of fixed skeleton structures and enabling the decoupling of multi-view features. CSCL integrates segment-level, instance-level, and prototype-level contrastive losses, coupled with dynamic hard sample mining, to improve the modeling of temporal semantic consistency and long-tailed class distributions. Extensive experiments on the NTU RGB+D 120 dataset show that the proposed method achieves an accuracy of 89.8%, surpassing the hypergraph- and Transformer-based Hyperformer (86.9%) by 2.9 percentage points. It also improves robustness under noise and occlusion by 18.8%, while remaining efficient (3.1 GFLOPs, 25 FPS). This study offers a high-accuracy, interpretable, and deployable solution for complex action recognition, with significant potential in intelligent healthcare and industrial human-robot interaction.
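The abstract does not specify AGTR's exact formulation, but its core idea of using multi-head attention to construct a dynamic joint connectivity graph can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the function and weight names are hypothetical, the 25-joint count follows the NTU RGB+D skeleton, and the learned adjacency is simply the row-normalised attention map per head.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_adjacency(X, Wq, Wk):
    """Sketch of an attention-derived adjacency (illustrative, not the paper's code).

    X  : (J, d)  per-joint features for one frame/window
    Wq : (H, d, d_h) query projections, one per attention head
    Wk : (H, d, d_h) key projections, one per attention head
    Returns (H, J, J): one data-dependent adjacency matrix per head,
    replacing a fixed skeleton graph with a learned topology.
    """
    heads = []
    for Wq_h, Wk_h in zip(Wq, Wk):
        Q = X @ Wq_h                                # (J, d_h)
        K = X @ Wk_h                                # (J, d_h)
        scores = Q @ K.T / np.sqrt(Q.shape[-1])     # scaled dot-product, (J, J)
        heads.append(softmax(scores, axis=-1))      # rows sum to 1
    return np.stack(heads)

# Toy example: 25 joints (NTU skeleton), 4 heads, random features/weights.
rng = np.random.default_rng(0)
J, d, d_h, H = 25, 64, 16, 4
X = rng.standard_normal((J, d))
Wq = rng.standard_normal((H, d, d_h))
Wk = rng.standard_normal((H, d, d_h))
A = attention_adjacency(X, Wq, Wk)
print(A.shape)  # (4, 25, 25)
```

Each head yields a distinct joint-to-joint graph, which is what allows the model to capture connectivity patterns (e.g. hand-to-head links during "drinking") that a fixed anatomical skeleton cannot express.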