Received: March 17, 2023; Revised: April 20, 2023
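Chinese abstract (translated): Hand pose estimation plays an important role in applications such as human-computer interaction, hand function assessment, virtual reality, and augmented reality. This paper proposes a new hand pose estimation method to address two problems: the hand region occupies only a small share of most images, and existing single-view keypoint detection algorithms cannot cope with occlusion. The method first extracts the hand target region with a semantic segmentation model that incorporates a Bayesian convolutional network. Based on this hand localization result, a new model built on an attention mechanism and a cascade guidance strategy produces reasonably accurate 2D hand keypoint detections. Next, a deep network that uses a stereo vision algorithm computes the depth of each keypoint and provides view self-learning during depth estimation. This step is based on triangulation, and the RANSAC algorithm is used to calibrate the measurements. Finally, multi-task learning and reprojection training refine the 3D keypoint detections, yielding the 3D pose of the hand keypoints. Experimental results show that, compared with several representative hand region detection algorithms, the proposed method achieves some improvement in average detection precision and running time for hand region detection. In addition, comparisons of the mean end-point error (EPE_mean) and the area under the PCK curve (AUC) against other existing methods indicate that the proposed method has better keypoint detection performance and therefore yields better hand pose estimates.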
Keywords: hand region extraction | keypoint detection | multi-view learning | hand pose estimation
Abstract: Hand pose estimation plays an important role in human-computer interaction, hand function assessment, virtual reality, and augmented reality. This paper proposes a new hand pose estimation method that addresses two problems: the hand region occupies a relatively small proportion of most images, and existing single-view keypoint detection algorithms cannot handle occlusion. The method first extracts the hand target region with a semantic segmentation model that incorporates Bayesian convolutional neural networks. Given the hand localization result, a new model based on an attention mechanism and a cascade guidance strategy produces accurate 2D hand keypoint detections. A deep network based on a stereo vision algorithm then computes the depth of each keypoint and provides view self-learning during depth estimation. This step is built on triangulation, with the RANSAC algorithm used to correct the measurements. Finally, multi-task learning and reprojection training refine the 3D hand keypoint detections, from which the 3D pose of the hand keypoints is obtained. Experimental results show that, compared with several representative hand region detection algorithms, the proposed method improves the average detection precision and running time for hand regions to some extent. In addition, comparisons of the mean end-point error (EPE_mean) and the area under the PCK curve (AUC) across pose estimation methods show that the proposed method achieves better keypoint detection performance and therefore better hand pose estimation results.
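The two evaluation metrics named in the abstract can be illustrated with a short sketch. It is not the authors' evaluation code: the function names, the use of plain Python tuples for keypoints, the trapezoidal integration, and the threshold range passed to `pck_auc` are all assumptions made for illustration. EPE_mean is taken here as the average Euclidean distance between predicted and ground-truth keypoints, and PCK as the fraction of keypoints whose error falls within a given threshold.

```python
import math


def keypoint_errors(pred, gt):
    """Euclidean distance between each predicted and ground-truth keypoint."""
    return [math.dist(p, g) for p, g in zip(pred, gt)]


def epe_mean(pred, gt):
    """Mean end-point error over all keypoints (same unit as the inputs)."""
    errors = keypoint_errors(pred, gt)
    return sum(errors) / len(errors)


def pck(pred, gt, threshold):
    """Fraction of keypoints whose error is at most `threshold`."""
    errors = keypoint_errors(pred, gt)
    return sum(e <= threshold for e in errors) / len(errors)


def pck_auc(pred, gt, t_min, t_max, steps=100):
    """Area under the PCK curve on [t_min, t_max], normalized to [0, 1].

    The threshold range and the trapezoidal rule are assumptions of this
    sketch; the abstract does not specify them.
    """
    thresholds = [t_min + (t_max - t_min) * i / steps for i in range(steps + 1)]
    values = [pck(pred, gt, t) for t in thresholds]
    area = sum(
        (values[i] + values[i + 1]) / 2 * (thresholds[i + 1] - thresholds[i])
        for i in range(steps)
    )
    return area / (t_max - t_min)


# Hypothetical 3D keypoints; the second prediction is off by 5 units.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(epe_mean(pred, gt))
print(pck(pred, gt, threshold=4.0))
```

A lower EPE_mean and a higher AUC both indicate better keypoint detection, which is the sense in which the abstract compares methods.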
Funding: Natural Science Foundation of Changsha (kq2208286); Natural Science Foundation of Hunan Province (2023JJ30697)
Cite this article:
徐梓雄,郭璠,王宗雨,唐琎.基于多视角学习策略的手部姿态估计.计算机系统应用,2023,32(10):22-33
XU Zi-Xiong, GUO Fan, WANG Zong-Yu, TANG Jin. Hand Pose Estimation Based on Multi-view Learning Strategy. COMPUTER SYSTEMS APPLICATIONS, 2023, 32(10): 22-33