Facial Expression Recognition with Local Occlusion Based on Weighted Multi-head Parallel Attention
Authors: Guo Sheng, Cai Shan, Zou Xue, Zhou Zhensheng, Wang Lin
    Abstract:

    Facial expression recognition (FER) is valuable in many application areas, but local occlusion makes it hard to extract effective expression features from the face. Recognizing an expression on a partially occluded face may require features from several facial regions, and a single attention mechanism cannot attend to multiple regions at once. To address this, the study proposes a local-occlusion FER model based on weighted multi-head parallel attention. The model runs multiple channel-spatial attention heads in parallel to extract expression features from multiple unoccluded facial regions, which mitigates the interference of occlusion with expression recognition. In extensive experiments, the proposed method achieves the best performance among the advanced methods it is compared with: accuracy is 89.54% on RAF-DB and 89.13% on FERPlus, and 87.47% and 86.28% on the real-occlusion datasets Occlusion-RAF-DB and Occlusion-FERPlus, respectively. These results indicate that the method is robust to occlusion.
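
    The abstract names the mechanism but gives no layer-level detail, so the following is only a minimal NumPy sketch of what "weighted multi-head parallel channel-spatial attention" could look like. Everything concrete in it is an assumption for illustration and not the authors' implementation: the squeeze-and-excitation style channel gate, the mean/max spatial gate, the softmax head weights, and all names (`channel_attention`, `spatial_attention`, `weighted_multi_head_attention`, `head_logits`) and sizes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def channel_attention(feat, w1, w2):
    """Assumed SE-style channel gate on a (C, H, W) feature map."""
    squeezed = feat.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)    # bottleneck + ReLU -> (C // r,)
    gate = sigmoid(w2 @ hidden)                # per-channel gate -> (C,)
    return feat * gate[:, None, None]


def spatial_attention(feat, w_sp):
    """Assumed spatial gate built from channel-wise mean and max maps."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    gate = sigmoid(np.tensordot(w_sp, pooled, axes=1))        # (H, W)
    return feat * gate[None, :, :]


def weighted_multi_head_attention(feat, heads, head_logits):
    """Run K channel-spatial heads in parallel and fuse them with softmax weights."""
    weights = softmax(head_logits)             # (K,), sums to 1
    pooled = []
    for h in heads:
        out = spatial_attention(channel_attention(feat, h["w1"], h["w2"]), h["w_sp"])
        pooled.append(out.mean(axis=(1, 2)))   # one (C,) descriptor per head
    fused = weights @ np.stack(pooled)         # weighted sum over heads -> (C,)
    return fused, weights


# Toy usage with random parameters (hypothetical sizes).
rng = np.random.default_rng(0)
C, H, W, K, r = 8, 7, 7, 4, 2
feat = rng.standard_normal((C, H, W))
heads = [
    {
        "w1": rng.standard_normal((C // r, C)),
        "w2": rng.standard_normal((C, C // r)),
        "w_sp": rng.standard_normal(2),
    }
    for _ in range(K)
]
head_logits = np.zeros(K)                      # equal weights before any training
fused, weights = weighted_multi_head_attention(feat, heads, head_logits)
print(fused.shape, weights)
```

    In this sketch each head has its own parameters, so different heads are free to emphasize different facial regions, and the head weights let the fused descriptor rely less on heads whose regions are occluded. How the actual model learns or constrains those weights is not stated in the abstract.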

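Cite this article

Guo Sheng, Cai Shan, Zou Xue, Zhou Zhensheng, Wang Lin. Facial Expression Recognition with Local Occlusion Based on Weighted Multi-head Parallel Attention. 计算机系统应用 (Computer Systems & Applications), 2024, 33(1): 254-262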

History
  • Received: 2023-06-24
  • Last revised: 2023-07-27
  • Published online: 2023-11-24
  • Publication date: 2023-01-05