Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning
Abstract:

Unlike images captured in laboratory settings, facial expression images from real life come from complex scenes. Local occlusion, the most common problem, significantly changes facial appearance, so the global features a model extracts contain redundant information unrelated to emotion, which reduces their discriminative power. To address this, this study proposes a facial expression recognition method that combines contrastive learning with a channel-spatial attention mechanism; it learns salient local emotion features and attends to the relationship between local and global features. First, contrastive learning is introduced: a new strategy for selecting positive and negative samples is designed around a specific data augmentation method, and the model is pre-trained on a large amount of easily obtained unlabeled emotion data to learn occlusion-aware representations. These representations are then transferred to the downstream facial expression recognition task to improve recognition performance. In the downstream task, the expression analysis of each face image is recast as emotion detection over multiple local regions. The channel-spatial attention mechanism learns fine-grained attention maps for different local regions of the face, and the weighted features are fused to weaken the noise introduced by occluding content. Finally, a constraint loss is proposed for joint training to optimize the fused feature used for classification. Experiments show that the proposed method achieves results comparable to existing state-of-the-art methods on both public non-occluded facial expression datasets (RAF-DB and FER2013) and synthetically occluded facial expression datasets.
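The abstract does not say which contrastive loss or which augmentation the method uses. The sketch below is only an illustration of the general idea of occlusion-aware contrastive pre-training, under stated assumptions: a synthetic square occluder stands in for the "specific data augmentation method", the widely used InfoNCE loss stands in for the training objective, and a fixed random projection stands in for the encoder. The names `random_occlude`, `info_nce`, and `toy_encoder`, and the values `frac=0.3` and `temperature=0.1`, are hypothetical and do not come from the paper.

```python
import numpy as np

def random_occlude(img, rng, frac=0.3):
    """Return a copy of a 2-D image with one random rectangular patch zeroed.

    Assumption: a zeroed patch is a stand-in for a real occluder.
    """
    h, w = img.shape
    ph, pw = max(1, int(h * frac)), max(1, int(w * frac))
    top = rng.integers(0, h - ph + 1)    # upper bound is exclusive
    left = rng.integers(0, w - pw + 1)
    out = img.copy()
    out[top:top + ph, left:left + pw] = 0.0
    return out

def info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE loss over a batch of paired embeddings.

    Row i of z_a and row i of z_b form the positive pair;
    every other row of z_b serves as a negative for z_a[i].
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature             # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

def toy_encoder(imgs, weights):
    """Hypothetical encoder: flatten each image and apply a linear projection."""
    return imgs.reshape(len(imgs), -1) @ weights

rng = np.random.default_rng(0)
faces = rng.random((8, 32, 32))                    # placeholder "face" images
occluded = np.stack([random_occlude(f, rng) for f in faces])
weights = rng.standard_normal((32 * 32, 16))       # untrained projection

# Clean and occluded views of the same image are treated as a positive pair.
loss = info_nce(toy_encoder(faces, weights), toy_encoder(occluded, weights))
print(f"InfoNCE loss on clean/occluded pairs: {loss:.4f}")
```

In an actual pre-training loop the encoder would be a trainable network and this loss would be minimized by gradient descent, pulling the clean and occluded views of the same face together in embedding space; the paper's own sample selection strategy may differ from this pairing.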

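Cite this article

Xi Yan. Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning. Computer Systems & Applications, 2022, 31(11): 175-183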

History
  • Received: 2022-03-05
  • Last revised: 2022-04-12
  • Published online: 2022-07-14