###
计算机系统应用英文版:2022,31(11):175-183
←前一篇   |   后一篇→
本文二维码信息
码上扫一扫!
基于对比学习的细粒度遮挡人脸表情识别
(江苏大学 计算机科学与通信工程学院, 镇江 212013)
Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning
(School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, China)
摘要
图/表
参考文献
相似文献
本文已被:浏览 902次   下载 1629
Received:March 05, 2022    Revised:April 12, 2022
中文摘要: 和实验室环境不同, 现实生活中的人脸表情图像场景复杂, 其中最常见的局部遮挡问题会造成面部外观的显著改变, 使得模型提取到的全局特征包含与情感无关的冗余信息从而降低了判别力. 针对此问题, 本文提出了一种结合对比学习和通道-空间注意力机制的人脸表情识别方法, 学习各局部显著情感特征并关注局部特征与全局特征之间的关系. 首先引入对比学习, 通过特定的数据增强方法设计新的正负样本选取策略, 对大量易获得的无标签情感数据进行预训练, 学习具有感知遮挡能力的表征, 再将此表征迁移到下游人脸表情识别任务以提高识别性能. 在下游任务中, 将每张人脸图像的表情分析问题转化为多个局部区域的情感检测问题, 使用通道-空间注意力机制学习人脸不同局部区域的细粒度注意力图, 并对加权特征进行融合, 削弱遮挡内容带来的噪声影响, 最后提出约束损失联合训练, 优化最终用于分类的融合特征. 实验结果表明, 无论是在公开的非遮挡人脸表情数据集(RAF-DB和FER2013)还是人工合成的遮挡人脸表情数据集上, 所提方法都取得了与现有先进方法可媲美的结果.
Abstract:Different from the laboratory environment, the scenes of facial expression images in real life are complex, and local occlusion, the most common problem, will cause a significant change in the facial appearance. As a result, the global feature extracted by a model contains redundant information unrelated to emotions, which reduces the discrimination of the model. Considering this problem, a facial expression recognition method combining contrastive learning and the channel-spatial attention mechanism is proposed in this study, which learns local salient emotion features and pays attention to the relationship between local features and global features. Firstly, contrastive learning is introduced. A new positive and negative sample selection strategy is designed through a specific data augmentation method, and a large amount of easily accessible unlabeled emotion data is pre-trained to learn the representation with occlusion-aware ability. Then, the representation is transferred to the downstream facial expression recognition task to improve recognition performance. In the downstream task, the expression analysis of each face image is transformed into the emotion detection of multiple local regions. The fine-grained attention maps of different local regions of a face are learned using the channel-spatial attention mechanism, and the weighted features are fused to weaken the noise effect caused by the occlusion content. Finally, the constraint loss for joint training is proposed to optimize the final fusion feature for classification. The experimental results indicate that the proposed method achieves comparable results to existing state-of-the-art methods on both public non-occluded facial expression datasets (RAF-DB and FER2013) and synthetic occluded facial expression datasets.
文章编号:     中图分类号:    文献标志码:
基金项目:
引用文本:
奚琰.基于对比学习的细粒度遮挡人脸表情识别.计算机系统应用,2022,31(11):175-183
XI Yan.Fine-grained Occluded Facial Expression Recognition Based on Contrastive Learning.COMPUTER SYSTEMS APPLICATIONS,2022,31(11):175-183