Emotion Recognition Based on Visual and Audiovisual Perception System
(基于视听觉感知系统的多模态情感识别)

Authors: 龙英潮, 丁美荣, 林桂锦, 刘鸿业, 曾碧卿

Fund projects: National Natural Science Foundation of China (61876067); Special Project in Key Fields of Artificial Intelligence for Guangdong Regular Universities (2019KZDZX1033); Construction Project of the Guangdong Provincial Key Laboratory of Cyber-Physical Systems (2020B1212060069)


    Abstract:

    As a hot topic in human-computer interaction, emotion recognition has been applied in many fields, such as medicine, education, safe driving, and e-commerce. Emotions are mainly expressed through facial expressions, voice, and speech, and characteristics such as facial muscle movements, tone, and intonation vary with the emotion being expressed, so emotions determined from a single modal feature are often inaccurate. Considering that expressed emotions are perceived mainly through vision and hearing, this study proposes a multimodal expression recognition algorithm based on an audiovisual perception system. Emotion features are first extracted separately from the speech and image modalities, and multiple classifiers are designed to run emotion classification experiments on each single feature, yielding several expression recognition models, each based on a single feature. For the multimodal experiments on speech and images, a late fusion strategy is proposed: given the weak dependence among the different models, the weighted voting method is used for model fusion, producing a fused expression recognition model built on the multiple single-feature models. Experiments are conducted on the AFEW dataset; comparing the recognition results of the fused model with those of the single-feature models verifies that multimodal emotion recognition based on the audiovisual perception system outperforms single-modal recognition.
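    The late-fusion step is the core of the proposed method: each single-feature model votes on the emotion class, with votes weighted by model reliability. As a minimal sketch of that idea, the Python example below fuses per-model class probabilities by weighted soft voting. The seven-class list matches the AFEW label set, while the model outputs and weights are hypothetical placeholders, and the abstract does not specify whether the authors use soft or hard voting.

import numpy as np

# Seven emotion classes used with the AFEW dataset.
CLASSES = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def weighted_vote(probs_per_model, weights):
    """Fuse per-model class probabilities by weighted soft voting.

    probs_per_model: list of arrays, each of shape (n_classes,),
        one probability distribution per single-feature model.
    weights: list of non-negative floats, e.g. each model's
        validation accuracy (a hypothetical choice of weight).
    Returns the index of the winning class.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()       # normalize the vote weights
    stacked = np.stack(probs_per_model)     # shape: (n_models, n_classes)
    fused = weights @ stacked               # weighted sum of distributions
    return int(np.argmax(fused))

# Hypothetical outputs of an audio-feature model and an image-feature model.
audio_probs = np.array([0.10, 0.05, 0.05, 0.50, 0.10, 0.10, 0.10])
image_probs = np.array([0.05, 0.05, 0.10, 0.30, 0.35, 0.05, 0.10])

label = weighted_vote([audio_probs, image_probs], weights=[0.6, 0.4])
print(CLASSES[label])  # -> "happy"

    A natural choice for the weights is each model's validation accuracy, which lets the more reliable modality dominate without discarding the other.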

Cite this article:

龙英潮, 丁美荣, 林桂锦, 刘鸿业, 曾碧卿. 基于视听觉感知系统的多模态情感识别 (Emotion recognition based on visual and audiovisual perception system). 计算机系统应用 (Computer Systems & Applications), 2021, 30(12): 218-225.

History
  • Received: 2021-03-05
  • Revised: 2021-04-07
  • Published online: 2021-12-10