Received: May 21, 2018 Revised: June 15, 2018
Abstract: Manually annotating a large-scale corpus is difficult. To address this problem, this study proposes an automatic emotion annotation method for news comments based on the Probabilistic Latent Semantic Analysis (PLSA) model. First, PLSA is used to compute the "document-topic" and "word-topic" probability matrices of the corpus. Next, using an emotion ontology lexicon together with the "word-topic" matrix, each topic is assigned an emotion category, on the assumption that the topic in which the words of a given emotion category occur with the highest probability shares that category. Finally, using the "document-topic" matrix, each document is annotated on the assumption that a document shares the emotion category of the topic under which it has the highest probability, so that annotation follows the "word-topic-document" relationship. Experimental results show that the accuracy of the proposed method can exceed 90%.
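The two labeling steps described in the abstract can be illustrated with a minimal sketch. It assumes the PLSA probability matrices have already been estimated, and it reads the topic-labeling rule as "give each topic the emotion category whose lexicon words carry the most probability mass in that topic", which is one possible interpretation and not necessarily the authors' exact procedure. The function names, the toy vocabulary, the two emotion categories, and all probability values are hypothetical.

```python
def label_topics(p_word_topic, vocab, lexicon, emotions):
    """Assign an emotion category to each topic.

    p_word_topic[i][z] is P(word i | topic z); lexicon maps a word to its
    emotion category. Words missing from the lexicon are ignored.
    """
    n_topics = len(p_word_topic[0])
    # mass[e][z]: total probability of category-e lexicon words in topic z
    mass = {e: [0.0] * n_topics for e in emotions}
    for word, row in zip(vocab, p_word_topic):
        emotion = lexicon.get(word)
        if emotion in mass:
            for z, p in enumerate(row):
                mass[emotion][z] += p
    # Assumed rule: each topic takes the category with the largest mass
    return [max(emotions, key=lambda e: mass[e][z]) for z in range(n_topics)]


def label_documents(p_doc_topic, topic_labels):
    """Give each document the emotion category of its most probable topic.

    p_doc_topic[d][z] is P(topic z | document d).
    """
    labels = []
    for row in p_doc_topic:
        top_topic = max(range(len(row)), key=row.__getitem__)
        labels.append(topic_labels[top_topic])
    return labels


# Toy data (hypothetical): 4 words, 2 topics, 3 documents
emotions = ["joy", "anger"]
vocab = ["happy", "great", "furious", "news"]
lexicon = {"happy": "joy", "great": "joy", "furious": "anger"}
p_word_topic = [
    [0.40, 0.05],  # happy
    [0.30, 0.05],  # great
    [0.10, 0.60],  # furious
    [0.20, 0.30],  # news (not in the lexicon)
]
p_doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.6, 0.4],
]

topic_labels = label_topics(p_word_topic, vocab, lexicon, emotions)
doc_labels = label_documents(p_doc_topic, topic_labels)
print(topic_labels, doc_labels)
```

In this sketch the lexicon only influences the topic labels; documents inherit a label through their dominant topic, which mirrors the "word-topic-document" chain in the abstract.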
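Funding: Humanities and Social Sciences Research Project of the Ministry of Education (14YJA740011); Guangzhou Philosophy and Social Sciences "13th Five-Year Plan" Project, Year 2018 (2018GZQN27); Science and Technology Planning Project of Guangdong Province (2017A040406025); National Natural Science Foundation of China (61877013)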
Citation:
LIN Jiang-Hao, GU Ye-Li, ZHOU Yong-Mei, YANG Ai-Min. Automatic Annotation of News Comments Emotion Based on PLSA. COMPUTER SYSTEMS APPLICATIONS, 2019, 28(1): 207-211