Computer Systems & Applications, 2022, 31(6): 315-323
Predicting Network Drama Broadcast Volume Based on Sentiment Analysis and Stacking Model
(College of Economics and Management, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China)
Received: August 28, 2021    Revised: September 26, 2021
Abstract: Network dramas have developed rapidly in recent years, and research on their broadcast volume has attracted growing attention. Broadcast volume reflects a drama's reputation and popularity, which are closely tied to the returns of producers and investors. Existing studies, however, have not considered how the sentiment expressed in viewers' comments affects broadcast volume, and their forecasting models are relatively simple, leaving room to improve prediction accuracy. This paper performs sentiment analysis on user comments and builds a Stacking ensemble learning model to predict the broadcast volume of network dramas in China. First, a sentiment dictionary for the network drama domain is built with the SO-PMI (semantic orientation pointwise mutual information) algorithm; it is combined with a basic sentiment dictionary and like-count weights to compute comment sentiment scores, which are added to the prediction index system. Next, with random forest (RF), GBDT (gradient boosting decision tree), XGBoost (extreme gradient boosting), and LightGBM (light gradient boosting machine) as base learners and MLR as the meta-learner, a staged Stacking prediction model is constructed that uses current data to forecast the next week's broadcast volume. Finally, the models are compared and importance scores for the predictor variables are obtained. In the experiments, the proposed model reaches a coefficient of determination (R-squared) of 0.89, higher than the best individual base learner (0.84) and the Stacking model without the sentiment score variable (0.81). The results indicate that, with the sentiment score variable included, the proposed Stacking ensemble learning model improves the prediction accuracy of network drama broadcast volume to some extent.
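The sentiment-scoring step can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption for illustration rather than the authors' implementation: co-occurrence is counted per comment, PMI uses add-one smoothing, a word joins the domain dictionary when the absolute value of its SO-PMI reaches a hypothetical `threshold`, and each comment is weighted by `likes + 1`. The function names `so_pmi`, `build_lexicon`, and `weighted_sentiment` are likewise hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def so_pmi(comments, pos_seeds, neg_seeds, smoothing=1.0):
    """Score each non-seed word by SO-PMI over tokenized comments.

    SO-PMI(w) = sum of PMI(w, positive seed) - sum of PMI(w, negative seed).
    Two words co-occur when they appear in the same comment (assumption).
    """
    n = len(comments)
    word_df = Counter()  # number of comments containing each word
    pair_df = Counter()  # number of comments containing each word pair
    for tokens in comments:
        uniq = set(tokens)
        word_df.update(uniq)
        pair_df.update(frozenset(p) for p in combinations(sorted(uniq), 2))

    def pmi(w, s):
        # Add-one smoothing keeps the logarithm finite for unseen pairs.
        joint = pair_df[frozenset((w, s))] + smoothing
        return math.log2(
            joint * n / ((word_df[w] + smoothing) * (word_df[s] + smoothing))
        )

    seeds = set(pos_seeds) | set(neg_seeds)
    scores = {}
    for w in word_df:
        if w in seeds:
            continue
        pos = sum(pmi(w, s) for s in pos_seeds if s in word_df)
        neg = sum(pmi(w, s) for s in neg_seeds if s in word_df)
        scores[w] = pos - neg
    return scores


def build_lexicon(so_scores, base_lexicon, threshold=0.5):
    """Merge domain words (polarity = sign of SO-PMI) with a base lexicon."""
    lexicon = {
        w: (1.0 if s > 0 else -1.0)
        for w, s in so_scores.items()
        if abs(s) >= threshold
    }
    lexicon.update(base_lexicon)  # base dictionary entries take precedence
    return lexicon


def weighted_sentiment(comments_with_likes, lexicon):
    """Like-weighted mean of per-comment lexicon scores (weight = likes + 1)."""
    num = den = 0.0
    for tokens, likes in comments_with_likes:
        score = sum(lexicon.get(t, 0.0) for t in tokens)
        weight = likes + 1  # +1 so comments with zero likes still count
        num += weight * score
        den += weight
    return num / den if den else 0.0


# Toy, made-up corpus purely for demonstration.
toy_comments = [
    ["plot", "gripping", "good"],
    ["acting", "gripping", "good"],
    ["plot", "boring", "bad"],
    ["pacing", "boring", "bad"],
]
toy_scores = so_pmi(toy_comments, {"good"}, {"bad"})
toy_lexicon = build_lexicon(toy_scores, {"good": 1.0, "bad": -1.0})
print(weighted_sentiment([(["gripping", "good"], 9), (["boring"], 0)], toy_lexicon))
```

On a corpus this small, words that appear only once (such as "acting") pick up a polarity from a single co-occurrence, so a real dictionary would need a much larger comment set and a tuned threshold. The resulting score would then be one more column in the feature table fed to the Stacking model.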
Funding: National Natural Science Foundation of China (72001106)
Cite this article:
LI Ming-Zhu, MI Chuan-Min, XIAO Lin, XU Nai-Yuan. Predicting Network Drama Broadcast Volume Based on Sentiment Analysis and Stacking Model. Computer Systems & Applications, 2022, 31(6): 315-323