Computer Systems & Applications, 2019, 28(3): 126-132
Social Platform Spam Filtering Based on TF-IDF and Optimized BP Neural Network
(School of Computer and Information, Anhui Normal University, Wuhu 241000, China)
Received: 2018-09-27    Revised: 2018-10-23
Abstract: In recent years, as the pace of life has quickened and the Internet has developed rapidly, people increasingly communicate through short texts on social platforms. Some users exploit this by posting spam texts that disrupt normal social interaction and degrade the online environment. To address this problem, we propose a spam text detection method for social platforms based on TF-IDF and an improved BP neural network, which filters spam text on these platforms. First, a keyword dataset is built using jieba word segmentation and stop-word removal. Second, TF-IDF is used to compute the weight of each keyword in the keyword vector that represents a text, and the text vector is reduced in dimension accordingly to obtain a feature vector. Finally, a BP neural network classifier classifies the short texts, detecting and filtering out spam. Experimental results show that with 1000-dimensional text feature vectors the method achieves an average classification accuracy of 97.720%.
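The three steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a whitespace tokenizer stands in for jieba segmentation, the stop-word list and toy messages are hypothetical, the smoothed IDF formula and the "keep the k highest-weighted terms" reduction rule are assumptions, and the network is a plain one-hidden-layer BP network without the paper's improvements.

```python
import math
import random
from collections import Counter

STOP_WORDS = {"at", "the", "a", "to"}  # hypothetical toy stop-word list

def tokenize(text):
    """Whitespace tokenizer standing in for jieba; drops stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def tfidf_vectors(docs, k):
    """Return the k highest-weighted terms and one TF-IDF row per document."""
    tokens = [tokenize(d) for d in docs]
    df = Counter(t for doc in tokens for t in set(doc))  # document frequency
    idf = {t: math.log(len(docs) / df[t]) + 1.0 for t in df}  # assumed smoothing
    weights = []
    for doc in tokens:
        tf = Counter(doc)
        weights.append({t: c / len(doc) * idf[t] for t, c in tf.items()})
    total = Counter()
    for w in weights:
        total.update(w)  # sum each term's weight over the corpus
    ranked = sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))
    vocab = [t for t, _ in ranked[:k]]  # dimensionality reduction to k terms
    return vocab, [[w.get(t, 0.0) for t in vocab] for w in weights]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_bp(X, y, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train a one-hidden-layer network with plain backpropagation (SGD)."""
    rng = random.Random(seed)
    d = len(X[0])
    # The last entry of every weight row is the bias term.
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d + 1)] for _ in range(hidden)]
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in W1]
        out = sigmoid(sum(w * v for w, v in zip(W2, h)) + W2[-1])
        return h, out

    losses = []
    for _ in range(epochs):
        loss = 0.0
        for x, target in zip(X, y):
            h, out = forward(x)
            loss += 0.5 * (out - target) ** 2
            # Error terms are computed before any weight is changed.
            delta_out = (out - target) * out * (1 - out)
            delta_h = [delta_out * W2[j] * h[j] * (1 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                W2[j] -= lr * delta_out * h[j]
                for i in range(d):
                    W1[j][i] -= lr * delta_h[j] * x[i]
                W1[j][-1] -= lr * delta_h[j]
            W2[-1] -= lr * delta_out
        losses.append(loss)
    return (lambda x: forward(x)[1]), losses

# Toy usage: label 1 marks spam, label 0 marks normal text.
docs = ["win free prize now", "free prize click now",
        "meeting at noon tomorrow", "lunch meeting tomorrow"]
labels = [1, 1, 0, 0]
vocab, X = tfidf_vectors(docs, k=6)
predict, losses = train_bp(X, labels)
scores = [predict(x) for x in X]  # values in (0, 1); above 0.5 reads as spam
```

In the paper's setting `k` would correspond to the 1000-dimensional feature vector mentioned in the abstract; the toy value of 6 is chosen only to keep the example small.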
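Funding: National Natural Science Foundation of China (61572036); Anhui Provincial Social Science Planning Project (AHSKY2017D42); Anhui Provincial Major Humanities and Social Science Fund (SK2014ZD033)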
Citation:
WANG Yang, WANG Fei-Fan, ZHANG Shu-Yi, HUANG Shao-Fen, XU Shan-Shan, ZHAO Chen-Xi, ZHAO Chuan-Xin. Social Platform Spam Filtering Based on TF-IDF and Optimized BP Neural Network. Computer Systems & Applications, 2019, 28(3): 126-132
