Computer Systems Applications: 2018, 27(2): 240-244
Microblog User Gender Recognition with Multi-View and Tri-Training Learning
(1. FiberHome Telecommunication Technologies Co. Ltd., Nanjing 210019, China; 2. Communication and Information Program, Wuhan Research Institute of Posts and Telecommunications, Wuhan 430073, China)
Received: May 17, 2017    Revised: June 16, 2017
Abstract: With the rapid development of Internet technology, Sina Weibo, an open social networking platform, has attracted a very large base of active users. Because the user base is so large and self-reported personal information is not always truthful, labeling training samples with user gender is difficult. This study applies a multi-view tri-training method to address the problem. Three different views are constructed, and three different classifiers are first trained on a small number of labeled samples. The classifiers then repeatedly train one another using the labeled and unlabeled samples in these views. Finally, the three classifiers are combined into a single ensemble that predicts user gender. Experiments on real user data show that the classifier trained with multi-view tri-training is more accurate than a single-view classifier and requires fewer labeled samples.
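The training loop summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid base learner, the toy feature values, the `"F"`/`"M"` label names, and the function names `tri_train` and `predict` are all assumptions made for illustration, and the error-rate checks of standard tri-training are omitted. Each sample is a tuple of three feature vectors, one per view. In every round, each classifier is retrained on the labeled data plus the unlabeled samples on which the other two classifiers agree.

```python
from collections import Counter


class CentroidClassifier:
    """Nearest-centroid classifier, a stand-in for any base learner (assumption)."""

    def fit(self, X, y):
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            # Column-wise mean of all rows carrying this label.
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def tri_train(labeled, labels, unlabeled, max_rounds=10):
    """Simplified multi-view tri-training; each sample is a tuple of 3 views."""
    # Initial classifiers: one per view, trained on the labeled samples only.
    clfs = [CentroidClassifier().fit([s[v] for s in labeled], labels)
            for v in range(3)]
    for _ in range(max_rounds):
        new_clfs, changed = [], False
        for i in range(3):
            j, k = [v for v in range(3) if v != i]
            X, y = [s[i] for s in labeled], list(labels)
            for s in unlabeled:
                pj = clfs[j].predict_one(s[j])
                pk = clfs[k].predict_one(s[k])
                if pj == pk:  # the other two views agree: accept the pseudo-label
                    X.append(s[i])
                    y.append(pj)
            clf = CentroidClassifier().fit(X, y)
            changed = changed or clf.centroids != clfs[i].centroids
            new_clfs.append(clf)
        clfs = new_clfs
        if not changed:  # no classifier was updated, so stop early
            break
    return clfs


def predict(clfs, sample):
    """Majority vote of the three per-view classifiers."""
    votes = Counter(clf.predict_one(sample[v]) for v, clf in enumerate(clfs))
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two labeled users and four unlabeled users.
labeled = [((0.0, 0.1), (0.2,), (0.1, 0.0)),
           ((1.0, 0.9), (0.8,), (0.9, 1.0))]
labels = ["F", "M"]
unlabeled = [((0.1, 0.2), (0.1,), (0.2, 0.1)),
             ((0.9, 1.0), (0.9,), (1.0, 0.8)),
             ((0.2, 0.0), (0.3,), (0.0, 0.2)),
             ((0.8, 0.8), (0.7,), (0.8, 0.9))]

clfs = tri_train(labeled, labels, unlabeled)
guess_a = predict(clfs, ((0.05, 0.1), (0.15,), (0.1, 0.1)))
guess_b = predict(clfs, ((0.95, 0.9), (0.85,), (0.9, 0.95)))
```

Using three views in place of bootstrap resampling is one way to obtain the classifier diversity that tri-training relies on. How the views are actually constructed from user data is not specified in the abstract.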
Citation:
SUN Qi-Yun. Microblog User Gender Recognition with Multi-View and Tri-Training Learning. Computer Systems Applications, 2018, 27(2): 240-244