Document Recommendation System Based on Multi-Granularity Features and Hybrid Algorithms

Funding: Beijing Municipal Science and Technology Program (D171100003417002)

    Abstract:

    Document library systems play an important role in disseminating and using information, but once information overload sets in, the utilization rate of their data drops sharply. To address this problem, a document recommendation system based on multi-granularity features and hybrid algorithms is proposed. The system models user interests and document features at two granularities, phrases and terms, and combines content-based recommendation with collaborative filtering to generate a recommendation list for each user. Tests on real data show that the system performs well on precision, recall, coverage, and novelty, and that the recommended documents match users' actual preferences reasonably well. This helps raise the data utilization rate of document library systems and improves the user experience.
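The abstract describes the approach only at a high level, so the following is a minimal sketch of one way such a hybrid scorer could look, not the authors' implementation. It assumes sparse phrase and term weight vectors, cosine similarity for the content-based part, Jaccard similarity over reading histories for a user-based collaborative-filtering part, and a linear blend. The names `content_score`, `cf_score`, and `recommend`, the weights `phrase_weight` and `alpha`, and the sample data are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_score(user, doc, phrase_weight=0.6):
    """Blend phrase-level and term-level similarity (the weight is an assumption)."""
    return (phrase_weight * cosine(user["phrases"], doc["phrases"])
            + (1 - phrase_weight) * cosine(user["terms"], doc["terms"]))


def cf_score(target, doc_id, histories):
    """User-based CF: similarity-weighted share of other users who read doc_id."""
    mine = histories[target]
    num = den = 0.0
    for other, theirs in histories.items():
        if other == target:
            continue
        union = mine | theirs
        sim = len(mine & theirs) / len(union) if union else 0.0  # Jaccard
        den += sim
        if doc_id in theirs:
            num += sim
    return num / den if den else 0.0


def recommend(target, profiles, docs, histories, alpha=0.5, k=3):
    """Rank unread documents by a linear blend of content and CF scores."""
    seen = histories[target]
    scored = []
    for doc_id, doc in docs.items():
        if doc_id in seen:
            continue
        score = (alpha * content_score(profiles[target], doc)
                 + (1 - alpha) * cf_score(target, doc_id, histories))
        scored.append((doc_id, score))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hypothetical sample data for illustration only.
docs = {
    "d1": {"phrases": {"collaborative filtering": 1.0},
           "terms": {"filtering": 1.0, "user": 1.0}},
    "d2": {"phrases": {"neural network": 1.0},
           "terms": {"network": 1.0, "layer": 1.0}},
    "d3": {"phrases": {"collaborative filtering": 1.0, "cold start": 1.0},
           "terms": {"filtering": 1.0, "rating": 1.0}},
}
profiles = {
    "u1": {"phrases": {"collaborative filtering": 2.0},
           "terms": {"filtering": 1.0, "rating": 1.0}},
}
histories = {"u1": {"d1"}, "u2": {"d1", "d3"}, "u3": {"d2"}}

ranking = recommend("u1", profiles, docs, histories)
print(ranking)
```

In this toy data, `d3` ranks above `d2` for `u1` because it matches the user's profile at both granularities and was read by the most similar user. A linear blend is only one way to combine the two signals; the paper may combine them differently.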

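Cite this article

邬登峰, 白琳, 王涛, 李慧, 许舒人. Document Recommendation System Based on Multi-Granularity Features and Hybrid Algorithms. 计算机系统应用 (Computer Systems & Applications), 2018, 27(3): 9-17.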

History
  • Received: 2017-06-12
  • Last revised: 2017-06-27
  • Published online: 2018-01-25