计算机系统应用  2020, Vol. 29 Issue (8): 224-229 PDF

1. 中国科学院大学 计算机控制与工程学院, 北京 100049;
2. 中国科学院 沈阳计算技术研究所, 沈阳 110168

User Clustering Collaborative Filtering Recommendation Algorithm Combined with Trust Relationship
MENG Han1,2, GAO Cen2, WANG Song2, ZHANG Lin-Lin2, LIU Nian2
1. School of Computer and Control Engineering, University of Chinese Academy of Sciences, Beijing 100049, China;
2. Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China
Abstract: In the traditional collaborative filtering recommendation algorithm, similarity calculation is the core of the algorithm. However, the previous calculation method is too dependent on the user’s score, does not consider the user’s own attributes and trust relationship, and does not distinguish malicious users. In order to solve the appeal problem, this study introduces an improved new trust relationship measurement method into similarity calculation. This new method not only considers the influence of malicious users, but also combines the properties of users effectively. In addition, the study also improves the similarity algorithm on the hot issues. The algorithm finally uses the initial user clustering to get the adjacent users, effectively eliminating the cold start and data sparsity. In the experimental part, it can be proved that the proposed algorithm can effectively improve the recommendation accuracy by comparing with other algorithms.
Key words: collaborative filtering     trust relationship     similarity algorithm     user clustering     adjacent users

1 协同过滤推荐算法

 ${{sim}}(a,b) = \dfrac{{\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {({R_{a,i}} - \overline {{R_a}} )({R_{b,i}} - \overline {{R_b}} )} }}{{\sqrt {\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {{{({R_{a,i}} - \overline {{R_a}} )}^2}} } \sqrt {{{\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {({R_{b,i}} - \overline {{R_b}} )} }^2}} }}$ (1)

 ${P_{a,j}} = \overline {{R_a}} + \dfrac{{\displaystyle\sum\nolimits_{b \in UNN(a)} {sim(a,b)({R_{b,j}} - {R_a})} }}{{\displaystyle\sum\nolimits_{b \in UNN(a)} {\left| {sim(a,b)} \right|} }}$ (2)

2 信任关系

2.1 直接信任 2.1.1 传统的信任关系模型

 $DTrust(a,b) = Init(a,b)\dfrac{{{C_s}}}{{{C_a}}}$ (3)
 $Init(a,b) = \dfrac{{\min ({I_a} \cap {I_b},t)}}{t}$ (4)
2.1.2 改进的信任关系模型

 $STrust(b) = \dfrac{{\min (C({R_b} = 1,2),C({R_b} = 4,5)) + 1}}{{\max (C({R_b} = 1,2),C({R_b} = 4,5)) + 1}}$ (5)

 $DTrust(a,b) = Init(a,b)\dfrac{{{C_s}}}{{{C_a}}} \times STrust(b)$ (6)

 $S(a,b) = \left\{ \begin{array}{l} 0,\quad {S_a} \ne {S_b} \\ 1,\quad {S_a} = {S_b} \\ \end{array} \right.$ (7)
 $A(a,b) = \left\{ \begin{array}{l} \quad \;\;1,\;\;\quad \left| {{A_a} - {A_b}} \right| \le 5 \\ \dfrac{5}{{\left| {{A_a} - {A_b}} \right|}},\quad \left| {{A_a} - {A_b}} \right| > 5 \\ \end{array} \right.$ (8)
 $O(a,b) = \left\{ \begin{array}{l} \dfrac{{{H_{a,b}}}}{{Height}},\quad a \ne b \\ \quad 1,\quad \quad \;a = b \\ \end{array} \right.$ (9)

 $ATrust(a,b) = 0.5S(a,b) + 0.1A(a,b) + 0.4O(a,b)$ (10)

 $\begin{split} DTrust(a,b) = Init(a,b)\dfrac{{{C_s}}}{{{C_a}}} \times STrust(b) + ATrust(a,b) \\ \end{split}$ (11)
2.2 间接信任

 ${\alpha _i} = \dfrac{{d - d{p_i}(a,b) + 1}}{d}$ (12)

 $\begin{split} ITrust{p_i}({a_i},{b_i}) = & {\alpha _i} \times (DTrust({a_i},{c_i}_1) \times DTrust({c_i}_1,{c_i}_2) \cdots\\ & DTrust({c_i}_p,{b_i})) \\ \end{split}$ (13)

 $ITrust(a,b) = \dfrac{{\displaystyle\sum\nolimits_{i \in n} {ITrust{p_i}({a_i},{b_i})} }}{n}$ (14)
2.3 融合信任模型

 $\lambda {\rm{ = }}\dfrac{{DTrust(a,b)}}{{DTust(a,b) + ITrust(a,b)}}$ (15)

 $Trust(a,b) = \lambda DTrust(a,b) + (1 - \lambda )ITrust(a,b)$ (16)
3 相似度算法改进 3.1 融合热点的皮尔逊相关系数

 $MSim(a,b) = \dfrac{{\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {\left({\frac{{\overline C }}{{{C_i}}}}\right)({R_{a,i}} - \overline {{R_a}} )({R_{b,i}} - \overline {{R_b}} )} }}{{\sqrt {\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {{{({R_{a,i}} - \overline {{R_a}} )}^2}} } \sqrt {\displaystyle\sum\nolimits_{i \in {I_{a,b}}} {{{({R_{b,i}} - \overline {{R_b}} )}^2}} } }}$ (17)

3.2 最终的用户相似度公式

 $USim(a,b) = \mu Trust(a,b) + (1 - \mu )MSim(a,b)$ (18)
4 算法过程描述 4.1 用户聚类迭代算法

4.2 推荐算法整体流程

 ${P_{a,i}} = \overline {{R_a}} + \dfrac{{\displaystyle\sum\nolimits_{b \in UKN(a)} {USim(a,b) \times ({R_{b,i}} - \overline {{R_b}} )} }}{{\displaystyle\sum\nolimits_{b \in UKN(a)} {USim(a,b)} }}$ (19)

 图 1 算法整体流程图

(1)首先用K-mean聚类算法对初始的用户集合进行聚类, 得到初始用户聚类 $\scriptstyle UC = \left\{ {U{C_1},U{C_2},\cdots,U{C_k}} \right\}$ 和初始用户聚类中心为 $\scriptstyle C = \left\{ {{C_1},{C_2},\cdots,{C_k}} \right\}$

(2) //迭代得到最佳用户聚类集合

repeat

for each user UiU

for each user cluster CjC

计算Ui与聚类中心Cj的相似度USim(Ui, Cj)

end for

$\scriptstyle{{USim}}\left( {{U_{i,}}{C_t}} \right) = {\rm{max}}\left\{ {{\rm{sim}}\left( {{U_{i,}}{C_1}} \right),\cdots,{\rm{sim}}\left( {{U_{i,}}{C_k}} \right)} \right\}$

Ui所属聚类UCs=UCs-Ui

UCt=UCt+Ui

end for

for each UCiUC

更新聚类中心Ci

until聚类簇中元素不再变化或者达到设定迭代次数

5 实验 5.1 实验数据

5.2 实验评分标准

 $MAE = \dfrac{{\displaystyle\sum\nolimits_{i = 1}^n {{P_i} - {R_i}} }}{n}$ (20)

5.3 实验结果与分析

 图 2 不同聚类次数对应的MAE值

 图 3 不同迭代次数对应的MAE值

 图 4 不同权重因子对应的MAE值

 图 5 不同算法之间的比较

6 结论

 [1] 王绍卿, 李鑫鑫, 孙福振, 等. 个性化新闻推荐技术研究综述. 计算机科学与探索, 2020, 14(1): 18–29. [2] Lokhande A, Jain P. Hybrid collaborative filtering model using hierarchical clustering and PCA. Proceedings of Recent Advances in Interdisciplinary Trends in Engineering & Applications (RAITEA). 2019. [3] Bobadilla J, Ortega F, Hernando A, et al. A collaborative filtering approach to mitigate the new user cold start problem. Knowledge-Based Systems, 2012, 26: 225-238. DOI:10.1016/j.knosys.2011.07.021 [4] Zhou K, Yang SH, Zha HY. Functional matrix factorizations for cold-start recommendation. Proceedings of the 34th international ACM SIGIR conference on Research and Development in Information Retrieval. Beijing, China. 2011. 315–324. [5] Yang B, Lei Y, Liu JM, et al. Social collaborative filtering by trust. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(8): 1633-1647. DOI:10.1109/TPAMI.2016.2605085 [6] Xue FL, Liu JL. Improving collaborative filtering recommendation based on trust relationship among users. Data Analysis and Knowledge Discovery, 2017(7): 90-99. [7] Du YP, Du XY, Huang L. Improve the collaborative filtering recommender system performance by trust network construction. Chinese Journal of Electronics, 2016, 25(3): 418-423. DOI:10.1049/cje.2016.05.005 [8] 刘智捷. 基于信任关系和兴趣变化的协同过滤算法研究[硕士学位论文]. 杭州: 杭州电子科技大学, 2017. [9] 蒋宗礼, 于莉. 基于用户特征的协同过滤推荐算法. 计算机系统应用, 2019, 28(8): 190-196. DOI:10.15888/j.cnki.csa.007002 [10] 刘智捷, 徐小良, 王宇翔. 基于融合信任关系的协同过滤推荐算法. 自然科学版, 2018, 38(3): 44-48.