 Computer Systems & Applications, 2020, Vol. 29, Issue 5: 136-143

Terminal Customer Recommendation Based on Global Market Data Perception
HE Li-Li, ZHANG Xing
School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Foundation item: Major Program of Science and Technology Bureau, Zhejiang Province (2015C03001)
Abstract: An end-customer recommendation system is an effective tool for the terminal marketing of large-scale manufacturers. In the Internet+ environment, designing a search method that finds the best target customers from collected global market data remains a challenge. To address this problem, this study proposes a terminal customer recommendation method based on global market data perception (GMF). The method first applies global analysis to preprocess nationwide customer data, establishes a comprehensive, multi-angle evaluation index, and obtains the value of each target customer. It then decomposes and analyzes the data in domain subspaces to obtain the customer evaluation criteria of a given region. The two sets of results are merged, the coupled object similarity is computed, and the TopN most similar customers form the recommended target customer set. Experiments on a data set generated from the marketing activities of a large-scale manufacturer show that the proposed algorithm significantly outperforms current mainstream collaborative filtering algorithms.
Key words: global market; customer value; matrix decomposition; coupled object similarity; recommendation algorithm

1 Collaborative Filtering Recommendation Algorithms

1.1 Traditional Matrix Factorization

 ${R_{u,i}} \approx {\hat R_{u,i}} = P_u^{\rm T}{Q_i}$ (1)

 $L = \frac{1}{2}\sum\limits_{u = 1}^m {\sum\limits_{i = 1}^n {{{({R_{u,i}} - {{\hat R}_{u,i}})}^2}} } + \frac{{{\lambda _1}}}{2}\left\| P \right\|_F^2 + \frac{{{\lambda _2}}}{2}\left\| Q \right\|_F^2$ (2)
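As an illustration of Eqs. (1) and (2), the following is a minimal sketch of matrix factorization fitted by stochastic gradient descent. It is not the authors' implementation: the optimizer, the learning rate `lr`, the number of epochs, the random initialization, and the toy ratings are all assumptions, and the loss is evaluated only over observed ratings rather than over every matrix entry. User and item vectors are stored as rows of `P` and `Q`.

```python
import random


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def loss(ratings, P, Q, lam1, lam2):
    """Eq. (2), restricted to the observed (u, i, r) triples (assumption)."""
    squared_error = sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)
    reg_p = sum(dot(p, p) for p in P)
    reg_q = sum(dot(q, q) for q in Q)
    return 0.5 * squared_error + 0.5 * lam1 * reg_p + 0.5 * lam2 * reg_q


def factorize(ratings, m, n, k=2, lam1=0.02, lam2=0.02,
              lr=0.01, epochs=500, seed=0):
    """Fit R[u][i] ~ P[u] . Q[i] (Eq. (1)) by per-rating SGD steps."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(m)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(k)] for _ in range(n)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - dot(P[u], Q[i])
            for f in range(k):
                p_uf, q_if = P[u][f], Q[i][f]
                # Regularization is applied per sample, a common SGD shortcut.
                P[u][f] += lr * (err * q_if - lam1 * p_uf)
                Q[i][f] += lr * (err * p_uf - lam2 * q_if)
    return P, Q


# Hypothetical toy data: (user index, item index, rating).
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0),
           (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
P0, Q0 = factorize(ratings, m=3, n=3, epochs=0)   # untrained starting point
P, Q = factorize(ratings, m=3, n=3)
initial_loss = loss(ratings, P0, Q0, 0.02, 0.02)
final_loss = loss(ratings, P, Q, 0.02, 0.02)
print(initial_loss, final_loss)
```

Comparing the loss before and after training is a quick sanity check that the updates follow the gradient of Eq. (2).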

1.2 Coupled Object Similarity

 $\cos (u,u') = \sum\limits_{i = 1}^n {\delta _i^A({u_i},u_i')}$ (3)

 $\delta _i^A({u_i},u_i') = \delta _i^{\rm Ia}({u_i},u_i')*\delta _i^{\rm Ie}({u_i},u_i')$ (4)

 $\delta _i^{\rm Ie}({u_i},u_i') = \sum\limits_{k = 1,k \ne i}^n {{a_k}{\delta _{i\left| k \right.}}({u_i},u_i')}$ (5)

 ${\delta _{i\left| k \right.}}({u_i},u_i') = \sum\limits_{w \in \cap } {\min \left\{ {{P_{k\left| i \right.}}(\left\{ w \right\}\left| {{u_i}} \right.),{P_{k\left| i \right.}}(\left\{ w \right\}\left| {u_i'} \right.)} \right\}}$ (6)

 ${P_{k|i}}(\{ w\} |{u_i}) = \frac{{\left| {{g_k}(w) \cap {g_i}({u_i})} \right|}}{{\left| {{g_i}({u_i})} \right|}}$ (7)
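The chain of definitions in Eqs. (3) to (7) can be sketched on a small categorical table as follows. Several points are assumptions rather than statements from this section: $g_i(x)$ is read as the set of objects whose attribute $i$ takes value $x$; the weights $a_k$ default to uniform; the sum in Eq. (6) runs over all values of attribute $k$, which gives the same result as summing over the intersection because the minimum is zero elsewhere; and the intra-coupled term $\delta^{\rm Ia}$, which this section does not define, uses a frequency-based form common in the coupled-similarity literature. The customer attributes in `data` are hypothetical.

```python
def g(data, attr, value):
    """Indices of objects whose attribute `attr` equals `value`."""
    return {idx for idx, row in enumerate(data) if row[attr] == value}


def cond_prob(data, k, w, i, x):
    """Eq. (7): share of objects with attribute i == x that have attribute k == w."""
    g_i = g(data, i, x)
    return len(g(data, k, w) & g_i) / len(g_i)


def delta_given(data, i, k, x, y):
    """Eq. (6): overlap of the attribute-k distributions conditioned on x and y."""
    values_k = {row[k] for row in data}
    return sum(min(cond_prob(data, k, w, i, x), cond_prob(data, k, w, i, y))
               for w in values_k)


def inter_coupled(data, i, x, y, weights=None):
    """Eq. (5): weighted sum over all other attributes (uniform a_k assumed)."""
    others = [k for k in range(len(data[0])) if k != i]
    if weights is None:
        weights = {k: 1.0 / len(others) for k in others}
    return sum(weights[k] * delta_given(data, i, k, x, y) for k in others)


def intra_coupled(data, i, x, y):
    """ASSUMPTION: frequency-based intra-coupled similarity, not given in the text."""
    a, b = len(g(data, i, x)), len(g(data, i, y))
    return a * b / (a + b + a * b)


def coupled_similarity(data, u, v):
    """Eqs. (3)-(4): sum over attributes of intra * inter coupled similarity."""
    return sum(
        intra_coupled(data, i, data[u][i], data[v][i])
        * inter_coupled(data, i, data[u][i], data[v][i])
        for i in range(len(data[0]))
    )


# Hypothetical customers described by (region, channel, size).
data = [
    ("east", "retail", "large"),
    ("east", "retail", "small"),
    ("west", "online", "small"),
    ("west", "online", "large"),
]
print(coupled_similarity(data, 0, 1), coupled_similarity(data, 0, 2))
```

In this toy table, customers 0 and 1 share region and channel, so their coupled similarity comes out higher than that of customers 0 and 2.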

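2 Recommendation Algorithm Based on Global Market Data Perception

2.1 Global User-Item Rating Matrix

1) Obtain three behavioral indicators for each customer over the past year: time of most recent purchase R, purchase frequency F, and purchase amount M;

2) Divide the values of R, F, and M into intervals according to their contribution to the large-scale manufacturer's revenue, and score the intervals 5, 4, 3, 2, 1 from highest to lowest contribution;

3) Standardize the RFM indicator data using z-score standardization (zero-mean normalization);

4) Evaluate the weights of the RFM indicators using the Analytic Hierarchy Process (AHP);

5) Given the weights a, b, and c of the three indicators R, F, and M in the RFM model, compute the customer value ${v_u}$ :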

 ${v_u} = a * R + b*F + c*M$ (8)
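The five steps and Eq. (8) can be sketched as below. This is an illustration under stated assumptions, not the paper's procedure: the 1 to 5 scores are assigned by equal-sized rank groups (the text does not give the interval boundaries), recency is measured as days since the last purchase so that smaller is better, and the weights `(0.2, 0.3, 0.5)` are hypothetical stand-ins for the values AHP would produce.

```python
from statistics import mean, pstdev


def quintile_scores(values, higher_is_better=True):
    """Score each value 1..5 by rank group (ASSUMPTION: equal-sized groups).

    Ties are broken by input order, which is good enough for a sketch.
    """
    order = sorted(range(len(values)), key=lambda j: values[j],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, j in enumerate(order):
        scores[j] = 1 + (5 * rank) // len(values)
    return scores


def zscore(xs):
    """Zero-mean, unit-variance standardization (population std)."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0] * len(xs)
    return [(x - mu) / sigma for x in xs]


def customer_values(recency_days, frequency, monetary, weights=(0.2, 0.3, 0.5)):
    """Eq. (8): v_u = a*R + b*F + c*M on standardized RFM scores."""
    a, b, c = weights  # hypothetical weights; AHP would supply these
    R = zscore(quintile_scores(recency_days, higher_is_better=False))
    F = zscore(quintile_scores(frequency))
    M = zscore(quintile_scores(monetary))
    return [a * r + b * f + c * m for r, f, m in zip(R, F, M)]


# Hypothetical data for five customers.
recency_days = [5, 40, 200, 90, 15]
frequency = [12, 6, 1, 3, 9]
monetary = [9000.0, 3000.0, 200.0, 1500.0, 5200.0]
values = customer_values(recency_days, frequency, monetary)
print(values)
```

Because every indicator is standardized before weighting, the resulting values are centered on zero; a positive value marks a customer who is above average on the weighted indicators.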

 ${r_{u,i}} = \left\{ {\begin{array}{*{20}{l}} {\mu ,\;{\text{if the customer has purchased the product}}}\\ {0,\;{\text{if the customer has not purchased the product}}} \end{array}} \right.$ (9)

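1) Compute the total purchase volume of customers nationwide ${N_g}$ , of customers in the province ${N_s}$ , of customers in the city ${N_c}$ , and of customers in the district ${N_p}$ , as well as the purchase volume ${N_u}$ of customer $u$ ;

2) Compute the provincial characteristic coefficient ${\mu _s} = {N_s}/{N_g}$ , the city characteristic coefficient ${\mu _c} = {N_c}/{N_s}$ , the district characteristic coefficient ${\mu _p} = {N_p}/{N_c}$ , and the purchase volume coefficient of customer $u$ , ${\mu _u} = {N_u}/{N_p}$ ;

3) Standardize ${\mu _s}$ , ${\mu _c}$ , ${\mu _p}$ , and ${\mu _u}$ separately using z-score standardization to obtain ${\hat \mu _s}$ , ${\hat \mu _c}$ , ${\hat \mu _p}$ , and ${\hat \mu _u}$ ;

4) Compute the regional characteristic coefficient $\mu = {\hat \mu _s} + {\hat \mu _c} + {\hat \mu _p} + {\hat \mu _u}$ .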
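The four steps above can be sketched as follows. The record layout `(customer, province, city, district, quantity)`, the choice to standardize each coefficient across the customers in the data set, and the assumption that each customer belongs to a single district are illustrative choices, not details given in the text. The place names and quantities are hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev


def zscore(xs):
    """Zero-mean, unit-variance standardization (population std)."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0] * len(xs)
    return [(x - mu) / sigma for x in xs]


def regional_coefficients(purchases):
    """Return {customer: mu} following steps 1) to 4)."""
    # Step 1: purchase volumes at each level.
    n_g = sum(qty for _, _, _, _, qty in purchases)
    n_s, n_c, n_p, n_u = Counter(), Counter(), Counter(), Counter()
    location = {}
    for cust, prov, city, dist, qty in purchases:
        n_s[prov] += qty
        n_c[(prov, city)] += qty
        n_p[(prov, city, dist)] += qty
        n_u[cust] += qty
        location[cust] = (prov, city, dist)

    # Step 2: ratio of each level to its parent level, one entry per customer.
    customers = sorted(location)
    mu_s, mu_c, mu_p, mu_u = [], [], [], []
    for cust in customers:
        prov, city, dist = location[cust]
        mu_s.append(n_s[prov] / n_g)
        mu_c.append(n_c[(prov, city)] / n_s[prov])
        mu_p.append(n_p[(prov, city, dist)] / n_c[(prov, city)])
        mu_u.append(n_u[cust] / n_p[(prov, city, dist)])

    # Steps 3 and 4: standardize each coefficient, then add them up.
    columns = [zscore(col) for col in (mu_s, mu_c, mu_p, mu_u)]
    return {cust: sum(col[j] for col in columns)
            for j, cust in enumerate(customers)}


# Hypothetical purchase records.
purchases = [
    ("c1", "Zhejiang", "Hangzhou", "Xihu", 50),
    ("c2", "Zhejiang", "Hangzhou", "Xihu", 10),
    ("c3", "Zhejiang", "Ningbo", "Haishu", 20),
    ("c4", "Jiangsu", "Nanjing", "Gulou", 20),
]
mu = regional_coefficients(purchases)
print(mu)
```

Customers `c1` and `c2` share the same province, city, and district, so they differ only in the customer-level term, and the larger buyer receives the larger coefficient.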

 $r_{u,i}' = (1 - \alpha ) * {r_{u,i}} + \alpha * {v_u}$ (10)
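A short worked sketch of Eqs. (9) and (10): the base rating is the regional characteristic coefficient $\mu$ when the customer has purchased the product and 0 otherwise, and it is then blended with the customer value $v_u$ through $\alpha$. The function names, the dictionary-based inputs, and the sample numbers are hypothetical.

```python
def fused_rating(purchased, mu, v_u, alpha):
    """Eq. (9) followed by Eq. (10) for one customer-product pair."""
    r = mu if purchased else 0.0            # Eq. (9)
    return (1 - alpha) * r + alpha * v_u    # Eq. (10)


def build_rating_matrix(customers, products, purchased_pairs, mu, value, alpha):
    """Rows are customers, columns are products.

    `purchased_pairs` is a set of (customer, product) tuples; `mu` and
    `value` map each customer to its coefficient and customer value.
    """
    return [
        [fused_rating((u, i) in purchased_pairs, mu[u], value[u], alpha)
         for i in products]
        for u in customers
    ]


# Hypothetical inputs.
customers = ["c1", "c2"]
products = ["p1", "p2"]
purchased_pairs = {("c1", "p1"), ("c2", "p2")}
mu = {"c1": 2.0, "c2": 1.0}
value = {"c1": 1.0, "c2": -1.0}
matrix = build_rating_matrix(customers, products, purchased_pairs, mu, value, alpha=0.25)
print(matrix)
```

With $\alpha = 0$ the matrix reduces to the purchase-based ratings of Eq. (9); with $\alpha = 1$ every entry in a row equals that customer's value.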

 $R_{u,i}'' = \frac{{({R_{\max }} - {R_{\min }}) * (R_{u,i}' - {R_{\min }})}}{{({R_{\max }} - {R_{\min }}) + {R_{\min }}}}$ (11)

2.2 Framework of the Recommendation Algorithm Based on Global Market Data Perception

 $\mathop {\min }\limits_{P,Q} \frac{1}{2}\sum\limits_{u = 1}^n {\sum\limits_{i = 1}^m {{I_{u,i}}{{({R_{u,i}} - P_u^{\rm T}{Q_i})}^2}} }$ (12)

 $\begin{split} &\frac{\beta }{2}\sum\limits_{u = 1}^N {\sum\limits_{u' = 1}^N {\cos (u,u')\left\| {{P_u} - {P_{u'}}} \right\|_F^2} } +\frac{{{\lambda _1}}}{2}\left\| P \right\|_F^2\\ &+ \frac{{{\lambda _2}}}{2}\left\| Q \right\|_F^2 + \frac{{{\lambda _3}}}{2}{\left\| b \right\|^2} \end{split}$ (13)
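To make the roles of the individual terms concrete, the sketch below evaluates an objective formed by adding the fitting term of Eq. (12) to the similarity and regularization terms of Eq. (13). Treating the two equations as one summed objective is an assumption, since the connecting text is not part of this section. The bias vector `b` appears only in the regularizer, mirroring Eq. (13) as printed, and defaults to empty. User and item vectors are stored as rows of `P` and `Q`, and `sim[u][v]` stands for the coupled similarity $\cos(u,u')$.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def sq_norm(x):
    """Squared Euclidean norm of a vector."""
    return dot(x, x)


def gmf_objective(R, I, P, Q, sim, beta, lam1, lam2, lam3=0.0, b=()):
    """Fitting term of Eq. (12) plus the terms of Eq. (13) (assumed summed)."""
    users, items = range(len(P)), range(len(Q))
    # Eq. (12): squared error on observed entries (I[u][i] is 1 or 0).
    fit = 0.5 * sum(I[u][i] * (R[u][i] - dot(P[u], Q[i])) ** 2
                    for u in users for i in items)
    # Eq. (13), first term: similar customers are pulled toward similar vectors.
    smooth = 0.5 * beta * sum(
        sim[u][v] * sq_norm([a - c for a, c in zip(P[u], P[v])])
        for u in users for v in users
    )
    # Eq. (13), remaining terms: Frobenius-norm regularization.
    reg = (0.5 * lam1 * sum(sq_norm(p) for p in P)
           + 0.5 * lam2 * sum(sq_norm(q) for q in Q)
           + 0.5 * lam3 * sq_norm(b))
    return fit + smooth + reg


# Hypothetical two-customer, two-product example with a perfect fit.
R = [[1.0, 0.0], [0.0, 1.0]]
I = [[1, 1], [1, 1]]
P = [[1.0, 0.0], [0.0, 1.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
sim = [[1.0, 0.5], [0.5, 1.0]]
with_similarity = gmf_objective(R, I, P, Q, sim, beta=2.0, lam1=0.1, lam2=0.1)
without_similarity = gmf_objective(R, I, P, Q, sim, beta=0.0, lam1=0.1, lam2=0.1)
print(with_similarity, without_similarity)
```

Setting `beta=0.0` removes the similarity term and leaves a regularized factorization objective driven by ratings alone, which corresponds to the $\beta = 0$ case discussed in the experiments.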

3 Experimental Analysis

3.1 Experimental Data

3.2 Evaluation Metrics

 ${\rm MAE} = \frac{1}{{\left| T \right|}}\sum\limits_{(u,i) \in T} {\left| {{r_{u,i}} - {{\hat r}_{u,i}}} \right|}$ (14)

 ${\rm RMSE} = \sqrt {\frac{1}{{\left| T \right|}}\sum\limits_{(u,i) \in T} {{{\left| {{r_{u,i}} - {{\hat r}_{u,i}}} \right|}^2}} }$ (15)
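Eqs. (14) and (15) translate directly into code. In this sketch the test set $T$ is represented as a dictionary keyed by `(u, i)` pairs, which is an illustrative choice; the sample ratings and predictions are hypothetical.

```python
import math


def mae(actual, predicted):
    """Eq. (14): mean absolute error over the test pairs in `actual`."""
    return sum(abs(actual[k] - predicted[k]) for k in actual) / len(actual)


def rmse(actual, predicted):
    """Eq. (15): root mean squared error over the test pairs in `actual`."""
    return math.sqrt(
        sum((actual[k] - predicted[k]) ** 2 for k in actual) / len(actual)
    )


# Hypothetical test set T: {(u, i): rating} and matching predictions.
actual = {(0, 0): 3.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 4.0}
predicted = {(0, 0): 2.0, (0, 1): 5.0, (1, 0): 3.0, (1, 1): 4.0}
print(mae(actual, predicted), rmse(actual, predicted))
```

RMSE squares each error before averaging, so it penalizes the single error of 2 more heavily than MAE does; for any set of predictions RMSE is at least as large as MAE.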

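3.3 Experimental Procedure and Result Analysis

3.3.1 RFM Customer Clustering Experiment

3.3.2 Experiment on the Influence of the Regional Characteristic Coefficient

 Fig. 1 Customer ratio before and after adding the regional characteristic coefficient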

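3.3.3 Comparison with Classical Methods

(1) PMF[8]: performs probabilistic matrix factorization using only users' ratings of items to predict missing entries.

(2) MMMF[9]: introduces an ordinal regression ranking loss into the matrix factorization model to predict missing entries.

(3) NMF[10]: constrains the latent feature vectors to contain only non-negative entries when they are updated during training, and predicts missing entries by matrix factorization.

(4) RSVD[11]: adds a regularization term to the SVD model and predicts missing entries by singular value decomposition.

(5) BPMF[12]: uses Markov chain Monte Carlo methods for approximate inference to predict missing entries.

(6) SVD++[13]: takes both bias information and users' implicit feedback into account in matrix factorization to predict missing entries.

(1) Influence of the parameter K

 Fig. 2 Influence of K on MAE

 Fig. 3 Influence of K on RMSE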

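When K is less than 15, the quality of the recommendation algorithm improves steadily as K increases, but once K exceeds 15, increasing K further no longer improves it. This indicates that adding latent features improves recommendation quality only within a certain range; beyond a threshold, further increases may bring no gain. A possible explanation is that, on the data set used in this study, the user and item latent feature vectors already characterize the latent features well once K exceeds 15, so increasing K further only lets noise degrade recommendation quality.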

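(2) Influence of the parameter $\beta$

$\beta$ controls how strongly customer attribute information influences the learning of latent feature vectors in the GMF algorithm. When $\beta = 1$ , a customer's latent feature vector is made directly similar to those of its neighbors and the influence of the rating data is ignored; when $\beta = 0$ , only the rating information is used for matrix factorization to predict missing ratings. On the large-scale manufacturer data set, the latent feature dimension K is set to 10 and $\beta$ is increased from 0 to 1 in steps of 0.1. As shown in Figs. 4 and 5, MAE and RMSE first decrease and then increase as $\beta$ grows. This shows that once $\beta$ exceeds a certain threshold, the performance of the recommendation algorithm declines. In other words, both ignoring customer attribute information and relying on it entirely degrade the performance of the recommender system and make its results unreliable.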

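 Fig. 4 Influence of $\beta$ on MAE

 Fig. 5 Influence of $\beta$ on RMSE

(3) Influence of cold start on recommender system performance

 Fig. 6 Influence of the number of ratings on MAE

 Fig. 7 Influence of the number of ratings on RMSE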

4 Conclusion and Outlook

 [1] Wang Y, Deng JZ, Gao J, et al. A hybrid user similarity model for collaborative filtering. Information Sciences, 2017, 418-419: 102-118. DOI:10.1016/j.ins.2017.08.008
 [2] Lü LY, Medo M, Yeung CH, et al. Recommender systems. Physics Reports, 2012, 519(1): 1-49. DOI:10.1016/j.physrep.2012.02.006
 [3] Lika B, Kolomvatsos K, Hadjiefthymiades S. Facing the cold start problem in recommender systems. Expert Systems with Applications, 2014, 41(4): 2065-2073. DOI:10.1016/j.eswa.2013.09.005
 [4] Ji K, Shen H. Addressing cold-start: Scalable recommendation with tags and keywords. Knowledge-Based Systems, 2015, 83: 42-50. DOI:10.1016/j.knosys.2015.03.008
 [5] Forsati R, Mahdavi M, Shamsfard M, et al. Matrix factorization with explicit trust and distrust side information for improved social recommendation. ACM Transactions on Information Systems (TOIS), 2014, 32(4): 1-38.
 [6] Koren Y, Bell R, Volinsky C. Matrix factorization techniques for recommender systems. Computer, 2009, 42(8): 30-37. DOI:10.1109/MC.2009.263
 [7] Hernando A, Bobadilla J, Ortega F. A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model. Knowledge-Based Systems, 2016, 97: 188-202. DOI:10.1016/j.knosys.2015.12.018
 [8] Salakhutdinov R, Mnih A. Probabilistic matrix factorization. Proceedings of the 20th International Conference on Neural Information Processing Systems. Vancouver, BC, Canada. 2008. 1257-1264.
 [9] Weimer M, Karatzoglou A, Smola A. Improving maximum margin matrix factorization. Machine Learning, 2008, 72(3): 263-276. DOI:10.1007/s10994-008-5073-7
 [10] Huang KJ, Sidiropoulos ND, Swami A. Non-negative matrix factorization revisited: Uniqueness and algorithm for symmetric decomposition. IEEE Transactions on Signal Processing, 2014, 62(1): 211-224. DOI:10.1109/TSP.2013.2285514
 [11] Nguyen J, Zhu M. Content-boosted matrix factorization techniques for recommender systems. Statistical Analysis and Data Mining, 2013, 6(4): 286-301. DOI:10.1002/sam.11184
 [12] Salakhutdinov R, Mnih A. Bayesian probabilistic matrix factorization using Markov chain Monte Carlo. Proceedings of the 25th International Conference on Machine Learning. New York, NY, USA. 2008. 880-887.
 [13] Koren Y. Factorization meets the neighborhood: A multifaceted collaborative filtering model. Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA. 2008. 426-434.
 [14] Yang B, Lei Y, Liu JM, et al. Social collaborative filtering by trust. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(8): 1633-1647. DOI:10.1109/TPAMI.2016.2605085
 [15] Gurini DF, Gasparetti F, Micarelli A, et al. Temporal people-to-people recommendation on social networks with sentiment-based matrix factorization. Future Generation Computer Systems, 2018, 78: 430-439. DOI:10.1016/j.future.2017.03.020
 [16] Gan GQ, Ma CQ, Wu JH. Data Clustering: Theory, Algorithms, and Applications. Philadelphia: SIAM, 2007.
 [17] Yu YH, Wang C, Wang H, et al. Attributes coupling based matrix factorization for item recommendation. Applied Intelligence, 2017, 46(3): 521-533. DOI:10.1007/s10489-016-0841-8
 [18] Lian DF, Zheng K, Ge Y, et al. GeoMF++: Scalable location recommendation via joint geographical modeling and matrix factorization. ACM Transactions on Information Systems (TOIS), 2018, 36(3): 33.
 [19] Chiang WY. Identifying high-value airlines customers for strategies of online marketing systems: An empirical case in Taiwan. Kybernetes, 2018, 47(3): 525-538. DOI:10.1108/K-12-2016-0348
 [20] Pirasteh P, Hwang D, Jung JJ. Exploiting matrix factorization to asymmetric user similarities in recommendation systems. Knowledge-Based Systems, 2015, 83: 51-57. DOI:10.1016/j.knosys.2015.03.006