Group Decision-Making Method for Credit Risk Assessment in P2P Lending
JIANG Xue-Ying, QIN Jin
School of Management, University of Science and Technology of China, Hefei 230026, China
Foundation item: General Program of National Natural Science Foundation of China (71571175)
Abstract: In this study, we propose a combination approach based on a group decision-making method, using random forest, neural network, and GBDT models as individual learners, to assess the credit risk of borrowers in P2P lending. To validate the proposed method, we examine two real-world datasets from PPDai.com and renrendai.com. The results show that the proposed method outperforms the individual learners.
1 Introduction

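P2P lending refers to small-sum lending transactions between individual users, carried out through a specialized online lending platform. In recent years the P2P lending industry has grown rapidly in China. According to data from Wangdaizhijia (网贷之家), domestic P2P lending volume reached 2.8 trillion yuan in 2017, an increase of more than 40% over 2016, with 4.4 million active investors. Keeping the industry healthy requires effective risk control.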

2 Algorithm and Model Construction for Group Decision-Based Credit Risk Assessment in P2P Lending

2.1 Ensemble Algorithm for Group Decision-Based Credit Risk Assessment in P2P Lending

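$P_i^*$ is the revised prediction of ${M_i}$ after it has been influenced by the predictions of the other individual learners. We take $P_i^*$ to be a linear combination of the predictions of all individual learners, i.e.: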

 $P_i^* = \sum\limits_{j = 1}^N {{w_{ij}}{P_j}}$ (1)

 ${P^*} = WP$ (2)

 $\begin{array}{l} \pi W = \pi \\ \sum\nolimits_{i = 1}^N {{\pi _i} = 1} \end{array}$ (3)

 $R = \sum\limits_{i = 1}^N {{\pi _i}{P_i}}$ (4)
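Eqs. (3) and (4) can be illustrated with a short numerical sketch. It assumes that the weight matrix $W$ is row-stochastic with strictly positive entries, so that repeated multiplication converges to the vector $\pi$ satisfying $\pi W = \pi$. The function names, the example matrix, and the example predictions below are hypothetical and are not taken from the paper.

```python
import numpy as np

def consensus_weights(W, tol=1e-12, max_iter=10_000):
    """Solve pi W = pi with sum(pi) = 1 by power iteration (Eq. 3).

    Assumes W is row-stochastic with positive entries.
    """
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    pi = np.full(n, 1.0 / n)           # start from uniform weights
    for _ in range(max_iter):
        nxt = pi @ W                   # one update pi <- pi W
        nxt /= nxt.sum()               # guard against rounding drift
        if np.abs(nxt - pi).max() < tol:
            return nxt
        pi = nxt
    return pi

def ensemble_prediction(W, P):
    """R = sum_i pi_i P_i (Eq. 4)."""
    return float(consensus_weights(W) @ np.asarray(P, dtype=float))

# Hypothetical 3-learner weight matrix (each row sums to 1) and predictions.
W = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])
P = np.array([0.10, 0.30, 0.20])

pi = consensus_weights(W)
R = ensemble_prediction(W, P)
```

Because $\pi$ is a probability vector, $R$ is a convex combination of the individual predictions and therefore always lies between the smallest and largest $P_i$.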

 ${U_{i|i}} = - {P_i}{\log _2}({P_i}) - (1 - {P_i}){\log _2}(1 - {P_i})$ (5)

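As ${P_i}$ approaches 0 or 1, the individual learner ${M_i}$ judges clearly whether the loan will default, and the uncertainty ${U_{i|i}}$ approaches 0. As ${P_i}$ approaches 0.5, the judgment of ${M_i}$ is close to random, and the uncertainty ${U_{i|i}}$ approaches 1. The local uncertainty ${U_{i|i}}$ therefore reflects how uncertain an individual learner is about its own decision.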
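The behaviour of Eq. (5) at the extremes can be checked numerically. The sketch below is a minimal implementation of the binary entropy. The clipping constant `eps` is an assumption added here to avoid evaluating $\log_2(0)$ and is not part of the paper's formula.

```python
import numpy as np

def binary_entropy(p, eps=1e-12):
    """U = -p log2(p) - (1 - p) log2(1 - p), as in Eq. (5).

    eps is an assumed numerical guard so that p = 0 or p = 1 is well defined.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return -p * np.log2(p) - (1.0 - p) * np.log2(1.0 - p)

# Confident predictions give low uncertainty, p = 0.5 gives the maximum of 1.
u = binary_entropy([0.01, 0.5, 0.99])
```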

 ${U_{i|j}} = - {P_{i|j}}{\log _2}({P_{i|j}}) - (1 - {P_{i|j}}){\log _2}(1 - {P_{i|j}})$ (6)

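${P_{i|j}}$ denotes the default probability predicted by the individual learner ${M_i}$ under the influence of the individual learner ${M_j}$. We take ${P_{i|j}}$ to be a linear combination of ${P_i}$ and ${P_j}$, i.e.: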

 ${P_{i|j}} = {P_j}{I_{i|j}} + {P_i}(1 - {I_{i|j}})$ (7)

${I_{i|j}}$ is a sigmoid function of the difference between the accuracies of the two learners, i.e.:

 ${I_{i|j}} = \frac{1}{{1 + {e^{ - (Ac{c_j} - Ac{c_i})}}}}$ (8)
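Eqs. (6) to (8) can be evaluated for all pairs of learners at once. The following is a minimal sketch under two assumptions that this section does not state: accuracies are expressed as fractions in [0, 1], and entropies are clipped by a small `eps` for numerical safety. All names and example values are hypothetical.

```python
import numpy as np

def influence(acc):
    """I[i, j] = 1 / (1 + exp(-(Acc_j - Acc_i))), as in Eq. (8)."""
    acc = np.asarray(acc, dtype=float)
    diff = acc[None, :] - acc[:, None]      # diff[i, j] = Acc_j - Acc_i
    return 1.0 / (1.0 + np.exp(-diff))

def conditional_predictions(P, acc):
    """P_cond[i, j] = P_j I[i, j] + P_i (1 - I[i, j]), as in Eq. (7)."""
    P = np.asarray(P, dtype=float)
    I = influence(acc)
    return P[None, :] * I + P[:, None] * (1.0 - I)

def conditional_uncertainty(P, acc, eps=1e-12):
    """U[i, j] is the binary entropy of P_cond[i, j], as in Eq. (6)."""
    q = np.clip(conditional_predictions(P, acc), eps, 1.0 - eps)
    return -q * np.log2(q) - (1.0 - q) * np.log2(1.0 - q)

# Hypothetical predictions for one loan and validation accuracies.
P = np.array([0.12, 0.35, 0.20])
acc = np.array([0.86, 0.81, 0.84])

I = influence(acc)
P_cond = conditional_predictions(P, acc)
U = conditional_uncertainty(P, acc)
```

On the diagonal, $I_{i|i} = 0.5$ and so $P_{i|i} = P_i$, which means Eq. (6) reduces to the local uncertainty of Eq. (5) when $j = i$. A more accurate learner $M_j$ pulls $P_{i|j}$ towards its own prediction because $I_{i|j} > 0.5$.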

 $\left\{\begin{array}{l} \min {z_i} = \sum\nolimits_{j = 1}^N {w_{ij}^2U_{i|j}^2} \\ \sum\nolimits_{j = 1}^N {{w_{ij}} = 1} \\ \end{array}\right.$ (9)

 ${L_i} = \sum\limits_{j = 1}^N {w_{ij}^2U_{i|j}^2} - \rho \left( {\sum\limits_{j = 1}^N {{w_{ij}}} - 1} \right)$ (10)

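Taking the partial derivative of ${L_i}$ with respect to ${w_{ij}}$, setting it equal to 0, and combining the result with $\sum\nolimits_{j = 1}^N {{w_{ij}} = 1}$ gives the expression for ${w_{ij}}$: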

 ${w_{ij}} = \frac{1}{{U_{i|j}^2\sum\nolimits_{k = 1}^N {U_{i|k}^{ - 2}} }}$ (11)
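The closed form in Eq. (11) amounts to normalizing the inverse squared uncertainties within each row. The sketch below computes the weight matrix and compares the objective of Eq. (9) against uniform weights as a sanity check. The lower bound `eps` on the uncertainties is an assumption added to avoid division by zero, and the example matrix is hypothetical.

```python
import numpy as np

def weight_matrix(U, eps=1e-12):
    """w[i, j] = U[i, j]^-2 / sum_k U[i, k]^-2, as in Eq. (11).

    eps is an assumed floor on U so that a zero uncertainty does not divide by zero.
    """
    inv_sq = 1.0 / np.maximum(np.asarray(U, dtype=float), eps) ** 2
    return inv_sq / inv_sq.sum(axis=1, keepdims=True)

def objective(W, U):
    """z_i = sum_j w[i, j]^2 U[i, j]^2 for every row i, as in Eq. (9)."""
    return ((np.asarray(W) ** 2) * (np.asarray(U) ** 2)).sum(axis=1)

# Hypothetical uncertainty matrix for three learners.
U = np.array([[0.47, 0.60, 0.55],
              [0.80, 0.88, 0.85],
              [0.65, 0.75, 0.72]])

W = weight_matrix(U)
z_opt = objective(W, U)
z_uniform = objective(np.full_like(U, 1.0 / 3.0), U)
```

Within each row, the learner with the smallest uncertainty receives the largest weight, and the resulting objective value is no larger than the one obtained with equal weights.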

2.2 Model Construction Process for Group Decision-Based Credit Risk Assessment in P2P Lending

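(1) Apply N machine learning algorithms to the training data to train the individual learners ${M_1},{M_2},\cdots,{M_N}$, and obtain their prediction accuracies $Ac{c_1},\cdots,Ac{c_N}$.

(2) Apply the individual learners ${M_1},{M_2},\cdots,{M_N}$ to predict the default probability of each loan in the test set, giving the predictions ${P_1},\cdots,{P_N}$.

(3) Use Eq. (5) to obtain the local uncertainty of each individual learner, and Eqs. (6)–(8) to obtain the global uncertainty of each individual learner.

(4) Use Eq. (11) to obtain the weights ${w_{ij}}(i=1,\cdots,N,j=1,\cdots,N)$.

(5) Substitute ${w_{ij}}$ into Eq. (3) and solve for the vector $\pi$.

(6) Use Eq. (4) to obtain the final ensemble result R of all individual learners.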
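Steps (3) to (6) can be sketched end to end for a single loan. This is an illustrative reading of Eqs. (4) to (11), not the authors' implementation. It assumes that steps (1) and (2) have already produced the accuracies and predictions, that the computation is done one loan at a time (since the uncertainties depend on that loan's predictions), and that $\pi$ is found by repeated multiplication with $W$. The function name and the example values are hypothetical.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Binary entropy with an assumed clipping guard (Eqs. 5 and 6)."""
    p = np.clip(p, eps, 1.0 - eps)
    return -p * np.log2(p) - (1.0 - p) * np.log2(1.0 - p)

def group_decision_score(P, acc, n_iter=1000):
    """Return (R, pi) for one loan from predictions P and accuracies acc."""
    P = np.asarray(P, dtype=float)
    acc = np.asarray(acc, dtype=float)

    # Step 3: influence I[i, j] (Eq. 8), P_{i|j} (Eq. 7), U_{i|j} (Eq. 6).
    I = 1.0 / (1.0 + np.exp(-(acc[None, :] - acc[:, None])))
    U = entropy(P[None, :] * I + P[:, None] * (1.0 - I))

    # Step 4: row-normalized inverse squared uncertainties (Eq. 11).
    inv_sq = 1.0 / np.maximum(U, 1e-12) ** 2
    W = inv_sq / inv_sq.sum(axis=1, keepdims=True)

    # Step 5: approximate the solution of pi W = pi, sum(pi) = 1 (Eq. 3).
    pi = np.full(len(P), 1.0 / len(P))
    for _ in range(n_iter):
        pi = pi @ W
        pi /= pi.sum()

    # Step 6: ensemble result R = sum_i pi_i P_i (Eq. 4).
    return float(pi @ P), pi

# Hypothetical outputs of steps (1) and (2) for three learners.
R, pi = group_decision_score(P=[0.12, 0.35, 0.20], acc=[0.86, 0.81, 0.84])
```

When all learners agree on a loan, every $U_{i|j}$ is identical, the weights become uniform, and $R$ equals the shared prediction.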

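2.3 Description of the Individual Learners

2.3.1 Random Forest

2.3.2 Neural Network

2.3.3 Gradient Boosting Decision Tree (GBDT)

3 Experimental Analysis

3.1 Experimental Data and Variable Description

3.2 Experimental Results

4 Conclusion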

