Prediction of Credit Default Based on Interpretable Integration Learning

doi:10.15888/j.cnki.csa.008220

Volume 30, Issue 12, 2021, pp. 194-201


Author:
CAI Qing-Song, WU Jin-Di, BAI Chen-Yu

Affiliation:
School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China


Abstract:

Artificial intelligence is accelerating the development of the risk-control industry. Risk management is the core of intelligent risk control, and credit default prediction models are an essential tool for it. Traditional risk control relies on manual review and generalized linear models. However, data from transactions completed on the Internet are high-dimensional and come from multiple sources, which existing models cannot handle; this poses a major challenge to traditional risk control. This study therefore proposes an interpretable credit default prediction model based on model fusion. First, prediction accuracy is improved by fusing the base models (LightGBM, DeepFM, and CatBoost) through a secondary model (CatBoost). Then, the predictions of the fused model are interpreted with LIME, a model-agnostic local interpretability method. Experiments on a real dataset show that the model achieves satisfactory accuracy and interpretability on the credit default prediction task.

Key words: financial risk control; prediction of default; credit risk; interpretability; integration model
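
The two-stage idea in the abstract (fuse several base models through a secondary model, then explain individual predictions with a local surrogate) can be illustrated with a minimal sketch. It is not the authors' implementation. The three base scorers and the secondary scorer are hand-written, hypothetical stand-ins for LightGBM, DeepFM, and CatBoost, the feature names are invented for illustration, and `lime_explain` is a simplified LIME-style procedure (Gaussian perturbations, an exponential proximity kernel, weighted least squares) rather than a call into the `lime` library.

```python
import math
import random


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical base models: each maps a feature vector to a default probability.
# Assumed features: x[0] = debt ratio, x[1] = late payments, x[2] = income.
def base_a(x):
    return _sigmoid(4.0 * x[0] - 2.0)


def base_b(x):
    return _sigmoid(1.5 * x[1] - 1.0)


def base_c(x):
    return _sigmoid(3.0 * x[0] - 1.0 * x[2])


def fused_predict(x):
    """Secondary model applied to the base-model outputs (fixed toy weights)."""
    a, b, c = base_a(x), base_b(x), base_c(x)
    return _sigmoid(2.0 * a + 1.0 * b + 1.5 * c - 2.0)


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][k] * w[k] for k in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w


def lime_explain(predict, x, n_samples=500, scale=0.3, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around `x`.

    Returns (intercept, coefficients); each coefficient approximates the
    local effect of one feature on the black-box prediction.
    """
    rng = random.Random(seed)
    d = len(x)
    xtwx = [[0.0] * (d + 1) for _ in range(d + 1)]
    xtwy = [0.0] * (d + 1)
    for _ in range(n_samples):
        z = [xi + rng.gauss(0.0, scale) for xi in x]      # perturbed sample
        dist2 = sum((zi - xi) ** 2 for zi, xi in zip(z, x))
        w = math.exp(-dist2 / kernel_width ** 2)          # proximity weight
        row = [1.0] + z                                   # intercept column first
        y = predict(z)
        for i in range(d + 1):
            xtwy[i] += w * row[i] * y
            for j in range(d + 1):
                xtwx[i][j] += w * row[i] * row[j]
    beta = solve(xtwx, xtwy)                              # weighted normal equations
    return beta[0], beta[1:]


applicant = [0.6, 1.0, 0.4]  # illustrative values only
intercept, coefs = lime_explain(fused_predict, applicant)
for name, c in zip(["debt_ratio", "late_payments", "income"], coefs):
    print(f"{name}: {c:+.3f}")
```

Because the surrogate only queries `fused_predict` as a black box, the same explanation routine would apply unchanged if the toy scorers were swapped for trained models, which is the sense in which the method is model-agnostic.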


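Citation:

CAI Qing-Song, WU Jin-Di, BAI Chen-Yu. Prediction of Credit Default Based on Interpretable Integration Learning. Computer Systems & Applications (计算机系统应用), 2021, 30(12): 194-201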


History

Received: March 2, 2021
Revised: March 29, 2021
Online: December 10, 2021
